dbcls / RefEx

RefEx: a reference gene expression dataset of mammalian tissues and cell lines measured by different methods

Columns with all values `-1` #2

Closed (yuifu closed this issue 4 years ago)

yuifu commented 4 years ago

Thank you for creating RefEx.

I found that some of the TSV files on the Download page have odd columns whose values are all `-1`.

For example, here are the first 10 lines of `RefEx_expression_CAGE40_human_PRJDB3010.tsv`:

NCBI_GeneID 40_1_cerebrum   40_2_cerebellum 40_3_brain stem 40_4_corpus callosum/glia   40_5_pineal gland   40_6_peripheral nerve   40_7_spine  40_8_retina 40_9_eye    40_10_artery/aorta  40_11_vein  40_12_lymphnode 40_13_peripheral blood  40_14_spleen    40_15_thymus    40_16_bone marrow   40_17_adipose   40_18_bone  40_19_skin  40_20_uterus    40_21_placenta  40_22_prostate  40_23_ovary 40_24_testis    40_25_heart 40_26_muscle    40_27_esophagus 40_28_stomach   40_29_intestine 40_30_colon 40_31_liver/hepato  40_32_lung  40_33_bladder   40_34_kidney    40_35_pituitary 40_36_thyroid/parathyroid   40_37_adrenal gland 40_38_pancreas  40_39_breast    40_40_salivary
2   4.443389118 -1  5.097289944 5.451304945 4.696355151 -1  4.936152377 -1  -1  6.521268046 5.180200223 4.663649395 -1  5.943971654 3.467835748 -1  6.577300786 -1  -1  6.185059643 5.664025805 6.627555272 5.909992243 4.885222738 6.018496328 4.724779942 5.908363951 -1  5.328501761 5.282829038 7.492403515 7.061089856 6.413660625 5.861840296 5.260498473 5.43781331  -1  4.765975143 5.611666602 4.897393834
9   0.423097607 -1  0.771479757 0.474126492 1.682075096 -1  0.943956772 -1  -1  1.478694812 1.285523109 1.590003297 -1  1.934558365 1.850166657 -1  1.250084828 -1  -1  1.415139075 1.495012532 1.578186959 1.157226899 1.410244227 0.959507995 0.549580613 0.974111721 -1  2.66382864  1.882009443 2.296535456 2.136249613 0.525026614 1.966671024 1.560015979 0.875567683 -1  1.207329774 1.677096085 0.828849623
10  0.063356386 -1  0.120247641 0   0.471673668 -1  0.089620269 -1  -1  0   0   0   -1  0.612942727 0.062588724 -1  0.256977751 -1  -1  0.36180116  0.163312644 0.145818508 0.140127664 0.306096571 0.124261128 0.048863994 0.092569169 -1  3.766170637 1.902038381 4.57914823  0   0   0.488853388 0   0   -1  0   0   0
12  4.94688426  -1  5.489803078 5.962612078 6.033539123 -1  7.382914852 -1  -1  5.29156464  2.960929379 5.556894361 -1  5.743281253 0.771311376 -1  3.956573007 -1  -1  3.575973207 4.203286572 5.811983968 5.769392466 4.371094956 5.654247007 5.250409056 4.911291699 -1  2.910402332 3.210949729 10.57717722 4.607325348 5.306976421 7.705586908 6.261853618 4.886064638 -1  8.535843549 6.929320472 6.301221679
13  0.072640233 -1  0   0.592671158 0   -1  0   -1  -1  0.181694684 0.472389145 0.679992939 -1  0   0.062588724 -1  2.325326075 -1  -1  0   1.514708421 0.331075891 0.072516297 0.400734094 0.041049883 0.048863994 0.177290617 -1  5.247833542 2.086845969 5.309337505 1.391661901 0.734806007 0   0   0.119820351 -1  3.90104029  1.8041815   0
14  4.363795318 -1  4.149819799 4.058485168 4.238447608 -1  4.270811688 -1  -1  4.250479728 4.143012948 3.941457214 -1  3.795816192 3.757789972 -1  3.853335631 -1  -1  4.046717186 4.222978097 3.998792991 4.051488497 4.369234453 4.218648722 4.376336182 4.080810216 -1  4.187019345 4.436293173 4.107479772 3.999722312 4.080914943 4.100006795 4.178577273 4.239267771 -1  4.630806952 4.26724632  4.5099527
15  0.008555979 -1  0.09985642  0   6.39158625  -1  0.165592332 -1  -1  0   0   0   -1  0   0   -1  0   -1  -1  0   0   0   0   0   0   0.249607896 0   -1  0   0   0   0   0   0   0.339729693 0   -1  0   0   0
16  4.878803989 -1  4.909772775 4.678686341 4.782802323 -1  4.780129691 -1  -1  4.01807436  3.617095625 3.454471219 -1  3.595445782 3.416169501 -1  3.792482901 -1  -1  3.863987804 3.632612856 3.791620756 4.058009159 3.684508286 3.916117873 3.780020756 3.586570499 -1  3.497333754 3.80829412  4.203923744 3.806329261 3.93206495  3.702793825 4.362811317 4.53637758  -1  4.543296562 4.345426904 4.345754722
18  5.025563897 -1  4.156379523 4.269719619 2.871459399 -1  4.385515128 -1  -1  2.215846499 2.523534738 2.337274624 -1  1.723696876 2.233446072 -1  0.775678528 -1  -1  1.782264132 2.447168615 2.571690913 1.733960914 3.274290211 1.905571236 0.655305228 2.089416402 -1  3.370442376 3.091713221 5.866105139 1.831249094 1.860256911 3.619782052 4.946636652 3.356427112 -1  4.885029394 1.8041815   2.464336866

I hope this will be fixed in the future.

hiromasaono commented 4 years ago

Thank you for your comment, and sorry for the ambiguous description in the data. A value of "-1" means not applicable (N/A): the dataset does not contain any sample for that tissue.
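For readers who want to handle this on the client side, here is a minimal sketch of how the `-1` sentinel could be treated as missing data when loading one of these TSV files. It is not part of RefEx: the helper name `read_refex_tsv` is hypothetical, the sample is only the first three value columns of the first three genes quoted above, and it assumes that `-1` never occurs as a real expression value.

```python
import csv
import io

# Per the reply above: -1 means the tissue is not present in this dataset.
NA_SENTINEL = -1.0


def read_refex_tsv(handle):
    """Parse a RefEx-style expression TSV (hypothetical helper, not a RefEx API).

    Returns (tissues, table, dropped): `table` maps each gene ID to a list of
    values aligned with `tissues`; `dropped` lists the columns that were -1
    in every row and were therefore removed.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    rows = [row for row in reader if row]  # skip blank lines

    def is_na(cell):
        return float(cell) == NA_SENTINEL

    value_cols = range(1, len(header))  # column 0 holds the gene ID
    # A column in which every row holds the sentinel carries no information.
    all_na = {i for i in value_cols if rows and all(is_na(r[i]) for r in rows)}

    tissues = [header[i] for i in value_cols if i not in all_na]
    dropped = [header[i] for i in value_cols if i in all_na]
    table = {
        row[0]: [
            None if is_na(row[i]) else float(row[i])  # stray -1 cells become None
            for i in value_cols
            if i not in all_na
        ]
        for row in rows
    }
    return tissues, table, dropped


# Excerpt of the file quoted in this issue (first three value columns only).
SAMPLE = (
    "NCBI_GeneID\t40_1_cerebrum\t40_2_cerebellum\t40_3_brain stem\n"
    "2\t4.443389118\t-1\t5.097289944\n"
    "9\t0.423097607\t-1\t0.771479757\n"
    "10\t0.063356386\t-1\t0.120247641\n"
)

tissues, table, dropped = read_refex_tsv(io.StringIO(SAMPLE))
print("dropped:", dropped)
print("kept:", tissues)
print("gene 2:", table["2"])
```

To read a downloaded file instead of the inline sample, pass an open file handle, for example `open("RefEx_expression_CAGE40_human_PRJDB3010.tsv", newline="")`. Dropping the all-N/A columns up front keeps them from skewing per-tissue statistics such as means or correlations.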

yuifu commented 4 years ago

Thank you for the quick reply! Now I understand. It would be great if such a statement were included in the documentation.

hiromasaono commented 4 years ago

Thanks a lot for the helpful advice. We have added the statement to the Download page (https://refex.dbcls.jp/download.php).

yuifu commented 4 years ago

Thanks a lot!!