mskcc / RNAseqDB


extraordinary expression value after batch correction #6

Open bzip2 opened 5 years ago

bzip2 commented 5 years ago

Hello. I believe that you have an error in this file: data/normalized/lung-rsem-fpkm-gtex.txt.gz

For MAGEA4, in sample GTEX-12KS4-0726-SM-5FQSX, the expression value is 481919091779.97.

Also, the descriptions on the help pages are misleading and should be corrected:

unnormalized: the gene expression levels calculated from fpkm of RSEM’s output. The data matrices here, however, were not the direct output of RSEM. They underwent quantile normalization, but were not corrected for batch effects.

normalized: the normalized gene expression levels (FPKM). This set of data files was not only quantile normalized, but also was corrected for batch effects (using tool ComBat).

It appears that you're labeling the quantile-normalized data as "unnormalized" and the normalized+batch-corrected data as "normalized".

Thanks for making your data and code public.

bzip2 commented 5 years ago

There are additional cases like this, but I haven't checked the genes. Please take a look at the maximum values in cervix and thyroid data from GTEx, in the files labeled "normalized".
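For anyone who wants to reproduce the check, here is a minimal sketch of how the outliers could be listed. It assumes the files are tab-separated with a header row, two leading identifier columns (gene symbol first) and one column per sample; I have not confirmed that layout for every file. The function names and the 1e6 cutoff are my own choices for illustration, not anything from the repository.

```python
import csv
import gzip


def find_extreme_values(rows, threshold=1e6, id_columns=2):
    """Yield (gene, sample, value) for every entry above `threshold`.

    `rows` is an iterable of lists of strings; the first row is the header.
    `id_columns` is the assumed number of leading non-sample columns.
    """
    rows = iter(rows)
    header = next(rows)
    samples = header[id_columns:]
    for row in rows:
        gene = row[0]  # assumed: gene symbol is in the first column
        for sample, cell in zip(samples, row[id_columns:]):
            try:
                value = float(cell)
            except ValueError:
                continue  # skip blanks and non-numeric placeholders such as "NA"
            if value > threshold:
                yield gene, sample, value


def scan_file(path, threshold=1e6, id_columns=2):
    """Open a gzipped, tab-separated matrix and return all extreme entries."""
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        return list(find_extreme_values(reader, threshold, id_columns))
```

Calling `scan_file("data/normalized/lung-rsem-fpkm-gtex.txt.gz")` should list the MAGEA4 entry if the layout assumption holds; the same call can be pointed at the cervix and thyroid files. The cutoff is arbitrary, so lower it if you want to see borderline values too.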