mskcc / RNAseqDB


Duplicated sample IDs with different expression of genes in TCGA BRCA data #10

Open mxdeluca opened 4 years ago

mxdeluca commented 4 years ago

Hi, after downloading the data you provided on figshare (brca-rsem-count-tcga.txt.gz and brca-rsem-count-tcga-t.txt.gz), I noticed duplicated sample IDs in the data, but with completely different gene expression patterns (the duplicates are not even close together in a PCA/t-SNE visualization). Any idea what may be happening there?
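For anyone who wants to reproduce the check, here is a minimal sketch of one way to list the repeated sample IDs and measure how similar each repeat is to the first occurrence. It assumes the files are tab-delimited with genes in rows and samples in columns, and that the first two columns are gene annotations; neither assumption is confirmed in this thread, so adjust `n_annotation_cols` if the layout differs. The function names are hypothetical and are not part of RNAseqDB.

```python
import csv
import math
from collections import defaultdict


def find_duplicate_columns(header):
    """Map each repeated column name to the positions where it occurs."""
    positions = defaultdict(list)
    for idx, name in enumerate(header):
        positions[name].append(idx)
    return {name: idxs for name, idxs in positions.items() if len(idxs) > 1}


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / math.sqrt(vx * vy)


def compare_duplicates(lines, n_annotation_cols=2):
    """Correlate log2(count + 1) profiles of columns that share a sample ID.

    `lines` is any iterable of text lines (for example an open file).
    ASSUMPTION: the first `n_annotation_cols` columns hold gene annotations
    and every remaining cell is numeric.
    Returns {sample_id: [r(first copy, 2nd copy), r(first copy, 3rd copy), ...]}.
    """
    reader = csv.reader(lines, delimiter="\t")
    header = next(reader)
    dups = find_duplicate_columns(header[n_annotation_cols:])
    wanted = sorted(i for idxs in dups.values() for i in idxs)
    columns = defaultdict(list)
    for row in reader:
        values = row[n_annotation_cols:]
        for i in wanted:
            columns[i].append(math.log2(float(values[i]) + 1.0))
    return {
        name: [pearson(columns[idxs[0]], columns[i]) for i in idxs[1:]]
        for name, idxs in dups.items()
    }


# Usage (file name taken from the post above):
# import gzip
# with gzip.open("brca-rsem-count-tcga-t.txt.gz", "rt") as fh:
#     for sample_id, rs in compare_duplicates(fh).items():
#         print(sample_id, rs)
```

A correlation near 1 would suggest the repeated columns are copies of the same profile, while a low value would be consistent with what I see in the PCA/t-SNE plots.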

snijesh commented 2 years ago

Check out this: