Open DarioS opened 2 years ago
Hi Dario,
Thanks for your questions.
To answer your question regarding the Pan Cancer RNA-seq datasets: I have created SummarizedExperiment objects for all the TCGA RNA-seq studies, including SKCM. I have collected many possible sample annotations and batch details for each cancer type, which can help TCGA users better understand the data, particularly the different sources of unwanted variation. However, it was almost impossible to accurately identify "gene expression based" subtypes for all cancer types, as this requires careful analysis and prior knowledge about each cancer type. We are currently working on some other TCGA cancer types, including SKCM, to find major biological subtypes in order to be able to use RUV-III-PRPS.
For the TCGA BRCA, LUAD, COAD and READ RNA-seq studies, we either identified the cancer subtypes ourselves or contacted the TCGA Research Network to provide us with those details.
I notice that more TCGA projects have subtype information. For example, "Genomic Classification of Cutaneous Melanoma", Cell, 2015 has
and this is reflected in Bioconductor's curatedTCGAData package.
Could the preprocessed data provided be more comprehensive, or is there something I am overlooking that means a data set such as melanoma can't actually be processed using the PRPS method?