cognoma / cancer-data

TCGA data acquisition and processing for Project Cognoma

Generate comprehensive comparisons between RNAseq, Mutation, and Clinical Matrix #17

Closed gwaybio closed 7 years ago

gwaybio commented 7 years ago

We need to compare the sample IDs across all three data sources (RNAseq, mutation, and the clinical matrix) and report which samples appear in each. It would be good to subset the clinical matrix to only the samples measured by RNAseq, and to file a pull request with this report.
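A minimal sketch of what such a comparison could look like, using plain Python sets. The function name `compare_sample_ids`, the report keys, and the toy sample IDs are all hypothetical and are not taken from the repository; the real IDs would come from the three data files.

```python
from itertools import combinations


def compare_sample_ids(sources):
    """Summarize the overlap between named collections of sample IDs.

    `sources` maps a source name to an iterable of sample IDs.
    """
    id_sets = {name: set(ids) for name, ids in sources.items()}
    report = {
        # Number of distinct sample IDs in each source
        "counts": {name: len(ids) for name, ids in id_sets.items()},
        # Sample IDs present in every source
        "in_all": set.intersection(*id_sets.values()),
        # Number of shared sample IDs for each pair of sources
        "pairwise": {},
        # Sample IDs found in one source and in no other
        "only_in": {},
    }
    for a, b in combinations(id_sets, 2):
        report["pairwise"][(a, b)] = len(id_sets[a] & id_sets[b])
    for name, ids in id_sets.items():
        others = set().union(*(s for n, s in id_sets.items() if n != name))
        report["only_in"][name] = ids - others
    return report


# Toy IDs made up for illustration only
sources = {
    "expression": ["S1", "S2", "S3"],
    "mutation": ["S2", "S3", "S4"],
    "clinical": ["S1", "S2", "S3", "S5"],
}
report = compare_sample_ids(sources)
print(sorted(report["in_all"]))  # ['S2', 'S3']
```

The `only_in` entries are probably the most useful part of a report like this, since they show which samples would be lost when the sources are intersected.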

gwaybio commented 7 years ago

@mike19106

dhimmel commented 7 years ago

You should be able to use the sample_ids from data/subset/expression-matrix-all-samples.tsv, which are the intersection of the expression and mutation samples.

Also, I usually wait until the last moment to drop samples, so we could probably process the clinical matrix without this information?
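As a rough sketch of this suggestion: process the full clinical matrix first, then drop samples as the final step, using the sample IDs read from the expression matrix. This assumes both files are tab-separated with a `sample_id` column, which is a guess about the layout. The helper names are hypothetical, and in-memory strings stand in for the real files so the example runs on its own.

```python
import csv
import io


def read_sample_ids(handle, id_column="sample_id"):
    """Return the sample IDs from a tab-separated file, in file order."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row[id_column] for row in reader]


def subset_rows(rows, keep_ids, id_column="sample_id"):
    """Keep only the rows whose sample ID is in `keep_ids`."""
    keep = set(keep_ids)
    return [row for row in rows if row[id_column] in keep]


# Stand-ins for data/subset/expression-matrix-all-samples.tsv and the
# clinical matrix; the columns and values here are made up.
expression_tsv = io.StringIO(
    "sample_id\tGENE_A\tGENE_B\nS2\t1.0\t2.0\nS3\t0.5\t1.5\n"
)
clinical_tsv = io.StringIO(
    "sample_id\tdisease\nS1\tX\nS2\tY\nS3\tZ\nS5\tX\n"
)

sample_ids = read_sample_ids(expression_tsv)

# Any clinical processing would happen here on all rows, with no sample
# information needed; the subset is applied only at the very end.
clinical_rows = list(csv.DictReader(clinical_tsv, delimiter="\t"))
clinical_subset = subset_rows(clinical_rows, sample_ids)

print([row["sample_id"] for row in clinical_subset])  # ['S2', 'S3']
```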