Open tdelhomme opened 6 years ago
@NoemieL, you should already have an R script that computes this from a VCF file; you just need to adapt it to the MAF format. Be careful: there is one MAF file per cohort.
@aurelieGabriel will give a presentation on TCGA data, which will also cover the MAF format.
Note that Williams et al. selected only samples with a tumor purity >70%. Estimates of tumor purity for most TCGA cohorts are available in COSMIC under "ASCAT Ploidy and Purity Estimates".
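A minimal sketch of the purity selection, in Python for illustration (the project script itself is in R). The purity values, the barcodes, and the helper names `sample_id` and `high_purity_samples` are all made up. It assumes purity is given as a fraction keyed by the 15-character TCGA sample barcode, while the MAF carries the longer aliquot barcode; check this against the actual purity table before relying on it.

```python
def sample_id(barcode, length=15):
    """Truncate a TCGA aliquot barcode to its sample-level prefix (assumed 15 characters)."""
    return barcode[:length]


def high_purity_samples(purity_by_sample, threshold=0.70):
    """Return the set of sample IDs whose purity is strictly above the threshold."""
    return {s for s, p in purity_by_sample.items() if p > threshold}


# Made-up purity estimates (fractions), keyed by sample-level barcode.
purity = {"TCGA-AA-0001-01": 0.82, "TCGA-AA-0002-01": 0.55}
keep = high_purity_samples(purity)

# Made-up aliquot barcodes as they might appear in Tumor_Sample_Barcode.
barcodes = ["TCGA-AA-0001-01A-01D-0000-08", "TCGA-AA-0002-01A-01D-0000-08"]
selected = [b for b in barcodes if sample_id(b) in keep]
print(selected)
```

If the purity table reports percentages rather than fractions, the threshold would need to be 70 instead of 0.70.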
One difficulty is that you have to filter on allelic fraction (for example >0.12 and <0.25), and MAF files don't always contain this information. VCF files do, but they are protected and are not filtered with the same QC. Maybe check with @aurelieGabriel in which cohorts the allelic fraction is available in the MAF file?
In every MAF file the following columns are now reported: t_depth (read depth across the locus in the tumor BAM) and t_alt_count (read depth supporting the variant allele in the tumor BAM). The allelic fraction can therefore be computed as t_alt_count / t_depth.
Tumor purity estimates are also available in the biospecimen/slides tables on the TCGA data portal, for example for LUAD.
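As a rough illustration of how these two columns give the allelic fraction, here is a Python sketch (the project script itself is in R). It reads a tab-separated MAF, computes t_alt_count / t_depth, and keeps the variants strictly between 0.12 and 0.25. The function names and the demo rows are hypothetical; Tumor_Sample_Barcode is the standard MAF sample column.

```python
import csv
import io


def vaf_in_range(row, low=0.12, high=0.25):
    """Return the allelic fraction if it lies strictly between low and high, else None."""
    try:
        depth = int(row["t_depth"])
        alt = int(row["t_alt_count"])
    except (KeyError, TypeError, ValueError):
        return None  # column missing or not an integer
    if depth <= 0:
        return None
    vaf = alt / depth
    return vaf if low < vaf < high else None


def filter_maf(handle, low=0.12, high=0.25):
    """Yield (Tumor_Sample_Barcode, allelic fraction) for variants inside the window."""
    # MAF files are tab-separated; lines starting with '#' are comments.
    lines = (line for line in handle if not line.startswith("#"))
    for row in csv.DictReader(lines, delimiter="\t"):
        vaf = vaf_in_range(row, low, high)
        if vaf is not None:
            yield row["Tumor_Sample_Barcode"], vaf


# Made-up three-variant MAF, for demonstration only.
demo = io.StringIO(
    "#version 2.4\n"
    "Tumor_Sample_Barcode\tt_depth\tt_alt_count\n"
    "S1\t100\t20\n"  # 0.20, inside the window
    "S1\t100\t50\n"  # 0.50, outside the window
    "S2\t0\t0\n"     # zero depth, skipped
)
kept = list(filter_maf(demo))
print(kept)
```

Rows with a missing or non-numeric depth are skipped rather than raising an error, so older MAF files without these columns simply yield nothing.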
To run an R script in your Docker container:
Test the neutral tumor evolution model described in Williams et al. on TCGA data.
Data: public MAF files
Script: an R script that outputs the regression coefficient and slope of the model for each sample.
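To make the expected output concrete, here is a Python sketch of the per-sample fit (the project script itself is in R). Under the neutral model of Williams et al., the cumulative number of mutations with allelic fraction at least f is linear in 1/f, so the sketch regresses that cumulative count on 1/f and returns the slope and R². It uses ordinary least squares with a free intercept, which is a simplification and may differ from the exact fitting procedure in the paper. The function name `neutral_fit` and the demo values are hypothetical.

```python
def neutral_fit(vafs, f_min=0.12, f_max=0.25):
    """Fit cumulative mutation count against 1/f; return (slope, r_squared) or None."""
    # Keep allelic fractions inside the window, highest first.
    fs = sorted((f for f in vafs if f_min < f < f_max), reverse=True)
    n = len(fs)
    if n < 3:
        return None  # too few mutations for a meaningful fit
    xs = [1.0 / f for f in fs]
    # Going from high to low f, the i-th mutation raises the cumulative count to i + 1.
    ys = [float(i + 1) for i in range(n)]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return None  # all allelic fractions identical
    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, r_squared


# Synthetic sample built to be exactly linear in 1/f (not real data).
demo_vafs = [1.0 / (4.5 + 0.35 * i) for i in range(11)]
slope, r_squared = neutral_fit(demo_vafs)
print(slope, r_squared)
```

In the model, the slope estimates the mutation rate per effective cell division, and a high R² is read as consistency with neutral evolution.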
Guidelines: loop over the MAF files (one per cohort).
Project source code and documentation are hosted here.
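The loop could be organized as in the following Python sketch (the project script itself is in R). It assumes one uncompressed, tab-separated MAF file per cohort in a single directory, with the file name used as the cohort label. The `*.maf` pattern, the directory layout, the function names, and the demo files are all assumptions. Allelic fractions are grouped by Tumor_Sample_Barcode so that a model can then be fitted per sample.

```python
import csv
import glob
import os
import tempfile
from collections import defaultdict


def vafs_by_sample(maf_path):
    """Read one MAF file and group allelic fractions by Tumor_Sample_Barcode."""
    groups = defaultdict(list)
    with open(maf_path, newline="") as handle:
        # Skip comment lines, then parse the tab-separated table.
        lines = (line for line in handle if not line.startswith("#"))
        for row in csv.DictReader(lines, delimiter="\t"):
            try:
                depth = int(row["t_depth"])
                alt = int(row["t_alt_count"])
            except (KeyError, TypeError, ValueError):
                continue  # depth information not available for this row
            if depth > 0:
                groups[row["Tumor_Sample_Barcode"]].append(alt / depth)
    return dict(groups)


def loop_on_cohorts(maf_dir, pattern="*.maf"):
    """Return {cohort: {sample: [allelic fractions]}} for every MAF file in maf_dir."""
    results = {}
    for path in sorted(glob.glob(os.path.join(maf_dir, pattern))):
        cohort = os.path.splitext(os.path.basename(path))[0]
        results[cohort] = vafs_by_sample(path)
    return results


# Demonstration on two tiny made-up cohort files.
header = "Tumor_Sample_Barcode\tt_depth\tt_alt_count\n"
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "LUAD.maf"), "w") as out:
        out.write(header + "S1\t100\t20\nS1\t50\t10\nS2\t40\t10\n")
    with open(os.path.join(tmp, "BRCA.maf"), "w") as out:
        out.write("#comment\n" + header + "S3\t10\t5\n")
    cohorts = loop_on_cohorts(tmp)
print(cohorts)
```

Gzipped MAF files are not handled here; they would need `gzip.open` in text mode instead of `open`.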