IARCbioinfo / SBG-CGC_course2018

IARC course on analyzing TCGA data in the SevenBridges Genomics CancerGenomicsCloud (SBG-CGC)
GNU General Public License v3.0

Project 2: Neutral tumor evolution testing #2

Open tdelhomme opened 6 years ago

tdelhomme commented 6 years ago

Test the neutral tumor evolution model described in Williams et al. on TCGA data.

Data: public MAF files

Script: an R script that outputs the regression coefficient and the slope of the model for each sample.

Guidelines: loop over the files.
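
To make the expected per-sample output concrete, here is a minimal sketch of the fit, written in Python rather than R (the same steps port directly to `lm` in R). It assumes the usual formulation of the test, in which the cumulative number of mutations M(f) with allelic fraction at least f is regressed on 1/f inside a frequency window. The function name `fit_neutral_model`, the default window (taken from the example values given later in this thread) and the choice of an ordinary least-squares fit with a free intercept are assumptions of this sketch, not something this thread specifies.

```python
from collections import namedtuple

# Hypothetical container for the two quantities the script should report.
NeutralFit = namedtuple("NeutralFit", ["slope", "intercept", "r_squared"])


def fit_neutral_model(vafs, f_min=0.12, f_max=0.25):
    """Regress cumulative mutation counts M(f) on (1/f - 1/f_max).

    vafs: iterable of variant allelic fractions for ONE sample.
    Only values with f_min < f < f_max are used.
    """
    kept = [f for f in vafs if f_min < f < f_max]
    distinct = sorted(set(kept), reverse=True)
    if len(distinct) < 3:
        raise ValueError("need at least 3 distinct allelic fractions in the window")

    # x: inverse allelic fraction, shifted so that x = 0 at f_max.
    xs = [1.0 / f - 1.0 / f_max for f in distinct]
    # y: number of mutations whose allelic fraction is >= f.
    ys = [sum(1 for v in kept if v >= f) for f in distinct]

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Coefficient of determination of the linear fit.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    return NeutralFit(slope, intercept, r_squared)
```

Calling `fit_neutral_model` once per sample and writing one line per sample (sample identifier, slope, R squared) would give the "by sample" output described above.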

Project source code and documentation are hosted here.

tdelhomme commented 6 years ago

@NoemieL, you should already have an R script that computes this from a VCF file; you just need to adapt it to MAF files. Be careful: there is one MAF file per cohort, not one per sample.
@aurelieGabriel will give a presentation on TCGA data, and therefore also on the MAF format.
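
Because a cohort-level MAF mixes all samples, the rows have to be split by sample before any per-sample fit. A minimal sketch in Python (the same idea works with `split()` on a data frame in R), assuming a tab-delimited MAF with optional `#` comment lines at the top and the standard `Tumor_Sample_Barcode` column; the function name `rows_by_sample` is hypothetical.

```python
import csv
from collections import defaultdict


def rows_by_sample(maf_lines):
    """Group MAF records by sample.

    maf_lines: any iterable of text lines, e.g. an open file handle.
    Returns a dict mapping Tumor_Sample_Barcode -> list of row dicts.
    """
    # Skip leading comment lines such as "#version ...".
    data_lines = (line for line in maf_lines if not line.startswith("#"))
    reader = csv.DictReader(data_lines, delimiter="\t")

    groups = defaultdict(list)
    for row in reader:
        groups[row["Tumor_Sample_Barcode"]].append(row)
    return dict(groups)
```

Looping over the MAF files and then over the keys of the returned dict gives one set of mutations per sample.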

mfoll commented 6 years ago

Note that Williams et al. selected only samples with a tumor purity >70%. Estimates of tumor purity for most TCGA cohorts are available in COSMIC under "ASCAT Ploidy and Purity Estimates".
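
The purity cut-off can be applied before fitting. A small sketch in Python, assuming the purity estimates have already been loaded into a dict keyed by sample identifier with values expressed as fractions between 0 and 1; the name `high_purity_samples` and the input layout are hypothetical, and the identifiers in the purity table may need to be harmonised with the MAF barcodes first.

```python
def high_purity_samples(purity_by_sample, min_purity=0.70):
    """Return the sorted sample identifiers whose purity is above the threshold.

    Samples with a missing estimate (None) are dropped.
    """
    return sorted(
        sample
        for sample, purity in purity_by_sample.items()
        if purity is not None and purity > min_purity
    )
```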

mfoll commented 6 years ago

One difficulty is that you have to filter on allelic fraction (>0.12 & <0.25 for example), and MAF files don't always contain this information. VCF files do, but they are protected and are not filtered with the same QC. Maybe check with @aurelieGabriel in which cohorts the allelic fraction is available in the MAF file?

aurelieGabriel commented 6 years ago

In every MAF file the following columns are now reported: t_depth (Read depth across the locus in tumor BAM) and t_alt_count (Read depth supporting the variant allele in tumor BAM).
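
With these two columns the allelic fraction is simply `t_alt_count / t_depth`. A minimal sketch in Python (a one-line column operation in R), using the example window of 0.12 to 0.25 quoted earlier in this thread; the function names are hypothetical, and rows with missing or zero depth are skipped rather than guessed.

```python
def allelic_fraction(row):
    """Return t_alt_count / t_depth for one MAF row, or None if unusable."""
    try:
        depth = int(row["t_depth"])
        alt = int(row["t_alt_count"])
    except (KeyError, TypeError, ValueError):
        return None  # column absent or value not an integer
    if depth <= 0:
        return None
    return alt / depth


def vafs_in_window(rows, f_min=0.12, f_max=0.25):
    """Allelic fractions strictly inside (f_min, f_max) for a list of rows."""
    fractions = (allelic_fraction(row) for row in rows)
    return [f for f in fractions if f is not None and f_min < f < f_max]
```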

mfoll commented 6 years ago

Good news! (It was not the case before).

mfoll commented 6 years ago

Tumor purity estimates are also available in the biospecimen/slides tables on the TCGA data portal, for example here for LUAD.

mfoll commented 6 years ago

To run an R script in your Docker container: