Proposed Analysis: apply TP53 & NF1 classifiers to PBTA data

AlexsLemonade / OpenPBTA-analysis

The analysis repository for the Open Pediatric Brain Tumor Atlas Project

Proposed Analysis: apply TP53 & NF1 classifiers to PBTA data #165

Closed jaclyn-taroni closed 4 years ago

jaclyn-taroni commented 5 years ago

Scientific goals

Identify samples that have TP53 or NF1 inactivation using gene expression data.

Proposed methods

This would use a pre-existing classifier (trained on TCGA data) that was described in Knijnenburg et al.
It seems appropriate to keep the poly-A and stranded samples separate for this.
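A minimal sketch of the idea above, assuming the pre-existing classifier can be expressed as a logistic regression with one coefficient per gene. The gene names, coefficient values, intercept, and toy expression values are all hypothetical placeholders, not the published ones. Each library type is standardized and scored on its own, which is one way to keep the poly-A and stranded samples separate:

```python
import math


def zscore_columns(matrix):
    """Standardize each gene (column) to mean 0 and SD 1 within one sample set."""
    n = len(matrix)
    scaled_cols = []
    for col in zip(*matrix):
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        # A gene with no variance carries no signal; set it to 0 after scaling.
        scaled_cols.append([(v - mean) / sd if sd > 0 else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_cols)]


def score_samples(expression, genes, coefficients, intercept=0.0):
    """Return a score in (0, 1) per sample from pre-trained coefficients.

    expression:   dict of sample ID -> list of values, ordered like `genes`
    coefficients: dict of gene -> weight (hypothetical values in this sketch)
    """
    sample_ids = list(expression)
    # Only genes present in both the data and the classifier are used.
    keep = [i for i, gene in enumerate(genes) if gene in coefficients]
    if not keep:
        raise ValueError("no genes shared between data and classifier")
    matrix = [[expression[s][i] for i in keep] for s in sample_ids]
    scaled = zscore_columns(matrix)
    weights = [coefficients[genes[i]] for i in keep]
    scores = {}
    for sample_id, row in zip(sample_ids, scaled):
        logit = intercept + sum(w * x for w, x in zip(weights, row))
        scores[sample_id] = 1.0 / (1.0 + math.exp(-logit))
    return scores


# Toy data: placeholder genes and weights, not the real classifier.
genes = ["GENE_A", "GENE_B", "GENE_C"]
coefficients = {"GENE_A": 1.5, "GENE_B": -2.0}

polya = {"S1": [1.0, 5.0, 2.0], "S2": [3.0, 1.0, 2.0]}
stranded = {"S3": [2.0, 2.0, 1.0], "S4": [4.0, 6.0, 1.0], "S5": [6.0, 4.0, 1.0]}

# Score each library type separately so scaling never mixes the two.
polya_scores = score_samples(polya, genes, coefficients)
stranded_scores = score_samples(stranded, genes, coefficients)
print(polya_scores)
print(stranded_scores)
```

Scaling within each set means a score is only comparable to other scores from the same library type, which is the point of keeping them apart.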

Required input data

Gene expression data. I believe relative abundance data at the gene-level is what was used to train the classifier. In this cohort that would correspond to:

pbta-gene-expression-rsem-fpkm.polya.rds
pbta-gene-expression-rsem-fpkm.stranded.rds

Proposed timeline

I think this could be accomplished in 2 weeks.

Relevant literature

Initially described in Knijnenburg et al. Cell Reports. 2018.
Applied in Rokita et al. bioRxiv. 2019.

cgreene commented 5 years ago

From a conversation with @gwaygenomics and @PichaiRaman I have some additional info that may be helpful:

Hi Pichai and Casey,

Probably the easiest way to apply the TP53 classifier is by swapping in PBTA data in this script: https://github.com/marislab/pdx-classification/blob/master/1.apply-classifier.ipynb

This is the analysis done for the PPTC PDX paper.

Thanks! Greg

jharenza commented 5 years ago

Updated to add NF1 since that classifier also works well in pediatric data and both can be accomplished at the same time.

jaclyn-taroni commented 5 years ago

Relevant pub for NF1: Way et al. BMC Genomics. 2017.

jharenza commented 5 years ago

@kgaonkar6 will work on this!

gwaybio commented 4 years ago

Perhaps important to note the distinction here:

> Relevant pub for NF1: Way et al. BMC Genomics. 2017.

This demonstrated proof of concept for NF1 loss classification in glioblastoma specifically.

> Initially described in Knijnenburg et al. Cell Reports. 2018.

This trained a TP53 alteration classifier using pan-cancer data (so many more cancer types than just glioblastoma).

> Applied in Rokita et al. bioRxiv. 2019.

This used the Knijnenburg et al. classifier for the TP53 analysis, but the NF1 and Ras coefficients were built in the original PanCancer classifier paper. The current PBTA analysis also uses the Knijnenburg coefficients for the TP53 analysis and the original PanCancer classifier coefficients for the NF1 analysis.

Edit:

Specified references for TP53 and NF1 classifiers 👀

jaclyn-taroni commented 4 years ago

Once #128 has been completed, we will want to include CNV data when evaluating the results of these classifiers. I will also note that there are currently very few NF1 alterations in the poly-A data, so the AUROC results may be a bit misleading (see discussion on #385). We may want to add confidence intervals to the plot (@cgreene found the ci.auc function in the pROC package: https://rdrr.io/cran/pROC/man/ci.auc.html).
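To illustrate why a handful of positives makes an AUROC hard to trust, here is a rough sketch of a percentile bootstrap confidence interval. It is a stand-in written in plain Python, not the pROC implementation, and the labels and scores are made-up toy values (3 altered samples out of 10):

```python
import random


def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI, resampling positives and negatives separately."""
    rng = random.Random(seed)
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    stats = []
    for _ in range(n_boot):
        # Stratified resampling keeps both classes present in every replicate.
        boot_pos = rng.choices(pos, k=len(pos))
        boot_neg = rng.choices(neg, k=len(neg))
        boot_labels = [1] * len(boot_pos) + [0] * len(boot_neg)
        stats.append(auroc(boot_labels, boot_pos + boot_neg))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return auroc(labels, scores), lower, upper


# Toy example: 1 = altered, 0 = not altered (hypothetical values).
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.35, 0.1, 0.5, 0.25]

point, lower, upper = bootstrap_auroc_ci(labels, scores)
print(f"AUROC = {point:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

With so few positives, each resample can change the estimate substantially, so the interval comes out wide. That is the reason for showing it alongside the point estimate on the plot.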

jaclyn-taroni commented 4 years ago

Addressed via the linked pull requests. Closing in favor of filing an updated analysis ticket if needed.

kgaonkar6 commented 4 years ago

Hi @jaclyn-taroni, do we need an updated analysis ticket for this analysis that includes CNV data?

jaclyn-taroni commented 4 years ago

Good idea @kgaonkar6