AlexsLemonade / OpenPBTA-analysis

The analysis repository for the Open Pediatric Brain Tumor Atlas Project

Revision: update telomerase analysis to assess whether samples with TERTp have high scores #1651

Closed jharenza closed 1 year ago

jharenza commented 1 year ago

What analysis module should be updated and why?

https://github.com/AlexsLemonade/OpenPBTA-analysis/tree/master/analyses/telomerase-activity-prediction

Reviewer commented that we did not assess whether samples with TERT promoter (TERTp) mutations have high telomerase scores

What changes need to be made? Please provide enough detail for another participant to make the update.

Plot telomerase scores and highlight tumors with TERT promoter mutations C228T and C250T?

What input data should be used? Which data were used in the version being updated?

SNV consensus MAF + hotspot MAF (to ensure we capture all TERTp), plus telomerase scores
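A minimal sketch of how the two MAF sources might be pooled to flag samples with either hotspot. It assumes standard MAF column names (`Chromosome`, `Start_Position`, `Tumor_Sample_Barcode`) and GRCh38 coordinates chr5:1295113 (C228T) and chr5:1295135 (C250T); both the coordinates and the helper name `tertp_hotspot_samples` are assumptions for illustration and should be checked against the module README. The rows are toy data, not project results.

```python
# Hypothetical sketch: flag samples carrying either TERT promoter hotspot.
# ASSUMPTION: GRCh38 positions for C228T and C250T; verify before use.
TERTP_HOTSPOTS = {
    ("chr5", 1295113): "C228T",
    ("chr5", 1295135): "C250T",
}


def tertp_hotspot_samples(*maf_tables):
    """Return {sample barcode: set of hotspot names}, pooled across MAF tables.

    Each table is an iterable of dict rows keyed by standard MAF column names.
    """
    hits = {}
    for rows in maf_tables:
        for row in rows:
            key = (row["Chromosome"], int(row["Start_Position"]))
            name = TERTP_HOTSPOTS.get(key)
            if name is not None:
                hits.setdefault(row["Tumor_Sample_Barcode"], set()).add(name)
    return hits


# Toy rows standing in for the consensus and hotspot MAF files.
consensus = [
    {"Chromosome": "chr5", "Start_Position": "1295113", "Tumor_Sample_Barcode": "S1"},
    {"Chromosome": "chr5", "Start_Position": "1295500", "Tumor_Sample_Barcode": "S2"},
]
hotspot = [
    {"Chromosome": "chr5", "Start_Position": "1295135", "Tumor_Sample_Barcode": "S3"},
    {"Chromosome": "chr5", "Start_Position": "1295113", "Tumor_Sample_Barcode": "S1"},
]

print(tertp_hotspot_samples(consensus, hotspot))
```

Pooling into a dict keyed by sample means a variant reported in both files is counted once per sample.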

When do you expect the revised analysis will be completed?

this week?

Who will complete the updated analysis?

?

sjspielman commented 1 year ago

Hi @jharenza I'm going to start having a look here. I'm wondering if you have a reference for the TERTp mutations C228T and C250T you listed? I found this paper which does specifically list those two (https://aacrjournals.org/mcr/article/14/4/315/132821/Understanding-TERT-Promoter-Mutations-A-Common), but I wonder if we should be looking more generally for any mutations upstream of TERT within a given bp range? That said I have yet to actually look at the data so I don't know how much of this is actually in our WGS data!

jharenza commented 1 year ago

I don't think so. If there are others, we may not be capturing them (e.g., they could have been called by only 1 or 2 of the 4 callers). These are the two hotspot mutations (more in the README here)

sjspielman commented 1 year ago

@jharenza Ok, I've poked around a bit. The data here is pretty thin (too thin?) I think. To just get something started, I looked at any tumors with a mutation in the TERT 5' flank (which notably contains more than just the promoter!). There are only 15 samples (pooled from both the hotspots and consensus MAF files), and only 12 of those have corresponding stranded RNA samples. I didn't dive into genomic coordinates to check if any of these are those two hotspot mutations. A quick Wilcoxon test between samples with and without mutations shows P=0.119.

Black horizontal line below is the global median.

[Image: screenshot, 2023-01-19 4:20:07 PM]

I can go ahead and file this as a PR but wanted to check first in case this strategy is just not at all the right way to go!
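For reference, the comparison described above can be sketched roughly as follows. This is an illustration only, not the code behind the reported P value: the scores and sample IDs are made up, `rank_sum_test` is a hypothetical helper, and it uses a normal approximation without tie or continuity correction, so its P values will differ from an exact test on groups this small.

```python
import math
from statistics import median


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum (Mann-Whitney U) test.

    Uses the normal approximation with no tie or continuity correction.
    Returns (U statistic for x, approximate two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    # U counts the (x, y) pairs where x exceeds y; ties count as half.
    u = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, p


# Made-up telomerase scores keyed by sample ID (not project data).
scores = {"S1": 0.9, "S2": 0.4, "S3": 0.8, "S4": 0.3, "S5": 0.7}
mutated = {"S1", "S3", "S5"}  # hypothetical samples with a 5' flank mutation

with_mut = [s for k, s in scores.items() if k in mutated]
without_mut = [s for k, s in scores.items() if k not in mutated]

u, p = rank_sum_test(with_mut, without_mut)
global_median = median(scores.values())  # the horizontal reference line
print(f"U={u}, approx p={p:.3f}, global median={global_median}")
```

With so few mutated samples, an exact test from a statistics package would be the safer choice for anything reported.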

jharenza commented 1 year ago

Looks like a good strategy, but I think we really want to know what the scores are for those with hotspot mutations, because those are known to increase telomerase activity