mskcc / tempo

CCS research pipeline to process WES and WGS TN pairs
https://cmotempo.netlify.com/

Rescue mutations that are oncogenic at 2% frequency #922

Open anoronh4 opened 3 years ago

anoronh4 commented 3 years ago

Currently we detect mutations at variant allele frequencies as low as 5% and rescue hotspot mutations as low as 2%. We should also rescue mutations that OncoKB labels Likely Oncogenic, Predicted Oncogenic, or Oncogenic. Below is a not-so-random sample of calls where Hotspot==TRUE, tallied by the final filter that was applied, the corresponding OncoKB oncogenic label, and the Hotspot flag (in that column order, preceded by the count).

for i in /juno/work/tempo/wes_repo/Results/v1.4.x/somatic/s_C_M*/combined_mutations/*somatic.unfiltered.maf ; do cut -f 109,209,233 "$i" | grep "TRUE$" ; done | sort | uniq -c
      1 high_n_alt_count    Likely Oncogenic    TRUE
      2 high_n_alt_count    Oncogenic   TRUE
      1 low_n_depth ""  TRUE
      1 low_vaf;high_n_alt_count;high_gnomad_pop_af;PoN Predicted Oncogenic TRUE
      2 low_vaf Likely Neutral  TRUE
      1 low_vaf Likely Oncogenic    TRUE
      1 low_vaf Predicted Oncogenic TRUE
      2 low_vaf ""  TRUE
      1 multiallelic2;low_vaf   Likely Oncogenic    TRUE
      1 multiallelic2;low_vaf;low_t_alt_count   ""  TRUE
      2 multiallelic2;low_vaf   Predicted Oncogenic TRUE
      5 multiallelic2;low_vaf   ""  TRUE
      1 part_of_mnv Oncogenic   TRUE
      2 PASS    Likely Neutral  TRUE
    104 PASS    Likely Oncogenic    TRUE
     80 PASS    Oncogenic   TRUE
     14 PASS    Predicted Oncogenic TRUE
      8 PASS    ""  TRUE
      1 PASS        TRUE

It is worth noting that about 5% of the Hotspot mutations did not have one of the aforementioned oncogenic labels.
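The proposed rule can be sketched as a small shell function. This is a minimal sketch that assumes only the VAF threshold changes: calls at 5% or above are kept, and calls between 2% and 5% are kept only if they are hotspots or carry one of the three oncogenic labels. The function name `keep_call` and its arguments are hypothetical and not part of Tempo, and the other filters seen in the table (for example high_n_alt_count or PoN) are not modeled.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not Tempo code: applies only the VAF thresholds
# described in this issue (5% general, 2% for rescued calls).
# Usage: keep_call VAF HOTSPOT(TRUE|FALSE) ONCOGENIC_LABEL
keep_call() {
  local vaf="$1" hotspot="$2" label="$3"
  awk -v vaf="$vaf" -v hotspot="$hotspot" -v label="$label" 'BEGIN {
    # A call is eligible for rescue if it is a hotspot or has an oncogenic label.
    rescue = (hotspot == "TRUE" ||
              label == "Oncogenic" ||
              label == "Likely Oncogenic" ||
              label == "Predicted Oncogenic")
    # Force numeric comparison with "+ 0".
    if (vaf + 0 >= 0.05 || (rescue && vaf + 0 >= 0.02))
      print "keep"
    else
      print "drop"
  }'
}
```

For example, `keep_call 0.03 FALSE "Predicted Oncogenic"` prints `keep`, while `keep_call 0.03 FALSE "Likely Neutral"` prints `drop`.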

anoronh4 commented 2 years ago

Update: Because OncoKB is updated continuously and is not versioned, and Tempo's OncoKB annotation is not re-triggered when the OncoKB database changes, we are considering skipping this annotation in Tempo altogether or shifting it further downstream. The most current version of OncoKB can then be applied outside of Tempo, either by a live service or by an analyst. However, we will have to make sure that all mutations at >=2% are available for annotation downstream.
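One way to hand every call at or above 2% to a downstream annotator is to filter the unfiltered MAF on VAF alone. The following is a rough sketch, not Tempo's implementation: the function name `export_min_vaf` is hypothetical, and it assumes the MAF is tab-separated and has `t_alt_count` and `t_depth` columns, which it looks up by header name rather than by position.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the header plus every row whose tumor VAF
# (t_alt_count / t_depth, assumed column names) is >= MIN_VAF (default 0.02).
# Usage: export_min_vaf FILE.maf [MIN_VAF]
export_min_vaf() {
  local maf="$1" min_vaf="${2:-0.02}"
  awk -F'\t' -v min="$min_vaf" '
    /^#/ { print; next }                 # pass through comment lines
    !header_seen {
      # Map column names to positions so hard-coded indices are not needed.
      for (i = 1; i <= NF; i++) col[$i] = i
      if (!("t_alt_count" in col) || !("t_depth" in col)) {
        print "missing t_alt_count or t_depth column" > "/dev/stderr"
        exit 1
      }
      header_seen = 1
      print
      next
    }
    # Skip zero-depth rows to avoid division by zero.
    $(col["t_depth"]) > 0 && $(col["t_alt_count"]) / $(col["t_depth"]) >= min + 0
  ' "$maf"
}
```

The output keeps all original columns, so whichever OncoKB annotation step runs later sees the same fields it would have seen inside the pipeline.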