mskcc / tempo

CCS research pipeline to process WES and WGS TN pairs
https://cmotempo.netlify.com/

Zygosity annotation of mutations should be done with hisens rather than purity #967

Open anoronh4 opened 2 years ago

anoronh4 commented 2 years ago

We currently use the purity facets result to annotate the maf, but I believe we should use the hisens result instead. It looks like the clinical impact runs are run using hisens, and for annotating point mutations I think it makes sense to use a result with less smoothing.

somatic annotation: https://github.com/mskcc/tempo/blob/41a0bf5b1182eb7c8b2789221086c0f9254711cd/pipeline.nf#L1838-L1844

germline annotation: https://github.com/mskcc/tempo/blob/41a0bf5b1182eb7c8b2789221086c0f9254711cd/pipeline.nf#L2335-L2340

Here is an example annotation for a variant (select columns):

Hugo_Symbol  t_var_freq        n_var_freq         tcn  lcn  cf  purity             expected_alt_copies  allelic_imbalance  loss_of_heterozygosity  zygosity_flag
BRCA2        0.69811320754717  0.541666666666667  2    1    1   0.521339308396323  3                    ALT_GAIN           TRUE                    AI_LOH_ALT

We found that using the purity result led to tcn and lcn annotations in the maf that we didn't expect. For example, in the variant above, tcn/lcn is 2/1, but the zygosity flag says there is LOH. Taken together, purity and t_var_freq suggest that it is in fact LOH. The hisens copy number puts the segment at 1/0, which would be consistent with t_var_freq (if I understand it correctly).
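
To make the reasoning above concrete, here is a rough sanity check. It is not the facets-suite annotation code, just the usual tumor/normal mixture formula. It assumes a diploid normal, a clonal segment (cf = 1), and that the variant is a germline heterozygous one (my reading of n_var_freq being about 0.54). The function name `expected_vaf` and its parameters are made up for this sketch.

```python
def expected_vaf(purity, tcn, alt_copies, germline_het=False):
    """Expected tumor VAF under a simple tumor/normal mixture model.

    Assumptions (not taken from the pipeline): the normal is diploid, the
    segment is clonal, and a germline het variant contributes one alt copy
    from the normal fraction of the sample.
    """
    normal_alt = 1 if germline_het else 0
    alt = purity * alt_copies + (1 - purity) * normal_alt
    total = purity * tcn + (1 - purity) * 2
    return alt / total


purity = 0.521339308396323      # purity column of the example row
observed = 0.69811320754717     # t_var_freq column of the example row

# (tcn, alt copies in tumor) for the two competing copy-number calls
candidates = {
    "purity result, tcn/lcn 2/1, alt on one copy": (2, 1),
    "hisens result, tcn/lcn 1/0, alt copy retained": (1, 1),
}

for label, (tcn, alt_copies) in candidates.items():
    vaf = expected_vaf(purity, tcn, alt_copies, germline_het=True)
    print(f"{label}: expected {vaf:.3f}, observed {observed:.3f}")
```

Under these assumptions the 2/1 call predicts a VAF of about 0.50 and the 1/0 call about 0.68, so the hisens segment is the one that sits close to the observed t_var_freq.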

anoronh4 commented 2 years ago

This should be noted in the release notes for v2.