mskcc / tempo

CCS research pipeline to process WES and WGS TN pairs

Mutect2 column is enforced in germline calls even though mutect is not used #916

Open anoronh4 opened 3 years ago

anoronh4 commented 3 years ago

We use HaplotypeCaller and Strelka2 to call germline mutations, whereas somatic mutations are called with MuTect2 and Strelka2. However, a MuTect2 column is still created in the germline MAF when vcf2maf.pl is called (the --retain-info parameter generates it as an empty column), while the HaplotypeCaller field is not retained. The MuTect2 column is then used later even though it contains no useful information: https://github.com/mskcc/tempo/blob/499ee2a1cbbcab28279902546710f7850d2464ee/containers/vcf2maf/filter-germline-maf.R#L57

We might consider retaining the HaplotypeCaller column in the germline MAF instead, and adjusting the R script to use that column rather than MuTect2.
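
To make the proposal concrete, here is a rough sketch of the idea in Python (for illustration only; the actual filter is an R script and may work differently). It assumes the MAF carries one column per caller and that a non-empty value means the caller reported the variant. The `CALLER_COLUMNS` mapping, the `caller_support` helper, and the `1` placeholder values are all hypothetical and should be checked against the real MAF header.

```python
import csv
import io

# Hypothetical mapping of call type to the caller columns expected in the MAF.
# Column names are assumptions for illustration, not taken from the pipeline.
CALLER_COLUMNS = {
    "somatic": ["MuTect2", "Strelka2"],
    "germline": ["HaplotypeCaller", "Strelka2"],
}


def caller_support(row, call_type):
    """Return the callers whose column is present and non-empty for this row.

    An empty column (such as the MuTect2 column in the germline MAF)
    contributes nothing.
    """
    return [c for c in CALLER_COLUMNS[call_type] if (row.get(c) or "").strip()]


# Tiny tab-separated example standing in for a germline MAF.
maf_text = (
    "Hugo_Symbol\tMuTect2\tHaplotypeCaller\tStrelka2\n"
    "BRCA1\t\t1\t1\n"
    "TP53\t\t1\t\n"
)
rows = list(csv.DictReader(io.StringIO(maf_text), delimiter="\t"))
support = {r["Hugo_Symbol"]: caller_support(r, "germline") for r in rows}
print(support)
```

Keying the column list on the call type would let the germline filter ignore the empty MuTect2 column without changing how somatic MAFs are handled.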

anoronh4 commented 3 years ago

We should also consider separating custom scripts such as filter-germline-maf.R from the containers to make updates easier. It does not make sense to reinstall a container every time a small change is made to a custom script.