nf-core / sarek

Analysis pipeline to detect germline or somatic variants (pre-processing, variant calling and annotation) from WGS / targeted sequencing
https://nf-co.re/sarek
MIT License

Mutect2 multi-sample mode #916

Closed. berguner closed this issue 1 year ago

berguner commented 1 year ago

Description of feature

Hi, it would be great if we could run Mutect2 in multi-sample mode for increased sensitivity and better concordance among samples. ~~For reference, there is a WDL implementation here: https://github.com/broadinstitute/gatk/tree/master/scripts/mutect2_wdl#mutect2_multi_sample~~

It turns out that the WDL linked above performs tumor/normal somatic variant calling separately for each of multiple pairs, which is not what I meant. What I mean is better described in the discussion and article linked below:
https://gatk.broadinstitute.org/hc/en-us/community/posts/360071839192-Mutect2-multi-sample-pipeline
https://gatk.broadinstitute.org/hc/en-us/articles/360035894731-Somatic-short-variant-discovery-SNVs-Indels-

This comment summarises the steps: https://gatk.broadinstitute.org/hc/en-us/community/posts/360071839192/comments/360012234432

It seems the GATK team hasn't implemented or provided a Mutect2 pipeline that performs multi-sample variant calling yet. Here is a quick and dirty implementation that I did for a project back in the day: https://github.com/berguner/variant_calling_pipeline/blob/master/mutect2_multiple_tumor.wdl
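To make the request concrete, here is a minimal sketch of what a joint (multi-sample) Mutect2 call looks like at the command level, as opposed to one call per tumor/normal pair. It only builds the argument list for the calling step; filtering and other downstream steps are left out. The helper name `build_mutect2_multisample_cmd` and all file and sample names are hypothetical, and the sketch assumes that all BAMs come from the same individual and that the normal sample names match the `SM` tags in the normal BAM read groups.

```python
import shlex
from typing import List, Sequence


def build_mutect2_multisample_cmd(
    reference: str,
    tumor_bams: Sequence[str],
    normal_bams: Sequence[str],
    normal_sample_names: Sequence[str],
    output_vcf: str,
) -> List[str]:
    """Build the argv for one joint Mutect2 call over all samples of a patient.

    Hypothetical helper for illustration only; it does not run GATK.
    """
    if not tumor_bams:
        raise ValueError("at least one tumor BAM is required")

    cmd = ["gatk", "Mutect2", "-R", reference]

    # Every BAM (tumor and normal) is passed with its own -I argument,
    # so all samples are called together in a single run.
    for bam in list(tumor_bams) + list(normal_bams):
        cmd += ["-I", bam]

    # Normals are identified by sample name (assumed to match the SM tag),
    # not by file path.
    for name in normal_sample_names:
        cmd += ["-normal", name]

    cmd += ["-O", output_vcf]
    return cmd


if __name__ == "__main__":
    # Hypothetical patient with two tumor samples and one matched normal.
    argv = build_mutect2_multisample_cmd(
        reference="ref.fasta",
        tumor_bams=["patient1_tumor_a.bam", "patient1_tumor_b.bam"],
        normal_bams=["patient1_normal.bam"],
        normal_sample_names=["patient1_normal"],
        output_vcf="patient1.mutect2.vcf.gz",
    )
    print(shlex.join(argv))
```

In a pipeline, this would mean grouping samples by patient before the calling step, rather than emitting one Mutect2 job per tumor/normal pair.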

berguner commented 1 year ago

FYI, I am working on this issue here: https://github.com/berguner/nf-core-sarek/tree/mutect2_multi_sample