nf-core / sarek

Analysis pipeline to detect germline or somatic variants (pre-processing, variant calling and annotation) from WGS / targeted sequencing
https://nf-co.re/sarek
MIT License
388 stars, 401 forks

RFE: Variant decomposition #1609

Open lbeltrame opened 1 month ago

lbeltrame commented 1 month ago

Description of feature

This time I tried to look through issues and code. ;)

As far as I can see, sarek does not do any kind of variant decomposition (splitting multi-allelic or complex records into simpler ones). Coupled with normalization and applied prior to annotation, this can simplify processing of the produced VCF file by downstream tools. "bcbio" did this back in the day. If this is implemented in sarek, it should definitely be optional and turned off by default.
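To make the idea concrete, here is a minimal Python sketch of what decomposition plus a simplified normalization does to a single multi-allelic record. It is not how sarek, bcbio, or any existing tool implements it: the function names (`trim_alleles`, `decompose`) are hypothetical, genotype and INFO fields are ignored, and the normalization only trims shared bases without left-aligning against a reference genome, which a real tool would also do.

```python
from typing import List, Tuple

# (chrom, 1-based pos, ref, alt); a hypothetical, stripped-down record type
Record = Tuple[str, int, str, str]


def trim_alleles(pos: int, ref: str, alt: str) -> Tuple[int, str, str]:
    """Trim bases shared by REF and ALT, keeping at least one base in each.

    Simplification: no left-alignment against a reference sequence.
    """
    # Trim the shared suffix first; the position does not change.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Then trim the shared prefix; each trimmed base shifts the position.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


def decompose(chrom: str, pos: int, ref: str, alts: str) -> List[Record]:
    """Split a comma-separated ALT field into one biallelic record per allele."""
    return [(chrom, *trim_alleles(pos, ref, alt)) for alt in alts.split(",")]


if __name__ == "__main__":
    # One record with two alternate alleles: a deletion and a SNV.
    for record in decompose("chr1", 100, "CAG", "CA,TAG"):
        print(record)
```

After this step each record carries exactly one alternate allele in its most compact form, which is what makes records from different callers easier to compare.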

FriederikeHanssen commented 1 month ago

Hey! Yes, I think this would be great to have. We discussed it back and forth a bunch, but I think it definitely makes sense and would also simplify merging the output of multiple callers.

lbeltrame commented 1 month ago

I used to use vt back in the day for this kind of job, but I'm not aware of anything newer.

EDIT: Apparently bcftools norm may do the job.
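For reference, a hedged sketch of how such a bcftools norm call could be assembled from Python. The helper name `bcftools_norm_cmd` and all file names are placeholders, not anything that exists in sarek. The flags used are standard bcftools norm options: `-m -any` splits multi-allelic sites into biallelic records, and `-f` supplies the reference FASTA needed for left-alignment and normalization of indels.

```python
import subprocess
from typing import List


def bcftools_norm_cmd(in_vcf: str, out_vcf: str, fasta: str) -> List[str]:
    """Build a bcftools norm command line (hypothetical helper)."""
    return [
        "bcftools", "norm",
        "-m", "-any",   # split multi-allelic sites into biallelic records
        "-f", fasta,    # reference FASTA used for left-alignment
        "-O", "z",      # write bgzip-compressed VCF
        "-o", out_vcf,
        in_vcf,
    ]


if __name__ == "__main__":
    # Placeholder paths; running this requires bcftools on PATH.
    cmd = bcftools_norm_cmd("calls.vcf.gz", "calls.norm.vcf.gz", "genome.fasta")
    print(" ".join(cmd))
    # subprocess.run(cmd, check=True)  # uncomment to actually execute
```

Whether this covers everything vt did (for example, breaking up multi-nucleotide variants) would need to be checked against the bcftools version in use.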