varfish-org / mehari

VEP-like tool for sequence ontology and HGVS annotation of VCF files
MIT License

Implement multiallelic variants/variant normalization #447

Open holtgrewe opened 1 month ago

holtgrewe commented 1 month ago

Is your feature request related to a problem? Please describe.
Currently, mehari does not support multi-allelic sites. This is a big limitation: users must first split such sites with bcftools norm -m -any --force.

Describe the solution you'd like
Allow mehari to process multi-allelic sites. For precise predictions, mehari will also need to normalize the split records, which requires access to the FASTA reference. Because splitting and normalization can change record positions, mehari will need to re-sort the variants before writing them out so that the resulting VCF file stays sorted.
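As a rough illustration of the three steps (split, normalize, re-sort), here is a minimal Python sketch. It is not mehari's implementation: the `Variant` type and the function names are hypothetical, the reference is assumed to be an in-memory dict of contig name to sequence instead of an indexed FASTA file, and per-allele INFO/FORMAT fields (which a real splitter must also split) are ignored. Normalization follows the usual trim-and-left-align procedure for biallelic records.

```python
from typing import Dict, List, NamedTuple, Tuple


class Variant(NamedTuple):
    """Hypothetical biallelic record; pos is 1-based as in VCF."""
    chrom: str
    pos: int
    ref: str
    alt: str


def split_multiallelic(chrom: str, pos: int, ref: str, alts: List[str]) -> List[Variant]:
    """Emit one biallelic record per ALT allele."""
    return [Variant(chrom, pos, ref, alt) for alt in alts]


def normalize(v: Variant, reference: Dict[str, str]) -> Variant:
    """Trim shared bases and left-align an indel against the reference."""
    # Leave symbolic alleles, the '*' allele, and non-variants untouched.
    if v.alt.startswith("<") or v.alt == "*" or v.ref == v.alt:
        return v
    seq = reference[v.chrom]
    pos, ref, alt = v.pos, v.ref, v.alt
    changed = True
    while changed:
        changed = False
        # Drop a shared trailing base, unless that would empty an allele
        # at position 1, where there is no base to the left to pad with.
        can_trim = pos > 1 or (len(ref) > 1 and len(alt) > 1)
        if ref and alt and ref[-1] == alt[-1] and can_trim:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # If an allele became empty, extend both alleles one base to the left.
        if (not ref or not alt) and pos > 1:
            pos -= 1
            base = seq[pos - 1]
            ref, alt = base + ref, base + alt
            changed = True
    # Drop shared leading bases while both alleles keep at least one base.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return Variant(v.chrom, pos, ref, alt)


def split_and_normalize(
    records: List[Tuple[str, int, str, List[str]]],
    reference: Dict[str, str],
) -> List[Variant]:
    """Split, normalize, then re-sort by contig order and position."""
    contig_rank = {name: i for i, name in enumerate(reference)}
    out = [
        normalize(v, reference)
        for chrom, pos, ref, alts in records
        for v in split_multiallelic(chrom, pos, ref, alts)
    ]
    out.sort(key=lambda v: (contig_rank[v.chrom], v.pos, v.ref, v.alt))
    return out


if __name__ == "__main__":
    reference = {"chr1": "GGGCACACACAGGG"}
    records = [
        ("chr1", 6, "C", ["T"]),
        ("chr1", 8, "CAC", ["C", "CAT"]),  # multi-allelic site
    ]
    for variant in split_and_normalize(records, reference):
        print(variant)
```

In this toy input, the deletion allele at position 8 left-aligns to position 3 and so ends up ahead of the record at position 6, which is why the output has to be sorted again after normalization.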

Describe alternatives you've considered
N/A

Additional context
N/A

tedil commented 1 month ago

But why should normalization functionality be replicated in mehari? It makes much more sense to stick with bcftools norm: it already exists, it is (in some way or another) tried and tested, and it spares us from maintaining even more functionality. When encountering multi-allelic sites, mehari could simply bail and remind the user to normalize.
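The "bail and remind" alternative could look roughly like the following Python sketch. It is only an illustration under stated assumptions: `MultiAllelicSiteError` and `check_biallelic` are hypothetical names, the input is assumed to be a well-formed, tab-separated VCF data line, and mehari's actual error handling is not shown in this thread.

```python
class MultiAllelicSiteError(ValueError):
    """Hypothetical error raised when a record has more than one ALT allele."""


def check_biallelic(vcf_line: str) -> None:
    """Raise if a VCF data line lists several comma-separated ALT alleles."""
    if vcf_line.startswith("#"):
        return  # header lines carry no alleles
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos, alt = fields[0], fields[1], fields[4]  # CHROM, POS, ALT
    if "," in alt:
        raise MultiAllelicSiteError(
            f"multi-allelic site at {chrom}:{pos} (ALT={alt}); "
            "please split it first, e.g. with `bcftools norm -m -any`"
        )


if __name__ == "__main__":
    check_biallelic("chr1\t8\t.\tCAC\tC\t.\t.\t.")  # biallelic: passes silently
    try:
        check_biallelic("chr1\t8\t.\tCAC\tC,CAT\t.\t.\t.")
    except MultiAllelicSiteError as err:
        print(f"error: {err}")
```

This keeps the tool small, at the cost of pushing the extra preprocessing step onto every user.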

holtgrewe commented 1 month ago

VEP can handle it, so should we...