samtools / bcftools

This is the official development repository for BCFtools. See installation instructions and other documentation here http://samtools.github.io/bcftools/howtos/install.html
http://samtools.github.io/bcftools/

bcftools norm <DEL> for very long variants #2029

Closed christopher-schroeder closed 10 months ago

christopher-schroeder commented 10 months ago

Hi, it is great that 1.18 supports normalization of the symbolic <DEL> notation. However, some of the deletions called in my data are longer than 20,000,000 bp (they are probably false positives, but such things might happen, e.g. with radiotherapy). Instead of keeping the symbolic representation, bcftools norm converts them to the explicit representation, so it writes the complete 20,000,000 bp into the record's REF field. While technically correct, I would HIGHLY prefer the previous <DEL> instead. Maybe an optional flag or a threshold for choosing between the explicit and the symbolic representation would be nice.

pd3 commented 10 months ago

This was actually not intended, thank you for reporting the problem. I believe the ALT column behaved correctly; the problem was in expanding the REF allele. It is now fixed.
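
For illustration only, the length check behind the threshold idea raised in the first comment could be sketched as below. This is not bcftools code and does not describe how the fix was implemented. The function names and the `max_explicit_len` default are hypothetical, and the sketch assumes a tab-separated VCF record whose INFO field carries `SVLEN` or `END` for the symbolic `<DEL>`.

```python
def parse_info(info):
    """Parse a VCF INFO string into a dict; flag-only keys map to True."""
    fields = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def deletion_length(pos, info):
    """Length of a symbolic <DEL>: |SVLEN| if present, else END - POS.

    Returns None when neither tag is available.
    """
    fields = parse_info(info)
    if "SVLEN" in fields:
        return abs(int(fields["SVLEN"].split(",")[0]))
    if "END" in fields:
        return int(fields["END"]) - pos
    return None


def keep_symbolic(vcf_line, max_explicit_len=1000):
    """Hypothetical policy: keep <DEL> symbolic when it is longer than
    max_explicit_len bp (or when its length cannot be determined)."""
    cols = vcf_line.rstrip("\n").split("\t")
    pos, alt, info = int(cols[1]), cols[4], cols[7]
    if alt != "<DEL>":
        return False  # not a symbolic deletion, nothing to decide
    length = deletion_length(pos, info)
    return length is None or length > max_explicit_len


long_del = "chr1\t1000\t.\tA\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=20001000;SVLEN=-20000000"
short_del = "chr1\t1000\t.\tA\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=1010;SVLEN=-10"
print(keep_symbolic(long_del), keep_symbolic(short_del))
```

A script like this could be used to list the records that would produce very large REF fields before running an affected bcftools version on them.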