samtools / bcftools

This is the official development repository for BCFtools. See the installation instructions and other documentation at http://samtools.github.io/bcftools/howtos/install.html
http://samtools.github.io/bcftools/

varscan vcf file "/" in Alt field #506

Closed tedtoal closed 7 years ago

tedtoal commented 7 years ago

VCF files produced by VARSCAN (v 2.4.2) separate multiple alleles in the Alt field with slashes instead of commas. It would be nice if the bcftools norm command could fix this.

pd3 commented 7 years ago

Validating and fixing a VCF exhaustively is expensive; the format can be violated in too many ways. BCFtools assumes a valid VCF on input, with some exceptions. Although it would be easy to add support for this particular case and for issue #507, the right solution here is to open an issue with VARSCAN: they should produce a valid VCF.

tedtoal commented 7 years ago

I have opened issues with VARSCAN. Still, this is so common that it would be nice to have a tool that supports fixing it. I understand that fixing the multitude of possible violations is a problem, yet having a tool to do that would be valuable. I thought bcftools was the most logical place to add support for fixing at least some of the more common violations, but maybe this calls for a dedicated "VCF standardization" tool.
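For illustration only, the kind of pre-processing step discussed here could look roughly like the sketch below. It is not part of bcftools or VarScan; the function names `fix_alt_separators` and `fix_vcf_lines` are made up for this example. It assumes an uncompressed, tab-delimited VCF in which any "/" in the ALT column is a misplaced allele separator, and it leaves header lines and every other column (including genotypes such as 0/1) untouched.

```python
from typing import Iterable, Iterator


def fix_alt_separators(line: str) -> str:
    """Replace '/' with ',' in the ALT column (5th field) of one VCF line.

    Assumption: a '/' in ALT is always a misplaced allele separator.
    """
    if line.startswith("#"):
        return line  # meta-information and header lines pass through unchanged

    # Remember whether the line ended with a newline so it can be restored.
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 5:
        return line  # too few columns to have an ALT field; leave untouched

    fields[4] = fields[4].replace("/", ",")  # ALT is the 5th column
    return "\t".join(fields) + newline


def fix_vcf_lines(lines: Iterable[str]) -> Iterator[str]:
    """Apply fix_alt_separators lazily to every line of a VCF stream."""
    for line in lines:
        yield fix_alt_separators(line)
```

Something like `sys.stdout.writelines(fix_vcf_lines(sys.stdin))` would turn this into a filter that can sit in front of bcftools norm. It only covers this one violation and does no other validation.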

lh3 commented 7 years ago

I agree with @pd3. I don't think separating alleles with "/" is common; I have never seen it elsewhere. All mainstream variant callers use ",".