Closed tedtoal closed 7 years ago
It is expensive to validate and fix a VCF exhaustively, because the format can be violated in too many ways. BCFtools assumes a valid VCF on input, with some exceptions. Although it would be easy to add support for this particular case and for issue #507, the right solution here is to open an issue with VARSCAN; they should produce a valid VCF.
I have opened issues with VARSCAN. Still, this is so common that it would be nice to have a tool that supports fixing it. I understand that fixing the multitude of possible violations is a problem, yet a tool that does so would be valuable. I thought bcftools was the most logical place to add support for fixing at least some of the more common violations, but maybe this calls for a dedicated "VCF standardization" tool.
I agree with @pd3. I don't think separating alleles with "/" is common – never seen it elsewhere. All mainstream variant callers are using ",".
VCF files produced by VARSCAN (v 2.4.2) separate multiple alleles in the ALT field with slashes instead of commas. It would be nice if the bcftools norm command could fix this.
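
As a stopgap outside bcftools, the slash-separated ALT column described above could be rewritten with a small script before the file is passed on. The following is only a minimal sketch, not part of bcftools or VARSCAN: the function names `fix_alt_separators` and `fix_vcf_lines` are hypothetical, and it assumes a tab-delimited, uncompressed VCF in which "/" never appears legitimately inside an ALT allele (for example, in a contig name within a breakend). It does not validate anything else in the record.

```python
from typing import Iterable, Iterator

ALT_COLUMN = 4  # 0-based index of ALT in a VCF data line (CHROM POS ID REF ALT ...)


def fix_alt_separators(line: str) -> str:
    """Replace '/' with ',' in the ALT column of a single VCF line.

    Header lines (starting with '#') and lines with fewer than five
    tab-separated columns are returned unchanged.
    """
    if line.startswith("#"):
        return line
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    if len(fields) <= ALT_COLUMN:
        return line
    # Assumption: every '/' in ALT is a misplaced allele separator.
    fields[ALT_COLUMN] = fields[ALT_COLUMN].replace("/", ",")
    return "\t".join(fields) + newline


def fix_vcf_lines(lines: Iterable[str]) -> Iterator[str]:
    """Apply fix_alt_separators to each line of a VCF, lazily."""
    for line in lines:
        yield fix_alt_separators(line)
```

Only the ALT column is touched, so slashes elsewhere in the record (unphased genotypes such as `0/1`, or INFO values) are left alone. To use it on a file, `fix_vcf_lines` can be given an open file handle and its output written to a new file, which can then be read by bcftools.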