EBIvariation / vcf-validator

Validation suite for Variant Call Format (VCF) files, implemented using C++11
Apache License 2.0

Strange duplicated variant problem without ref allele ? #227

Closed: sajeevbatra closed this issue 6 months ago

sajeevbatra commented 1 year ago

Hello, thanks for your VCF validator tool. After running it on my file, the validator flags the following error: Error: Duplicated variant 7:5302279:>A found. This occurs 2 time(s), first time in line 317797. What does this mean? What happened to the REF allele? Is this truly an error?

Original VCF:

    7 5302278 ABC T TA      (line 317797)
    7 5302279 ABX C A,AC

How do I fix this error? Thanks.

tcezard commented 1 year ago

Hi @sajeevbatra, this is a genuine error in your VCF. The first line, 7 5302278 ABC T TA, describes the following variant:

    5302278
          |
REF:      T-C
ALT:      TAC

The next line, 7 5302279 ABX C A,AC, describes 3 alleles (REF plus two ALTs), shown here with the preceding T at 5302278 for context:

     5302278
           |
REF:       TC
ALT1:      TA
ALT2:      TAC

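As you can see, the ALT of the first line describes the same allele as ALT2 of the second line, which is what the validator is reporting. Once the context base shared by REF and ALT is trimmed from each record (the leading T in T>TA, the trailing C in C>AC), both reduce to an insertion of A at position 5302279 with an empty REF, which is why the message reads 7:5302279:>A.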
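
To make the comparison concrete, here is a minimal Python sketch of that trimming, applied to the two records from this issue. It is not the validator's actual code: the function names normalize and find_duplicates are made up for illustration, and the trimming rule (strip shared trailing bases, then shared leading bases) is an assumption about how equivalent alleles can be compared.

```python
from collections import defaultdict


def normalize(pos, ref, alt):
    """Trim bases shared by REF and ALT (hypothetical helper, not the validator's API)."""
    # Drop shared trailing bases; the position does not change.
    while ref and alt and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Drop shared leading bases; each one shifts the position right by one.
    while ref and alt and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


def find_duplicates(records):
    """Group record IDs by normalized allele and keep groups seen more than once."""
    seen = defaultdict(list)
    for chrom, pos, rec_id, ref, alts in records:
        for alt in alts:
            npos, nref, nalt = normalize(pos, ref, alt)
            seen[f"{chrom}:{npos}:{nref}>{nalt}"].append(rec_id)
    return {key: ids for key, ids in seen.items() if len(ids) > 1}


# The two records from this issue: (CHROM, POS, ID, REF, [ALT, ...])
records = [
    ("7", 5302278, "ABC", "T", ["TA"]),
    ("7", 5302279, "ABX", "C", ["A", "AC"]),
]

duplicates = find_duplicates(records)
for key, ids in duplicates.items():
    print(f"Duplicated variant {key} found in records {', '.join(ids)}")
```

Under these assumptions, T>TA at 5302278 and C>AC at 5302279 both trim down to the key 7:5302279:>A, while C>A at 5302279 stays as 7:5302279:C>A and is not flagged.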