EBIvariation / vcf-validator

Validation suite for Variant Call Format (VCF) files, implemented using C++11
Apache License 2.0

vcf_validator incorrectly marks missing value as error #223

Closed mark-lubberts closed 11 months ago

mark-lubberts commented 1 year ago

According to vcf_validator (v0.9.4), the following VCF line is invalid because the all-missing sample does not have 3 values for GL:

MWJL01000001.1 2 . ACGGCTGTAGCGATTGTCG ACGGCTGTGGCGATTGTCG 1313.56 . . GT:DP:RO:QR:AO:QA:GL ./.:6:6:239:0:0:0,-1.80618,-21.5877 0/1:35:10:408:25:980:-75.6366,0,-24.9462 0/0:39:35:1285:0:0:0,-10.536,-112.341 0/0:11:11:407:0:0:0,-3.31133,-32.8794 0/0:25:24:880:0:0:0,-7.22472,-78.8649 ./.:2:2:72:0:0:0,-0.60206,-6.64108 0/0:12:12:479:0:0:0,-3.61236,-41.6227 0/0:23:23:842:0:0:0,-6.92369,-73.4174 ./.:8:8:293:0:0:0,-2.40824,-21.7511 0/0:26:26:954:0:0:0,-7.82678,-80.18 0/0:28:27:992:0:0:0,-8.12781,-88.5667 0/0:54:54:1979:0:0:0,-16.2556,-176.966 0/0:22:22:881:0:0:0,-6.62266,-78.4785 ./.:.:.:.:.:.:. 0/0:21:21:770:0:0:0,-6.32163,-69.0186 0/1:36:7:252:29:1069:-83.9418,0,-11.8107 0/0:41:41:1515:0:0:0,-12.3422,-129.452 ./.:2:2:81:0:0:0,-0.60206,-7.68573 0/0:23:23:841:0:0:0,-6.92369,-70.8629 0/0:16:15:604:0:0:0,-4.51545,-50.2643 0/0:23:23:842:0:0:0,-6.92369,-73.906 0/0:24:24:884:0:0:0,-7.22472,-74.5307 0/0:17:17:618:0:0:0,-5.11751,-54.0436 0/0:31:30:1209:0:0:0,-9.0309,-103.707 0/0:25:25:918:0:0:0,-7.52575,-76.4784

Error: Sample #14, field GL does not match the meta specification Number=G (expected 3 value(s)). This occurs 737 time(s), first time in line 9495.

However, the vcf specification states in section 1.6.2 that:

If a field contains a list of missing values, it can be represented either as a single MISSING value (‘.’) or as a list of missing values (e.g. ‘.,.,.’ if the field was Number=3).

This means the single . for GL in sample 14 (the ./.:.:.:.:.:.:. entry that the error message points to) should be valid.
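To make the expected behaviour concrete, here is a minimal Python sketch of the cardinality rule as I read it from the spec text quoted above. It is not the validator's actual C++ code. The function names `expected_genotype_count` and `cardinality_ok` are hypothetical and exist only for illustration.

```python
from math import comb

MISSING = "."


def expected_genotype_count(num_alt_alleles, ploidy=2):
    """Number of values a Number=G field carries: one per possible genotype."""
    num_alleles = num_alt_alleles + 1  # REF plus every ALT allele
    # Unordered genotypes of `ploidy` alleles drawn with repetition.
    return comb(num_alleles + ploidy - 1, ploidy)


def cardinality_ok(raw_value, expected):
    """Accept a lone '.' for a wholly missing field, else require `expected` entries."""
    if raw_value == MISSING:
        return True
    return len(raw_value.split(",")) == expected


# One ALT allele and a diploid call, as in the line above.
expected = expected_genotype_count(num_alt_alleles=1)

# GL values as they appear in the line above, plus the '.,.,.' form from the spec.
for gl in ["0,-1.80618,-21.5877", ".", ".,.,."]:
    print(gl, cardinality_ok(gl, expected))
```

Under this reading, both `.` and `.,.,.` pass for a Number=G field with one ALT allele, while a list with the wrong number of entries would still be rejected.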

tcezard commented 11 months ago

Addressed in #224