EBIvariation / vcf-validator

Validation suite for Variant Call Format (VCF) files, implemented using C++11
Apache License 2.0

EVA-494 include error counts in the report #85

Closed jmmut closed 6 years ago

jmmut commented 6 years ago

Currently we have a StdoutReport that writes one line for each error. A more user-friendly report would list each error type only once, with a count of its occurrences and, optionally, the first error of each type as an example.

examples:

change from this:

Line 174: ALT metadata ID does not begin with DEL/INS/DUP/INV/CNV
Line 175: ALT metadata ID does not begin with DEL/INS/DUP/INV/CNV
Line 176: ALT metadata ID does not begin with DEL/INS/DUP/INV/CNV
Line 201: Error in INFO metadata
Line 203: Info 'ssID' is not listed in a valid meta-data INFO entry (warning)
According to the VCF v4.1 specification, the input file is not valid

to this:

ALT metadata ID does not begin with DEL/INS/DUP/INV/CNV. This occurs 3 times, first time in line 174.
Error in INFO metadata. This occurs 3 times, first time in line 201.
Info 'ssID' is not listed in a valid meta-data INFO entry (warning). This occurs 300 times, first time in line 203.
According to the VCF v4.1 specification, the input file is not valid

maybe also copying the original text of the first occurrence (don't focus too much on this if it's too hard; it's not necessary):

INFO AC value must be a non-negative integer number. This occurs 300 times, first time in line 14. Original text: AC=-1
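
A minimal sketch of how such a report could aggregate errors, written in C++11 to match the project. The names `SummaryReport`, `ErrorSummary`, `add` and `str` are hypothetical and are not part of the existing vcf-validator code. The sketch also assumes that the message text is what identifies an error type, and that errors should be listed in the order they were first seen.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical aggregate for one error type (not an existing vcf-validator type).
struct ErrorSummary {
    std::size_t count;       // how many times this message was reported
    std::size_t first_line;  // line number of the first occurrence
    std::string first_text;  // optional original text of the first occurrence
};

// Hypothetical report that groups errors by message instead of printing each one.
class SummaryReport {
public:
    // Record one error. Assumption: the message string identifies the error type.
    void add(std::size_t line, const std::string& message,
             const std::string& original_text = "") {
        std::map<std::string, ErrorSummary>::iterator it = summaries_.find(message);
        if (it == summaries_.end()) {
            // First occurrence: remember the line, the text and the arrival order.
            order_.push_back(message);
            summaries_.insert(
                std::make_pair(message, ErrorSummary{1, line, original_text}));
        } else {
            ++it->second.count;
        }
    }

    // Render one line per error type, in order of first appearance.
    std::string str() const {
        std::ostringstream out;
        for (const std::string& message : order_) {
            const ErrorSummary& s = summaries_.at(message);
            out << message << ". This occurs " << s.count
                << (s.count == 1 ? " time" : " times")
                << ", first time in line " << s.first_line << ".";
            if (!s.first_text.empty()) {
                out << " Original text: " << s.first_text;
            }
            out << "\n";
        }
        return out.str();
    }

private:
    std::map<std::string, ErrorSummary> summaries_;  // message -> summary
    std::vector<std::string> order_;                 // messages in first-seen order
};
```

Calling `add(174, ...)`, `add(175, ...)` and `add(176, ...)` with the same ALT message and then streaming `str()` to stdout would give a single line ending in "This occurs 3 times, first time in line 174." Keeping a separate vector for ordering is one design choice; sorting by count or by first line would work just as well.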