EBIvariation / vcf-validator

Validation suite for Variant Call Format (VCF) files, implemented using C++11
Apache License 2.0

EVA-2188 - Exit assembly checker when no Genbank synonyms are found for a contig #210

Closed by sundarvenkata-EBI 3 years ago

tcezard commented 3 years ago

I'm not sure that adding this functionality to the assembly checker is actually what we want. I think I agree that we want to check that all the contigs have an INSDC accession, but doing it in the assembly checker seems wrong, since its purpose is exclusively to check the reference bases. Here you're validating the assembly report instead of the VCF.

sundarvenkata-EBI commented 3 years ago

> Here you're validating the assembly report instead of the VCF.

This is a fair point. But reporting the presence of non-Genbank contigs requires parsing the assembly report as well as scanning the VCF for offending contigs. Currently, the technical machinery for both is available in the assembly checker, so using it was the more pragmatic choice. As we discussed in the DEV meeting today, I will 1) add a command-line flag for the user to explicitly state that a VCF is meant for submission to EVA and 2) make sure that this check is performed only when the flag is set.

Addressed in this commit.
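
For readers following the thread, here is a minimal C++11 sketch of the flag-gated check described above. It is not the code from the commit: the names `SynonymMap`, `contigs_without_genbank`, `passes_submission_check` and the `require_genbank` parameter are hypothetical, and the assembly report is modelled as a plain map from contig name to Genbank accession, where an empty string or "na" is assumed to mean that no Genbank synonym exists.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a parsed assembly report:
// contig name as used in the VCF -> Genbank accession.
// Assumption: "" or "na" means no Genbank synonym is available.
using SynonymMap = std::map<std::string, std::string>;

// Returns each distinct VCF contig that has no Genbank synonym,
// in order of first appearance.
std::vector<std::string> contigs_without_genbank(
        const std::vector<std::string>& vcf_contigs,
        const SynonymMap& report) {
    std::vector<std::string> offending;
    std::set<std::string> seen;
    for (const auto& contig : vcf_contigs) {
        if (!seen.insert(contig).second) {
            continue;  // contig already examined
        }
        auto it = report.find(contig);
        if (it == report.end() || it->second.empty() || it->second == "na") {
            offending.push_back(contig);
        }
    }
    return offending;
}

// The check only runs when the (hypothetical) submission flag is set;
// otherwise the VCF passes unconditionally.
bool passes_submission_check(bool require_genbank,
                             const std::vector<std::string>& vcf_contigs,
                             const SynonymMap& report) {
    if (!require_genbank) {
        return true;
    }
    return contigs_without_genbank(vcf_contigs, report).empty();
}
```

With the flag unset, the reference-base check would behave exactly as before; with it set, a caller could report the contigs returned by `contigs_without_genbank` and exit.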