ComparativeGenomicsToolkit / cactus

Official home of genome aligner based upon notion of Cactus graphs

use bcftools view to sanity check every vcf #1417

Closed glennhickey closed 1 week ago

glennhickey commented 1 week ago

vg deconstruct can apparently produce invalid VCFs (#1402, #1416). This PR runs bcftools stats after every VCF is created, in the hope of catching any errors right away. Hopefully I can get the data to fix deconstruct ASAP, but it's probably a good idea in general to spend a few minutes validating VCFs (it has been well worth it to add vg validate for all the various graphs that are created).

The stats are also written to the output in <vcf path>.stats, which may be convenient for some users.

Update: bcftools view seems better than bcftools stats at catching formatting errors in some cases, so I switched to that instead. As a result, no extra stats file is written after all.
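For anyone who wants to run the same kind of check by hand, here is a minimal sketch of the idea, not the code from this PR. The function names (bcftools_view_cmd, sanity_check_vcf) and the injectable run parameter are hypothetical, and it assumes bcftools is on the PATH and exits with a non-zero status when it cannot parse the file, which may not hold for every kind of formatting problem.

```python
import subprocess


def bcftools_view_cmd(vcf_path):
    """Build a command that makes bcftools read the whole VCF and discard the output."""
    # -o /dev/null: we only care whether parsing succeeds, not about the output itself
    return ["bcftools", "view", "-o", "/dev/null", vcf_path]


def sanity_check_vcf(vcf_path, run=subprocess.run):
    """Raise RuntimeError if bcftools view fails on vcf_path.

    `run` is a hypothetical hook (defaults to subprocess.run) so the check
    can be exercised without bcftools installed.
    """
    proc = run(bcftools_view_cmd(vcf_path), capture_output=True, text=True)
    if proc.returncode != 0:
        # Assumption: a non-zero exit status signals a malformed VCF
        raise RuntimeError(
            f"bcftools view failed on {vcf_path}:\n{proc.stderr}"
        )
    return vcf_path
```

Calling sanity_check_vcf("out.vcf.gz") right after each VCF is written would surface a parse failure at the step that produced the file, rather than in some downstream tool.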