Using bcftools for initial VCF parsing and sorting in the `scripts/*_validate_input.py` scripts. This also means we don't need to worry about whether the input VCF is compressed or uncompressed, since bcftools handles both.
Keeping only autosomal, X, Y and M/MT chromosomes (i.e. discarding variants on alt contigs).
My tests seem to pass, and I don't expect corner cases from these relatively simple changes. `bcftools sort` orders records as 1-22, X, Y, M (edit: actually, it follows the contig order given in the VCF header metadata).
@sigven please sanity-check those filters; the logic seems to match what we had previously.
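
To make the filters easier to review, here is a minimal sketch of the filter-then-sort step described above. It is not the code from this PR: the function names, the file names, and the choice to accept both `1`- and `chr1`-style contig names are assumptions for illustration, and it assumes `bcftools view --targets` quietly skips names that are absent from the input.

```python
import subprocess


def canonical_contigs(prefix=""):
    """Autosomes plus X, Y and M/MT, optionally prefixed (e.g. 'chr')."""
    names = [str(i) for i in range(1, 23)] + ["X", "Y", "M", "MT"]
    return [prefix + name for name in names]


def build_commands(vcf_in, vcf_out):
    """Build the two bcftools command lines (hypothetical helper)."""
    # Assumption: accept both naming styles so one target list fits either.
    targets = ",".join(canonical_contigs() + canonical_contigs("chr"))
    # --targets streams the file, so no index is needed and plain or
    # compressed input is read the same way.
    view_cmd = ["bcftools", "view", "--targets", targets, vcf_in]
    # Sort the filtered stream from stdin and write bgzipped output.
    sort_cmd = ["bcftools", "sort", "-O", "z", "-o", vcf_out, "-"]
    return view_cmd, sort_cmd


def filter_and_sort(vcf_in, vcf_out):
    """Pipe 'bcftools view' into 'bcftools sort'; raise if either fails."""
    view_cmd, sort_cmd = build_commands(vcf_in, vcf_out)
    view = subprocess.Popen(view_cmd, stdout=subprocess.PIPE)
    try:
        subprocess.run(sort_cmd, stdin=view.stdout, check=True)
    finally:
        view.stdout.close()
    if view.wait() != 0:
        raise subprocess.CalledProcessError(view.returncode, view_cmd)
```

Calling `filter_and_sort("input.vcf.gz", "sorted.vcf.gz")` would drop records on alt contigs and write a sorted, compressed VCF; the `##contig` header lines themselves are left untouched by this sketch.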