fedarko / strainFlye

Pipeline for analyzing (rare) mutations in metagenome-assembled genomes
BSD 3-Clause "New" or "Revised" License

Extra sanity checking: when multiple inputs are given, check that contig lengths match? #33

Closed by fedarko 2 years ago

fedarko commented 2 years ago

Sort of a sequel to #32.

Contig lengths are easy to get for both BCF files (since we require that the header give a length for each contig) and BAM files (using the .lengths property of a pysam.AlignmentFile object). It'd be nice to add these checks at the start of each command; the actual work could be abstracted into a utility function.
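A rough sketch of what such a utility function might look like, not necessarily what strainFlye ended up doing. The function names (`find_length_mismatches`, `check_contig_lengths`, `bam_contig_lengths`, `bcf_contig_lengths`) are made up for illustration, and the sketch assumes that only contigs shared between inputs need to agree (a contig present in just one file is not treated as an error here).

```python
def find_length_mismatches(lengths_by_source):
    """Return a list of (contig, source1, length1, source2, length2) tuples.

    lengths_by_source maps a label for each input (e.g. "BAM", "BCF") to a
    dict of contig name -> length. Each contig's length is compared against
    the first input in which that contig was seen.
    """
    first_seen = {}  # contig name -> (source label, length)
    mismatches = []
    for source, lengths in lengths_by_source.items():
        for contig, length in lengths.items():
            if contig not in first_seen:
                first_seen[contig] = (source, length)
            elif first_seen[contig][1] != length:
                prev_source, prev_length = first_seen[contig]
                mismatches.append(
                    (contig, prev_source, prev_length, source, length)
                )
    return mismatches


def check_contig_lengths(lengths_by_source):
    """Raise ValueError if any shared contig has inconsistent lengths."""
    mismatches = find_length_mismatches(lengths_by_source)
    if mismatches:
        details = "; ".join(
            f"{contig}: {s1} says {l1}, {s2} says {l2}"
            for contig, s1, l1, s2, l2 in mismatches
        )
        raise ValueError(f"Contig lengths differ between inputs: {details}")


# Hypothetical helpers for building the mappings with pysam. pysam is
# imported inside each helper so the checks above work without it.
def bam_contig_lengths(bam_path):
    import pysam

    with pysam.AlignmentFile(bam_path) as bam:
        # .references and .lengths are parallel tuples
        return dict(zip(bam.references, bam.lengths))


def bcf_contig_lengths(bcf_path):
    import pysam

    with pysam.VariantFile(bcf_path) as bcf:
        # Relies on each contig header line declaring a length
        return {name: c.length for name, c in bcf.header.contigs.items()}
```

Each command could then call something like `check_contig_lengths({"BAM": bam_contig_lengths(bam_path), "BCF": bcf_contig_lengths(bcf_path)})` before doing any real work, so a mismatch fails early with one clear message.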

I don't imagine that this will come up in practice much, but I'm sure it'll happen sooner or later, and failing to account for it will cause a smorgasbord of silly errors.

fedarko commented 2 years ago

Closing, since everything that can use these checks now does!