Contig lengths are easily available for BCF files (since we require that the header gives a length for each contig) and for BAM files (using the `.lengths` property of a `pysam.AlignmentFile` object). It'd be nice to check, at the start of each command, that these lengths match the lengths of the corresponding contigs in the FASTA file. We should be able to abstract the actual work to a utility function.
I don't imagine this will come up much in practice, but I'm sure it'll happen sooner or later, and failing to account for it will cause a smorgasbord of silly errors.
[x] Add a utility function for testing that BAM/BCF files' lengths match FASTA lengths
[x] Modify `phasing_utils.load_triplet()` to use both these checks
[x] Modify `call` to use the BAM-FASTA check
[x] Modify `fdr estimate` to use the BCF-FASTA check
[ ] Modify `matrix compute` to use the BAM-FASTA check (when it's implemented...)
[x] Modify `align` to check that FASTA lengths match lengths in the GFA file
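For illustration, here is a rough sketch of what the shared utility function could look like. The names `check_contig_lengths` and `ContigLengthMismatchError` are made up for this example and are not the project's actual API. The sketch also assumes that contigs present in only one of the two files should be skipped rather than reported, which may not be the behavior we want.

```python
from typing import Mapping


class ContigLengthMismatchError(ValueError):
    """Raised when a contig's length differs between two files (hypothetical name)."""


def check_contig_lengths(
    fasta_lengths: Mapping[str, int],
    other_lengths: Mapping[str, int],
    other_label: str,
) -> None:
    """Raise if any contig shared by both mappings has two different lengths.

    Assumption: contigs present in only one of the two mappings are skipped.
    """
    # Collect every mismatch so the error message reports all of them at once.
    mismatches = [
        f"{contig}: FASTA length {fasta_lengths[contig]:,}, "
        f"{other_label} length {other_len:,}"
        for contig, other_len in other_lengths.items()
        if contig in fasta_lengths and fasta_lengths[contig] != other_len
    ]
    if mismatches:
        raise ContigLengthMismatchError(
            f"Contig lengths in the {other_label} file do not match the FASTA file: "
            + "; ".join(mismatches)
        )
```

The function takes plain name-to-length mappings, so the same code can serve the BAM, BCF, and GFA checks. With pysam, the BAM mapping could be built as `dict(zip(bam.references, bam.lengths))`, and the BCF mapping from the `length` of each entry in `bcf.header.contigs`.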
Sort of a sequel to #32.