onecodex / finch-rs

A genomic minhashing implementation in Rust
https://www.onecodex.com
MIT License
92 stars, 8 forks

dist takes parameters that are silently ignored when passed sketch files #28

Closed hgbrian closed 4 years ago

hgbrian commented 5 years ago

This works as expected:

finch dist -p -n100 -k100 *.fasta

This does not work as expected:

finch dist -p -n100 -k100 *.fasta.sk

The parameters k and n appear to be ignored when a sketch is passed to dist. (It does check if k>256 though.)

It makes sense that k and n would have no effect, since the sketch already has a k-mer length and a number of hashes. However, this caused me some confusion for a while...

bovee commented 5 years ago

We should almost certainly blow up in this case.

The code path here is complicated because it tries to take the parameters from the sketch, so that e.g. finch dist knows how to compare a sketch to a FASTA. Unfortunately, this seems to be too complicated: we've hit a few cases where the way we choose the reference parameters is completely counterintuitive (see also #24). We should probably clean up how we're doing argument parsing and parameter selection.
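As a rough sketch of what "blowing up" could look like: compare the values the user passed explicitly against the ones stored in the sketch, and return an error on any mismatch instead of silently ignoring the flag. Everything here (SketchParams, CliOverrides, check_overrides, the field names) is hypothetical and made up for illustration; none of it is finch's actual code or API.

```rust
// Hypothetical types for illustration only; not finch's real API.

/// Parameters recorded inside an existing sketch file.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SketchParams {
    pub kmer_length: u8,
    pub n_hashes: usize,
}

/// Values the user passed explicitly on the command line.
/// `None` means the flag was not given at all.
#[derive(Debug, Clone, Copy, Default)]
pub struct CliOverrides {
    pub kmer_length: Option<u8>,
    pub n_hashes: Option<usize>,
}

/// Fail loudly if an explicit flag disagrees with the sketch,
/// rather than silently preferring the sketch's stored value.
pub fn check_overrides(
    stored: SketchParams,
    cli: CliOverrides,
) -> Result<SketchParams, String> {
    if let Some(k) = cli.kmer_length {
        if k != stored.kmer_length {
            return Err(format!(
                "-k {} conflicts with sketch k-mer length {}",
                k, stored.kmer_length
            ));
        }
    }
    if let Some(n) = cli.n_hashes {
        if n != stored.n_hashes {
            return Err(format!(
                "-n {} conflicts with sketch hash count {}",
                n, stored.n_hashes
            ));
        }
    }
    // No explicit flag, or every flag agrees: use the sketch's parameters.
    Ok(stored)
}
```

With this shape, a call like the failing one above (-n100 -k100 against a sketch built with different values) would return an Err that the CLI can print before exiting, while omitting the flags keeps the current behavior of using whatever the sketch contains.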

hgbrian commented 5 years ago

works for me, thanks!