onecodex / finch-rs

A genomic minhashing implementation in Rust
https://www.onecodex.com
MIT License

Internal refactoring and support for binary Mash format #23

Closed bovee closed 6 years ago

bovee commented 6 years ago

This adds support for reading and writing Mash-formatted binary files. Reading is transparent as long as the file has the msh extension; writing is enabled via the --binary-format flag. The feature is gated behind a Cargo feature flag that is enabled by default, so compiling without it requires --no-default-features. Note also that you have to set the hash seed to 42 (--hash 42) for strict comparability with Mash, or you'll get various errors (see https://github.com/marbl/Mash/issues/74).
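As a rough illustration of what "transparent" reading by extension could look like, here is a minimal sketch. The names SketchFormat and detect_format are hypothetical and are not taken from the finch code base; the exact, case-sensitive match on msh is also an assumption.

```rust
use std::path::Path;

/// Hypothetical formats a sketch file might be stored in.
#[derive(Debug, PartialEq)]
enum SketchFormat {
    /// Mash binary format, chosen when the extension is exactly "msh".
    MashBinary,
    /// Anything else (whatever the tool's other format is).
    Other,
}

/// Pick a reader from the file extension alone (assumption: no content sniffing).
fn detect_format(path: &Path) -> SketchFormat {
    match path.extension().and_then(|ext| ext.to_str()) {
        Some("msh") => SketchFormat::MashBinary,
        _ => SketchFormat::Other,
    }
}
```

With a dispatch like this, callers never name the format when reading; only the write path needs an explicit switch such as --binary-format.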

This is also a big cleanup of some of our internal Sketch classes that will affect downstream code, both internally and externally (see https://github.com/luizirber/sourmash-rust/pull/1 for an example).