With this PR: https://github.com/dib-lab/sourmash/pull/1009 - the sourmash `Nodegraph` becomes more appealing to use than khmer's `Nodegraph`. This is because the newly added `n_unique_kmers` attribute allows the minimum necessary read length for a given false positive rate to be computed with this equation:
This is low-hanging fruit for speeding up the module, but backwards compatibility may be tricky for older bloom filters. There's probably a fairly straightforward try/except thing to use here, though.
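
A minimal sketch of what that try/except could look like, assuming older bloom filters simply lack `n_unique_kmers` and raise `AttributeError` when it is accessed. The helper name `unique_kmers_or_fallback`, the `fallback` callable, and the two stand-in classes are hypothetical and only there for illustration. Nothing here calls sourmash or khmer, and whether `n_unique_kmers` is exposed as a plain attribute or as a method is not confirmed, so the sketch accepts both.

```python
def unique_kmers_or_fallback(graph, fallback):
    """Return the graph's unique k-mer count, or fallback(graph) if it has none.

    `graph` is any nodegraph-like object; `fallback` is a caller-supplied
    function used for older bloom filters (hypothetical, for illustration).
    """
    try:
        value = graph.n_unique_kmers
    except AttributeError:
        # Assumed failure mode: an older bloom filter without the count.
        return fallback(graph)
    # Accept either a plain attribute or a zero-argument method.
    return value() if callable(value) else value


class NewStyleGraph:
    """Stand-in for a nodegraph that records its unique k-mer count."""
    n_unique_kmers = 1200


class OldStyleGraph:
    """Stand-in for an older bloom filter with no such count."""
```

With these stand-ins, `unique_kmers_or_fallback(NewStyleGraph(), estimate)` returns the stored count, while an `OldStyleGraph` is routed to `estimate`, which could be the existing code path for older filters.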