oxli-bio / oxli

k-mers and the like
BSD 3-Clause "New" or "Revised" License

Multithreaded file and string consumption #22

Open Adamtaranto opened 2 months ago

Adamtaranto commented 2 months ago

How fast can we make kmer counting in Rust?

@ctb, are there examples of this from Sourmash or other libraries?

ctb commented 2 months ago

Yes, it's actually really easy ;). The trick is to make sure we're using iterators and closures wherever possible; then we just change iter to par_iter, import rayon's prelude, and voilà.
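To make the iterator-and-closure idea concrete, here is a minimal std-only sketch. It is not oxli's actual code: `Counts`, `count_one`, `merge`, and `count_all` are hypothetical names, and k-mers are stored as raw byte vectors rather than hashes. The comment in `count_all` marks the single line where, under these assumptions, the rayon swap would go.

```rust
use std::collections::HashMap;

// Hypothetical count table: k-mer bytes -> number of occurrences.
type Counts = HashMap<Vec<u8>, u64>;

// Count the k-mers of a single sequence using a sliding window.
fn count_one(seq: &[u8], k: usize) -> Counts {
    let mut counts = Counts::new();
    if k > 0 {
        // `windows` panics on a size of 0, hence the guard above.
        seq.windows(k)
            .for_each(|kmer| *counts.entry(kmer.to_vec()).or_insert(0) += 1);
    }
    counts
}

// Merge one count table into another.
fn merge(mut a: Counts, b: Counts) -> Counts {
    b.into_iter()
        .for_each(|(kmer, n)| *a.entry(kmer).or_insert(0) += n);
    a
}

// Sequential version written as a map/fold chain.
fn count_all(records: &[Vec<u8>], k: usize) -> Counts {
    // Assumed rayon equivalent, after `use rayon::prelude::*;`:
    //   records.par_iter().map(|seq| count_one(seq, k)).reduce(Counts::new, merge)
    records
        .iter()
        .map(|seq| count_one(seq, k))
        .fold(Counts::new(), merge)
}
```

Because each record yields its own table and `merge` is associative, the work can be split across threads without sharing a mutable map, which is what makes the `iter` to `par_iter` change small.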

This was next on my list after https://github.com/dib-lab/oxli/pull/10 gets finished off.

Adamtaranto commented 1 month ago

Might also save some time if we reverse-complement the full DNA sequence once and pass a sliding window backwards through it (as in sourmash seqtohashes), instead of calculating the reverse complement for every k-mer.
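A rough sketch of that idea, assuming uppercase ACGT input and canonical k-mers chosen as the lexicographic minimum of the forward and reverse-complement strands. The function names are hypothetical, and this is not sourmash's implementation. The key observation is that for a sequence of length `n`, the reverse complement of the forward k-mer at offset `i` is the slice `rc[n - k - i .. n - i]` of the full reverse complement.

```rust
// Reverse-complement a whole sequence in one pass.
// Assumption: uppercase A/C/G/T only; any other byte becomes 'N'.
fn revcomp(seq: &[u8]) -> Vec<u8> {
    seq.iter()
        .rev()
        .map(|b| match b {
            b'A' => b'T',
            b'C' => b'G',
            b'G' => b'C',
            b'T' => b'A',
            _ => b'N',
        })
        .collect()
}

// Return the canonical form of every k-mer, in forward order.
fn canonical_kmers(seq: &[u8], k: usize) -> Vec<Vec<u8>> {
    let n = seq.len();
    if k == 0 || n < k {
        return Vec::new();
    }
    // Computed once for the whole sequence, not once per k-mer.
    let rc = revcomp(seq);
    (0..=n - k)
        .map(|i| {
            let fwd = &seq[i..i + k];
            // The window moves backwards through `rc` as `i` moves forwards.
            let rev = &rc[n - k - i..n - i];
            std::cmp::min(fwd, rev).to_vec()
        })
        .collect()
}
```

For example, with `seq = b"ATCGG"` and `k = 3`, the full reverse complement is `CCGAT`, and the forward k-mers `ATC`, `TCG`, `CGG` pair with `GAT`, `CGA`, `CCG` respectively.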

Adamtaranto commented 1 month ago

@ctb what do you think about making the user specify a thread number for rayon? I think it will try to use all available by default.
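For reference, rayon lets the pool size be set with `rayon::ThreadPoolBuilder::new().num_threads(n)` or the `RAYON_NUM_THREADS` environment variable. To show what a user-supplied thread count could look like without depending on rayon, below is a std-only stand-in using `std::thread::scope`. It is a sketch, not a proposal for oxli's API; `count_kmers`, `count_parallel`, and the `n_threads` parameter are hypothetical.

```rust
use std::collections::HashMap;
use std::thread;

// Add the k-mers of one sequence to an existing count table.
fn count_kmers(seq: &[u8], k: usize, counts: &mut HashMap<Vec<u8>, u64>) {
    if k == 0 {
        return; // `windows(0)` would panic
    }
    for kmer in seq.windows(k) {
        *counts.entry(kmer.to_vec()).or_insert(0) += 1;
    }
}

// Count k-mers over all records using at most `n_threads` worker threads.
fn count_parallel(records: &[Vec<u8>], k: usize, n_threads: usize) -> HashMap<Vec<u8>, u64> {
    let n_threads = n_threads.max(1);
    // Ceiling division so that no more than `n_threads` chunks are produced.
    let chunk_size = ((records.len() + n_threads - 1) / n_threads).max(1);

    thread::scope(|s| {
        // One worker per chunk, each filling its own local table.
        let handles: Vec<_> = records
            .chunks(chunk_size)
            .map(|chunk| {
                s.spawn(move || {
                    let mut local = HashMap::new();
                    for seq in chunk {
                        count_kmers(seq, k, &mut local);
                    }
                    local
                })
            })
            .collect();

        // Merge the per-thread tables on the calling thread.
        let mut total = HashMap::new();
        for handle in handles {
            for (kmer, n) in handle.join().expect("worker thread panicked") {
                *total.entry(kmer).or_insert(0) += n;
            }
        }
        total
    })
}
```

Taking the thread count as an explicit argument keeps the default decision with the caller, for instance a CLI flag that falls back to all available cores when unset.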

Adamtaranto commented 3 weeks ago

Some other Rust kmer counting projects for ideas: