@dkoslicki asks on matrix:

> quick question: is sourmash sketch dna now multithreaded in 4.8.11? My understanding was that it was single-threaded, but I just noticed via top and time that a single sourmash sketch dna command is using something like 100+ threads on my server (i.e. %CPU around 20000%). And there is no update in the docs that mentions controlling the number of threads used...
first - sourmash-rs uses rayon for parallelism, and for rayon, the environment variable `RAYON_NUM_THREADS` can be used to control the number of threads. Setting this to e.g. 16 should limit rayon to using 16 threads.

second - my understanding is that Rust-based parallelism is enabled conditionally in the Rust codebase, using the `parallel` feature (which is automatically turned on by the `branchwater` feature, as in e.g. the branchwater plugin). It looks like this may be enabled by default in `pyproject.toml`:

https://github.com/sourmash-bio/sourmash/blob/c7fc46012dcf9156003dfc58d60a124f0f480e9c/pyproject.toml#L153

which is cool, if so, but should probably be documented somewhere! ;)

aaaaand third - yes, it looks like sketching operates in parallel at the level of multiple sketch types, e.g. if you are doing a bunch of different k-mer sizes, then each k-mer size is sketched in parallel! see:

https://github.com/sourmash-bio/sourmash/blob/c7fc46012dcf9156003dfc58d60a124f0f480e9c/src/core/src/signature.rs#L661-L668

so that's cool :).
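
To make the "one task per sketch type" idea concrete, here is a rough Python analogy. It is not sourmash's code (the real implementation is the Rust linked above), and `toy_sketch` and `sketch_all` are hypothetical names invented for this illustration. It hands each k-mer size to its own worker, which is the general shape of the parallelism described here:

```python
from concurrent.futures import ThreadPoolExecutor
import hashlib


def toy_sketch(sequence: str, ksize: int, num: int = 5) -> list[int]:
    """Toy bottom-`num` sketch: hash every k-mer and keep the smallest values.

    Hypothetical stand-in for a real sketch; NOT sourmash's hashing scheme.
    """
    hashes = {
        int.from_bytes(
            hashlib.md5(sequence[i:i + ksize].encode()).digest()[:8], "big"
        )
        for i in range(len(sequence) - ksize + 1)
    }
    return sorted(hashes)[:num]


def sketch_all(sequence: str, ksizes: list[int]) -> dict[int, list[int]]:
    """Build one toy sketch per k-mer size, submitting each size as its own task."""
    with ThreadPoolExecutor(max_workers=len(ksizes)) as pool:
        results = pool.map(lambda k: toy_sketch(sequence, k), ksizes)
        return dict(zip(ksizes, results))
```

Python threads will not give real CPU speedups for this toy because of the GIL; the point is only that the unit of work is a sketch type, so a command asking for a single k-mer size would presumably see little benefit from this level of parallelism.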
So my end take is: I think we should probably document this somewhere!
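
As a starting point for that documentation, here is a minimal sketch of capping the thread count when driving sourmash from Python. It assumes the installed sourmash build really does use rayon as described above; the helper names `rayon_limited_env` and `sketch_dna` and the default parameter string are made up for illustration. Rayon reads `RAYON_NUM_THREADS` when its global thread pool is first set up, so the variable has to be in the environment before the process starts:

```python
import os
import subprocess


def rayon_limited_env(num_threads: int, base_env=None) -> dict:
    """Return a copy of the environment with RAYON_NUM_THREADS set."""
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    env = dict(os.environ if base_env is None else base_env)
    env["RAYON_NUM_THREADS"] = str(num_threads)
    return env


def sketch_dna(fasta_path: str, num_threads: int = 16,
               params: str = "k=21,k=31,k=51"):
    """Run `sourmash sketch dna` with rayon capped at `num_threads` threads.

    Assumption: the sourmash executable is on PATH and honors the variable.
    """
    cmd = ["sourmash", "sketch", "dna", "-p", params, fasta_path]
    return subprocess.run(cmd, env=rayon_limited_env(num_threads), check=True)
```

Calling `sketch_dna("genome.fa", num_threads=16)` (with `genome.fa` as a placeholder path) is the same as prefixing the command with `RAYON_NUM_THREADS=16` in a shell. Checking `top` afterwards is the easiest way to confirm the limit is actually respected.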