Open ktmeaton opened 1 year ago
I think most of the slowdown is purely in passing large objects as parameters to genome_mp.
Because of this, I don't think my implementation of the analysis is the biggest problem.
This might be a problem that Rust could solve.
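To put a rough number on the cost of passing large objects as parameters, here is a minimal sketch. It assumes the workers are fed through Python's multiprocessing, which pickles every task argument before sending it to a worker. The names `payload_bytes`, `reference`, and `genome_ids` are made up for illustration and are not the real genome_mp signature.

```python
import pickle


def payload_bytes(tasks):
    """Total bytes pickled when each task tuple is sent to a worker."""
    return sum(len(pickle.dumps(task)) for task in tasks)


# Hypothetical stand-in for a large object shared by every task.
reference = {f"lineage_{i}": "ACGT" * 1000 + str(i) for i in range(20)}
genome_ids = [f"genome_{i}" for i in range(50)]

# Style A: the large object is a parameter of every task,
# so it is serialized once per task.
heavy = payload_bytes([(gid, reference) for gid in genome_ids])

# Style B: only the ID is a parameter; each worker would load the
# large object once (for example through a Pool initializer).
light = payload_bytes([(gid,) for gid in genome_ids])

print(f"per-task object: {heavy} bytes, IDs only: {light} bytes")
```

If the gap is as large on the real data, sending the big object once per worker instead of once per task may recover much of the time without touching the analysis itself.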
The code base rewrite to Rust (PR #5) was extremely helpful for this issue. On a single core, I've seen processing speeds ranging from 1 to 100 sequences/second. And that's not even taking --threads into account 😀
But I am leaving this unresolved until I benchmark with a larger dataset (e.g. VirusSeq).
Currently, recombination detection is slow at 5 seconds / sequence. Multiprocessing helps (--threads), but there are certainly code efficiency improvements needed.