Closed — Gmar999 closed this issue 5 months ago
Hi @Gmar999,
An issue I can see in the way you benchmark: by calling hmmsearch directly without specifying --cpu, you always use 2 worker threads, not the 48 you tell pyhmmer to use. Since you have few HMMs, there is no point in running with cpus=48; try cpus=2 and check whether the timings are still off.
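To keep the comparison fair, the thread count has to match on both sides. A minimal sketch of the CLI side (the helper name and file names are hypothetical; the default of 2 worker threads is taken from this thread):

```python
def hmmsearch_cmd(hmm_file, fasta_file, cpus):
    """Build an hmmsearch command line with an explicit worker-thread count.

    Without --cpu, hmmsearch falls back to its default worker count
    (2 threads, per the discussion above), which skews any comparison
    against pyhmmer.hmmsearch(..., cpus=N).
    """
    return ["hmmsearch", "--cpu", str(cpus), hmm_file, fasta_file]

# Use the same count on both sides, e.g. 2 threads each:
cmd = hmmsearch_cmd("COX2.hmm", "COX2.fasta", 2)
```

The list can then be passed to `subprocess.run(cmd)` and timed against a `pyhmmer.hmmsearch(..., cpus=2)` call over the same inputs.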
Thank you very much for your prompt response, Martin. I changed the parameter from cpus=48 to cpus=2 and ran the test again today. Here are my results:
;ND3
pyhmmer-hmmsearch search ND3.fasta (2815 seqs): 70.86706924438477
hmmsearch search ND3.fasta (2815 seqs): 21.94244623184204
;COX2
pyhmmer-hmmsearch search COX2.fasta (8290 seqs): 619.4518868923187
hmmsearch search COX2.fasta (8290 seqs): 68.149094581604
It seems that the slowest search drags down the entire run: during the COX2.fasta run, the process gets stuck while searching with the COX2 HMM. hmmsearch also seems faster for individual HMM searches. Why is there such a big difference in speed?
Would you mind uploading the test files (or sending them by e-mail if that's sensitive data) so I can check how to reproduce?
hmmsearch will be faster for individual HMM searches because PyHMMER parallelizes over the HMMs rather than the sequences, so with a single HMM it cannot use more than one thread. That design was chosen because our use-case (Pfam annotation of complete genomes) always has more than enough HMMs to parallelize over. I documented that here: https://pyhmmer.readthedocs.io/en/stable/performance.html
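As a back-of-the-envelope illustration of that parallelization model (a toy model, not pyhmmer's actual scheduler): if work is handed out in whole-HMM units, the achievable speedup is capped by the number of HMMs, regardless of how many threads are requested.

```python
import math

def per_hmm_speedup(n_hmms, n_threads):
    """Idealized speedup when parallelizing over whole HMMs.

    Assumes every HMM costs the same to search; wall time is then
    proportional to ceil(n_hmms / n_threads), so a single HMM can
    never run faster than one thread, whatever cpus= is set to.
    """
    return n_hmms / math.ceil(n_hmms / n_threads)

print(per_hmm_speedup(1, 48))   # one HMM: no speedup at all -> 1.0
print(per_hmm_speedup(13, 2))   # 13 HMMs on 2 threads
```

This is why cpus=48 buys nothing in the single-HMM benchmarks above, while a Pfam-sized library of thousands of HMMs parallelizes well.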
Okay, here is my data. test.tar.gz
Hi @Gmar999, thanks for the data.
I compared runs of HMMER and PyHMMER, measuring CPU cycles with valgrind, and I only got a marginal difference (239M cycles with PyHMMER, 236M with HMMER). This suggests the problem does not come from the computation but from the parallel-programming part, which seems to deadlock for ages in recent PyHMMER versions.
This looks like a regression introduced in recent versions, because when I tested with v0.8.0 I got more reasonable timings (PyHMMER still being a little bit slower, but that is expected given the parallelization model).
With PyHMMER v0.8.0:
pyhmmer-hmmsearch search ATP6.fasta(2786) using: 26.183855056762695
hmmsearch search ATP6.fasta using(2786) : 17.281102657318115
With PyHMMER v0.10.10:
pyhmmer-hmmsearch search ATP6.fasta(2786) using: 141.94732308387756
hmmsearch search ATP6.fasta using(2786) : 17.341574668884277
It turns out that after refactoring the code to compile Easel and HMMER in https://github.com/althonos/pyhmmer/commit/b3d71d836b7312791d16c2af1b2f7b4ec43ccaa9, I forgot to include the flags to compile the extension for SSE4.1, so all releases since v0.10.0 have been running without SSE4.1 enabled, hence the massive slowdown.
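A quick way to check whether the host CPU even offers SSE4.1 (a Linux-only sketch that parses /proc/cpuinfo; this is not a pyhmmer API):

```python
def cpu_supports_sse41():
    """Return True/False depending on whether /proc/cpuinfo lists the
    sse4_1 flag, or None on platforms without /proc/cpuinfo (e.g. macOS)."""
    try:
        with open("/proc/cpuinfo") as f:
            return "sse4_1" in f.read()
    except OSError:
        return None
```

If this reports True but the extension was built without the SSE4.1 compile flags, the extension still runs correctly, just without those instructions, which is consistent with the large slowdown reported above.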
Please update to v0.10.11. PyHMMER will still be a tiny bit slower (as expected: most of the computation time is spent on a single HMM, and that will always be faster in HMMER because its threading works slightly better in this situation; I'd have to implement another parallelization strategy to improve that), but now we're talking 25s instead of 20s, not 500s.
Thanks again for allowing me to find this issue!
Hello, Martin! Thank you for your work. I'm trying to use pyhmmer instead of hmmsearch on Linux. However, I find that pyhmmer always runs slower than calling hmmsearch directly. My test is based on 13 gene HMMs and several FASTA files of already-identified genes.
Here is my script:
And here is the result:
I want to know what happened. Is it because of the method I am using? Thank you very much for your help.