Closed: qkqk-hub closed this issue 1 year ago
Hi, there are two reasons: 1) in my experience, the hmmsearch step can use at most 4 threads; 2) CPU usage per job is limited by the total number of CPUs in your computer and the total number of jobs you are running.
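To make point 2 concrete, here is a rough sketch of how a requested job count gets capped by the CPUs on the machine. It is not VirSorter2's actual scheduling logic; the variable names are made up for illustration, and the value 30 is taken from the -j 30 in the command under discussion.

```shell
#!/bin/sh
# Hypothetical illustration: none of these names are part of VirSorter2.
requested_jobs=30   # the value passed to -j

# Number of online CPUs; nproc is GNU coreutils, getconf is the fallback.
available_cpus=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)

# Jobs that can actually run at the same time cannot exceed the CPU count.
if [ "$requested_jobs" -gt "$available_cpus" ]; then
    effective_jobs=$available_cpus
else
    effective_jobs=$requested_jobs
fi

# Upper bound on the percentage top can show (100% per busy CPU).
max_cpu_percent=$((effective_jobs * 100))

echo "requested=$requested_jobs available=$available_cpus effective=$effective_jobs max_cpu=${max_cpu_percent}%"
```

On a machine with fewer than 30 CPUs, top can never show 3000% for this run. Even with enough CPUs, usage only approaches that ceiling while there are enough independent jobs running to keep them all busy.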
Ok, thank you for your answer.
Hello All,
Even when I run only one job, hmmsearch uses only 1 thread. I have 24 threads, and hmmsearch is the limiting step. It should be much faster with more threads, since the work is essentially embarrassingly parallel: each protein sequence could be searched in its own thread. Could you please add a thread option to tell hmmsearch how many threads to use? When I have only one file/genome, using just one thread is too slow.
Thanks, Jianshu
Hi, you can do virsorter config --set HMMSEARCH_THREADS=4. Note that hmmsearch's multi-threading does NOT work well, though; usually the IO is the bottleneck, not the CPU.
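Put together, a single-genome run might look like the sketch below. The config command is the one given above and the run options mirror those used elsewhere in this thread; the input file name, output directory, and -j value are placeholders chosen for illustration. The guard simply skips everything when virsorter or the input file is missing.

```shell
#!/bin/sh
# Placeholder values for illustration only; replace with your own.
input_fasta="genome.fa"
work_dir="vs2_out"
hmm_threads=4   # reported above as roughly the most hmmsearch can use

if command -v virsorter >/dev/null 2>&1 && [ -f "$input_fasta" ]; then
    # Store the hmmsearch thread count in the VirSorter2 config.
    virsorter config --set HMMSEARCH_THREADS="$hmm_threads"
    # Give the run at least as many cores (-j) as hmmsearch is allowed to use.
    virsorter run -w "$work_dir" -i "$input_fasta" -j "$hmm_threads" all
    status="ran"
else
    echo "virsorter or $input_fasta not found; nothing was run" >&2
    status="skipped"
fi
```

Since IO is usually the bottleneck, the speedup from 4 threads may be well under 4x.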
Thanks. I assume I need to set the thread count each time I run it? Anyway, problem solved. Many thanks!
Jianshu
No, the setting should still work next time. It's recorded in the template-config.yaml file.
Here is my code:
for i in SRR*; do cd "$i"; virsorter run -w /data_alluser/QK/NewMAGs/PRJDB4176/09_virsorter2/"$i"/ -i ./final.contigs.fa -j 30 --include-groups "dsDNAphage,NCLDV,RNA,ssDNA,lavidaviridae" all; cd ..; done
I use 30 threads in my command. Why isn't my CPU usage at 3000%? It is far lower. Here is a screenshot of my CPU usage: Thanks!