jiarong / VirSorter2

customizable pipeline to identify viral sequences from (meta)genomic data
GNU General Public License v2.0

The hmmsearch -T {threads} does not reach its full capacity #133

Closed ChaoLab closed 2 years ago

ChaoLab commented 2 years ago

Hi, when I ran the SOP that uses VS2 and CheckV to identify viruses, I found that "The hmmsearch -T {threads}" does not reach its full capacity and only uses 2 threads. This seems to be an intrinsic issue for anyone running hmmsearch. I am wondering whether it could be solved in the next version by running hmmsearch as multiple subprocesses.

jiarong commented 2 years ago

Hi Chao, thanks for the feedback. The "subprocessing" method is already implemented in the current version. The issue is that the current viral protein HMM database is too large, so even the split data can take a long time.
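
For readers unfamiliar with the "subprocessing" idea discussed above, here is a minimal sketch of it: split the protein FASTA into chunks and run one hmmsearch process per chunk in parallel. This is not VirSorter2's actual code. The function names (`split_fasta`, `build_command`, `run_parallel`) and the chunk file names are hypothetical, and the sketch assumes `hmmsearch` is on `PATH`. Only the standard hmmsearch options `--cpu`, `--noali`, `-o`, and `--domtblout` are used.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def split_fasta(fasta_text, n_chunks):
    """Distribute FASTA records round-robin into at most n_chunks strings."""
    records = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            records.append([line])          # start a new record at each header
        elif records and line.strip():
            records[-1].append(line)        # sequence line of the current record
    chunks = [[] for _ in range(n_chunks)]
    for i, rec in enumerate(records):
        chunks[i % n_chunks].extend(rec)
    return ["\n".join(c) + "\n" for c in chunks if c]


def build_command(hmm_db, seq_file, out_file, cpu=1):
    """Build one hmmsearch call: hmmsearch [options] <hmmfile> <seqdb>."""
    return [
        "hmmsearch", "--cpu", str(cpu), "--noali",
        "-o", os.devnull,                   # discard the human-readable report
        "--domtblout", str(out_file),       # keep the per-domain table
        str(hmm_db), str(seq_file),
    ]


def run_parallel(hmm_db, fasta_path, workdir, n_jobs=4):
    """Write the chunks to workdir and run one hmmsearch per chunk concurrently."""
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    commands, outputs = [], []
    for i, chunk in enumerate(split_fasta(Path(fasta_path).read_text(), n_jobs)):
        seq_file = workdir / f"chunk_{i}.faa"
        seq_file.write_text(chunk)
        out_file = workdir / f"chunk_{i}.domtblout"
        commands.append(build_command(hmm_db, seq_file, out_file))
        outputs.append(out_file)
    # Threads only wait on the child processes; the work happens in hmmsearch.
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        list(pool.map(lambda cmd: subprocess.run(cmd, check=True), commands))
    return outputs
```

Two caveats on this sketch: every chunk is still searched against the whole HMM database, so a very large database keeps each job slow, which matches the point made in the reply above. Also, hmmsearch E-values depend on the number of target sequences, so per-chunk E-values differ from those of a single run unless the search space size is fixed with `-Z`; bit scores are unaffected.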