How long does it take to run samples using newly created HMM databases?

Hi Mr. @tseemann ! Thank you for creating this wonderful software tool!

I am currently using Prokka to annotate around 2000 H. pylori whole-genome sequences (WGS). I have also added a set of profile HMMs (pHMMs) constructed by the creator of the ISEScan software to identify potential transposases. Every batch of 50 strains takes around 14 hours to finish.

Here is my command:

nohup bash prokka-new.bash

And here is the bash script it runs:

#!/bin/bash
set -e
set -u
set -o pipefail

# Annotate every FASTA file in the current directory, one genome at a time.
for i in *.fasta; do
    # Sample name, reused for the output directory, file prefix and locus tag.
    name=$(basename -s .fasta "$i")
    prokka "$i" --outdir "$name" --prefix "$name" --locustag "$name" \
        --centre X --compliant --cpus 20 --force \
        --hmms /data/Ricky/Software/Miniconda/envs/myprokka/db/hmm/clustersISEScan.faa.hmm \
        --evalue 1e-05
done

The nohup output then shows:

[11:06:31] System has 24 cores.
[11:06:31] Will use maximum of 20 cores.
[11:06:31] Annotating as >>> Bacteria <<<
[11:06:31] Enabling options to ensure Genbank/ENA/DDJB submission compliance.
[11:06:31] Creating new output folder: 651_B38
[11:06:31] Running: mkdir -p 651_B38
[11:06:31] Using filename prefix: 651_B38.XXX
[11:06:31] Setting HMMER_NCPU=1
[11:06:31] Writing log to: 651_B38/651_B38.log
[11:06:31] Command: /data/Ricky/Software/Miniconda/envs/myprokka/bin/prokka 651_B38.fasta --outdir 651_B38 --prefix 651_B38 --locustag 651_B38 --centre X --compliant --cpus 20 --force --hmms /data/Ricky/Software/Miniconda/envs/myprokka/db/hmm/clustersISEScan.faa.hmm --evalue 1e-05
etc.

Prokka always runs successfully, but I wonder whether there is any way to make it faster while still using only 20-24 cores. I need the RNA annotation, so I have kept it enabled. I am still very much a beginner with the command line; could you please tell me whether I can modify something (e.g., HMMER_NCPU=1), or whether 14 hours for every 50 strains is normal? At this rate, annotating more than 2000 WGS will take more than 20 days, so I would like to speed it up if possible. Thank you so much.
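For reference, here is a rough sketch of the arithmetic behind that estimate. It assumes the runtime scales linearly with the number of genomes, which I have not verified; the variable names are only for illustration.

```shell
#!/bin/bash
# Rough runtime estimate, ASSUMING runtime scales linearly with genome count.
genomes=2000        # total genomes to annotate
batch_size=50       # genomes per timed batch
hours_per_batch=14  # observed wall-clock hours per batch

# Ceiling division, so a partial final batch still counts as a full batch.
batches=$(( (genomes + batch_size - 1) / batch_size ))
total_hours=$(( batches * hours_per_batch ))
days=$(( total_hours / 24 ))
rem_hours=$(( total_hours % 24 ))

echo "${batches} batches, ${total_hours} h total, about ${days} d ${rem_hours} h"
# prints: 40 batches, 560 h total, about 23 d 8 h
```

That works out to roughly 17 minutes per genome (14 h x 60 min / 50), which is the figure I would like to bring down.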

Kind regards, Ricky
