bbuchfink / diamond

Accelerated BLAST compatible local sequence aligner.
GNU General Public License v3.0

low RAM & CPU efficiency on slurm #595

Open schraderL opened 2 years ago

schraderL commented 2 years ago

Hi, I am running diamond blastx on some individual eukaryotic scaffolds in an HPC environment under slurm with the following settings:

diamond blastx \
        --query ${assembly} \
        --db uniprot/reference_proteomes.dmnd \
        --outfmt 6 qseqid staxids bitscore qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore \
        --fast \
        --max-target-seqs 1 \
        --evalue 1e-25 \
        --threads ${threads} \
        > ${assembly}.diamond.blastx.out

I have run this with 36 threads and 90 GB of RAM. However, CPU & RAM efficiency are both less than 5 % according to slurm:

Nodes: 1
Cores per node: 36
CPU Utilized: 02:06:08
CPU Efficiency: 2.91% of 3-00:10:12 core-walltime
Job Wall-clock time: 02:00:17
Memory Utilized: 4.20 GB
Memory Efficiency: 4.66% of 90.00 GB
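
For reference, the CPU efficiency figure is the CPU time used divided by the core-walltime (cores × wall-clock time). A quick sanity check of the numbers reported above, as a sketch in plain shell arithmetic (the variable names are only illustrative):

```bash
# Values copied from the slurm report above
cores=36
wall_s=$(( 2*3600 + 0*60 + 17 ))   # Job Wall-clock time 02:00:17, in seconds
cpu_s=$(( 2*3600 + 6*60 + 8 ))     # CPU Utilized 02:06:08, in seconds

# core-walltime: what the job would have used if all cores were busy throughout
core_wall_s=$(( cores * wall_s ))

# Efficiency in hundredths of a percent, using integer arithmetic only
eff_bp=$(( cpu_s * 10000 / core_wall_s ))

printf 'core-walltime: %d s\n' "$core_wall_s"
printf 'CPU efficiency: %d.%02d%%\n' $(( eff_bp / 100 )) $(( eff_bp % 100 ))
# prints: CPU efficiency: 2.91%
```

In other words, the CPU time used (about 2 h 6 min) is roughly equal to the wall-clock time, which is what one would expect if the job ran on a single core for most of its duration.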

Is there a way to improve this run so that diamond can more effectively use the resources available?

Thanks! Lukas

bbuchfink commented 2 years ago

The problem is probably the long input sequences, which are not handled efficiently in regular blastx mode. Try setting -F 15. Additionally, I'd recommend -b4 -c1.
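
Applied to the command from the original post, the suggestion could look roughly like the sketch below. In the DIAMOND command line, -F is the short form of --frameshift, -b of --block-size and -c of --index-chunks. The default values for `assembly` and `threads` and the `tuning` variable are placeholders introduced here for illustration, not part of the thread; the guard around the call only makes the snippet safe to paste on a machine without diamond or the input file.

```bash
# Hypothetical placeholders; replace with your own scaffold file and core count
assembly="${assembly:-scaffolds.fasta}"
threads="${threads:-36}"

# Options suggested in the reply:
#   -F 15  frameshift penalty, switches on frameshift-aware alignment for long queries
#   -b4    larger sequence block size (uses more memory)
#   -c1    process the index in a single chunk (uses more memory)
tuning="-F 15 -b4 -c1"

if command -v diamond >/dev/null 2>&1 && [ -f "${assembly}" ]; then
    # ${tuning} is left unquoted on purpose so it splits into separate options
    diamond blastx \
        --query "${assembly}" \
        --db uniprot/reference_proteomes.dmnd \
        --outfmt 6 qseqid staxids bitscore qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore \
        --fast \
        --max-target-seqs 1 \
        --evalue 1e-25 \
        --threads "${threads}" \
        ${tuning} \
        > "${assembly}.diamond.blastx.out"
else
    echo "diamond or ${assembly} not found; skipping run" >&2
fi
```

Since -b4 -c1 trades memory for speed, it is worth checking the memory actually used by the job against the slurm allocation after the first run.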