soedinglab / MMseqs2

MMseqs2: ultra fast and sensitive search and clustering suite
https://mmseqs.com
GNU General Public License v3.0

Clustering of big DNA datasets uses one core #455

Open Xacfran opened 3 years ago

Xacfran commented 3 years ago

Expected Behavior

Running the clustering analysis on all 128 cores.

Current Behavior

I'm running a clustering analysis to remove redundant sequences from a big dataset of approximately 10 million DNA sequences with lengths varying between 1150 and 1250 bp. The problem I'm having is that the program appears to be running on a single core instead of the 128 threads I tell it to use, so it takes a really long time to analyze this dataset. I assume it shouldn't take long, given that your package is supposed to analyze even bigger datasets in a couple of hours.

I ran into the same issue running CD-HIT, which is why I'm giving your package a try. I've already tried compiling your package using:

cmake -DHAVE_MPI=1 -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=. ..

However, this gave me issues when running a submission script, and my job was killed with the error:

mpirun noticed that process rank 2 with PID xxxx on node xxxx exited on signal 9 (Killed).

For this reason I decided to install MMseqs2 in a conda environment using: conda install -c conda-forge -c bioconda mmseqs2

I know the program is running on a single core because I ssh into the compute node (ssh cpu-) and run ps aux | grep, and only one process appears to be running the prefilter step right now.
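
For what it's worth, a quick way to see per-thread CPU usage rather than just processes (a sketch; the node name cpu-XX and the pgrep pattern are placeholders, adjust them to the actual job):

ssh cpu-XX
# show the individual threads (-H) of the mmseqs process and how busy each is
top -H -p "$(pgrep -f 'mmseqs cluster' | head -n 1)"
# or, one-shot: list every thread with the core it runs on (PSR) and its CPU%
ps -Lo pid,tid,psr,pcpu,comm -p "$(pgrep -f 'mmseqs cluster' | head -n 1)"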

Steps to Reproduce (for bugs)

#!/bin/bash
#SBATCH --job-name=mmseqs
#SBATCH --output=%x.%j.out
#SBATCH --error=%x.%j.err
#SBATCH --partition=name
#SBATCH --nodes=1
#SBATCH --ntasks=128

mmseqs cluster DB DB_out tmp --cov-mode 1 -c 0.9 --threads 128

MMseqs Output (for bugs)

cluster DB DB_out tmp --cov-mode 1 -c 0.9 --threads 128

MMseqs Version: 13.45111 Substitution matrix nucl:nucleotide.out,aa:blosum62.out Seed substitution matrix nucl:nucleotide.out,aa:VTML80.out Sensitivity 4 k-mer length 15 k-score 2147483647 Alphabet size nucl:5,aa:21 Max sequence length 10000 Max results per query 20 Split database 0 Split mode 2 Split memory limit 0 Coverage threshold 0.9 Coverage mode 1 Compositional bias 1 Diagonal scoring false Exact k-mer matching 1 Mask residues 1 Mask lower case residues 0 Minimum diagonal score 15 Include identical seq. id. false Spaced k-mers 1 Preload mode 0 Pseudo count a 1 Pseudo count b 1.5 Spaced k-mer pattern
Local temporary path
Threads 128 Compressed 0 Verbosity 3 Add backtrace false Alignment mode 3 Alignment mode 0 Allow wrapped scoring false E-value threshold 0.001 Seq. id. threshold 0 Min alignment length 0 Seq. id. mode 0 Alternative alignments 0 Max reject 2147483647 Max accept 2147483647 Score bias 0 Realign hits false Realign score bias -0.2 Realign max seqs 2147483647 Gap open cost nucl:5,aa:11 Gap extension cost nucl:2,aa:1 Zdrop 40 Rescore mode 0 Remove hits by seq. id. and coverage false Sort results 0 Cluster mode 0 Max connected component depth 1000 Similarity type 2 Single step clustering false Cascaded clustering steps 3 Cluster reassign false Remove temporary files false Force restart with latest tmp false MPI runner
k-mers per sequence 21 Scale k-mers per sequence nucl:0.200,aa:0.000 Adjust k-mer length false Shift hash 67 Include only extendable false Skip repeating k-mers false

Set cluster sensitivity to -s 6.000000 Set cluster mode GREEDY MEM Set cluster iterations to 3
linclust DB tmp/576731152808920235/clu_redundancy tmp/576731152808920235/linclust --cluster-mode 3 --max-iterations 1000 --similarity-type 2 --threads 128 --compressed 0 -v 3 --sub-mat nucl:nucleotide.out,aa:blosum62.out -a 0 --alignment-mode 3 --alignment-output-mode 0 --wrapped-scoring 0 -e 0.001 --min-seq-id 0 --min-aln-len 0 --seq-id-mode 0 --alt-ali 0 -c 0.9 --cov-mode 1 --max-seq-len 10000 --comp-bias-corr 1 --max-rejected 2147483647 --max-accept 2147483647 --add-self-matches 0 --db-load-mode 0 --pca 1 --pcb 1.5 --score-bias 0 --realign 0 --realign-score-bias -0.2 --realign-max-seqs 2147483647 --gap-open nucl:5,aa:11 --gap-extend nucl:2,aa:1 --zdrop 40 --alph-size nucl:5,aa:21 --kmer-per-seq 21 --spaced-kmer-mode 1 --kmer-per-seq-scale nucl:0.200,aa:0.000 --adjust-kmer-len 0 --mask 1 --mask-lower-case 0 -k 0 --hash-shift 67 --split-memory-limit 0 --include-only-extendable 0 --ignore-multi-kmer 0 --rescore-mode 0 --filter-hits 0 --sort-results 0 --remove-tmp-files 0 --force-reuse 0

kmermatcher DB tmp/576731152808920235/linclust/1790908825406232727/pref --sub-mat nucl:nucleotide.out,aa:blosum62.out --alph-size nucl:5,aa:21 --min-seq-id 0 --kmer-per-seq 21 --spaced-kmer-mode 1 --kmer-per-seq-scale nucl:0.200,aa:0.000 --adjust-kmer-len 0 --mask 1 --mask-lower-case 0 --cov-mode 1 -k 0 -c 0.9 --max-seq-len 10000 --hash-shift 67 --split-memory-limit 0 --include-only-extendable 0 --ignore-multi-kmer 0 --threads 128 --compressed 0 -v 3

kmermatcher DB tmp/576731152808920235/linclust/1790908825406232727/pref --sub-mat nucl:nucleotide.out,aa:blosum62.out --alph-size nucl:5,aa:21 --min-seq-id 0 --kmer-per-seq 21 --spaced-kmer-mode 1 --kmer-per-seq-scale nucl:0.200,aa:0.000 --adjust-kmer-len 0 --mask 1 --mask-lower-case 0 --cov-mode 1 -k 0 -c 0.9 --max-seq-len 10000 --hash-shift 67 --split-memory-limit 0 --include-only-extendable 0 --ignore-multi-kmer 0 --threads 128 --compressed 0 -v 3

Database size: 9502826 type: Nucleotide

Generate k-mers list for 1 split [=================================================================] 9.50M 43s 925ms

Adjusted k-mer length 17 Sort kmer 0h 0m 8s 42ms Sort by rep. sequence 0h 0m 6s 419ms Time for fill: 0h 1m 9s 626ms Time for merging to pref: 0h 0m 0s 5ms Time for processing: 0h 2m 43s 136ms
rescorediagonal DB DB tmp/576731152808920235/linclust/1790908825406232727/pref tmp/576731152808920235/linclust/1790908825406232727/pref_rescore1 --sub-mat nucl:nucleotide.out,aa:blosum62.out --rescore-mode 0 --wrapped-scoring 0 --filter-hits 0 -e 0.001 -c 0.9 -a 0 --cov-mode 1 --min-seq-id 0.5 --min-aln-len 0 --seq-id-mode 0 --add-self-matches 0 --sort-results 0 --db-load-mode 0 --threads 128 --compressed 0 -v 3

[=================================================================] 9.50M 1m 30s 166ms Time for merging to pref_rescore1: 0h 0m 3s 976ms Time for processing: 0h 1m 39s 656ms
clust DB tmp/576731152808920235/linclust/1790908825406232727/pref_rescore1 tmp/576731152808920235/linclust/1790908825406232727/pre_clust --cluster-mode 3 --max-iterations 1000 --similarity-type 2 --threads 128 --compressed 0 -v 3

Clustering mode: Greedy Low Mem Total time: 0h 0m 1s 66ms

Size of the sequence database: 9502826 Size of the alignment database: 9502826 Number of clusters: 8785102

Writing results 0h 0m 1s 200ms Time for merging to pre_clust: 0h 0m 0s 5ms Time for processing: 0h 0m 5s 962ms
createsubdb tmp/576731152808920235/linclust/1790908825406232727/order_redundancy DB tmp/576731152808920235/linclust/1790908825406232727/input_step_redundancy -v 3 --subdb-mode 1

Time for merging to input_step_redundancy: 0h 0m 0s 4ms Time for processing: 0h 0m 2s 10ms
createsubdb tmp/576731152808920235/linclust/1790908825406232727/order_redundancy tmp/576731152808920235/linclust/1790908825406232727/pref tmp/576731152808920235/linclust/1790908825406232727/pref_filter1 -v 3 --subdb-mode 1

Time for merging to pref_filter1: 0h 0m 0s 5ms Time for processing: 0h 0m 3s 585ms
filterdb tmp/576731152808920235/linclust/1790908825406232727/pref_filter1 tmp/576731152808920235/linclust/1790908825406232727/pref_filter2 --filter-file tmp/576731152808920235/linclust/1790908825406232727/order_redundancy --threads 128 --compressed 0 -v 3

Filtering using file(s) [=================================================================] 8.79M 22s 293ms Time for merging to pref_filter2: 0h 0m 4s 651ms Time for processing: 0h 0m 33s 252ms
align tmp/576731152808920235/linclust/1790908825406232727/input_step_redundancy tmp/576731152808920235/linclust/1790908825406232727/input_step_redundancy tmp/576731152808920235/linclust/1790908825406232727/pref_filter2 tmp/576731152808920235/linclust/1790908825406232727/aln --sub-mat nucl:nucleotide.out,aa:blosum62.out -a 0 --alignment-mode 3 --alignment-output-mode 0 --wrapped-scoring 0 -e 0.001 --min-seq-id 0 --min-aln-len 0 --seq-id-mode 0 --alt-ali 0 -c 0.9 --cov-mode 1 --max-seq-len 10000 --comp-bias-corr 1 --max-rejected 2147483647 --max-accept 2147483647 --add-self-matches 0 --db-load-mode 0 --pca 1 --pcb 1.5 --score-bias 0 --realign 0 --realign-score-bias -0.2 --realign-max-seqs 2147483647 --gap-open nucl:5,aa:11 --gap-extend nucl:2,aa:1 --zdrop 40 --threads 128 --compressed 0 -v 3

Compute score, coverage and sequence identity Query database size: 8785102 type: Nucleotide Target database size: 8785102 type: Nucleotide Calculation of alignments [=================================================================] 8.79M 11m 6s 373ms Time for merging to aln: 0h 0m 5s 49ms 1248621641 alignments calculated 12850181 sequence pairs passed the thresholds (0.010291 of overall calculated) 1.462724 hits per query sequence Time for processing: 0h 11m 16s 565ms
clust tmp/576731152808920235/linclust/1790908825406232727/input_step_redundancy tmp/576731152808920235/linclust/1790908825406232727/aln tmp/576731152808920235/linclust/1790908825406232727/clust --cluster-mode 3 --max-iterations 1000 --similarity-type 2 --threads 128 --compressed 0 -v 3

Clustering mode: Greedy Low Mem Total time: 0h 0m 1s 100ms

Size of the sequence database: 8785102 Size of the alignment database: 8785102 Number of clusters: 6459670

Writing results 0h 0m 0s 880ms Time for merging to clust: 0h 0m 0s 3ms Time for processing: 0h 0m 3s 530ms
mergeclusters DB tmp/576731152808920235/clu_redundancy tmp/576731152808920235/linclust/1790908825406232727/pre_clust tmp/576731152808920235/linclust/1790908825406232727/clust --threads 128 --compressed 0 -v 3

Clustering step 1 [=================================================================] 8.79M 1s 236ms Clustering step 2 [=================================================================] 6.46M 2s 383ms Write merged clustering [=================================================================] 9.50M 3s 251ms Time for merging to clu_redundancy: 0h 0m 2s 919ms Time for processing: 0h 0m 8s 457ms
createsubdb tmp/576731152808920235/clu_redundancy DB tmp/576731152808920235/input_step_redundancy -v 3 --subdb-mode 1

Time for merging to input_step_redundancy: 0h 0m 0s 5ms Time for processing: 0h 0m 1s 628ms
extractframes tmp/576731152808920235/input_step_redundancy tmp/576731152808920235/query_seqs --forward-frames 1 --reverse-frames 1 --create-lookup 0 --threads 128 --compressed 0 -v 3

[=================================================================] 6.46M 24s 391ms Time for merging to query_seqs_h: 0h 0m 3s 920ms Time for merging to query_seqs: 0h 0m 19s 622ms Time for processing: 0h 0m 57s 140ms
prefilter tmp/576731152808920235/query_seqs tmp/576731152808920235/input_step_redundancy tmp/576731152808920235/pref --sub-mat nucl:nucleotide.out,aa:blosum62.out --seed-sub-mat nucl:nucleotide.out,aa:VTML80.out -s 6 -k 15 --k-score 2147483647 --alph-size nucl:5,aa:21 --max-seq-len 10000 --max-seqs 300 --split 0 --split-mode 2 --split-memory-limit 0 -c 0.9 --cov-mode 1 --comp-bias-corr 1 --diag-score 0 --exact-kmer-matching 1 --mask 1 --mask-lower-case 0 --min-ungapped-score 15 --add-self-matches 0 --spaced-kmer-mode 1 --db-load-mode 0 --pca 1 --pcb 1.5 --threads 128 --compressed 0 -v 3

Query database size: 12919340 type: Nucleotide Estimated memory consumption: 95G Target database size: 6459670 type: Nucleotide Index table k-mer threshold: 0 at k-mer size 15 Index table: counting k-mers [=================================================================] 6.46M 24s 264ms Index table: Masked residues: 231705872 Index table: fill [=================================================================] 6.46M 51s 182ms Index statistics Entries: 7104738736 DB size: 48845 MB Avg k-mer size: 6.616804 Top 10 k-mers GAACAACCGGCTTAG 562246 CTCACCACGAAACGG 555944 TCATGATAAGCGCTG 492357 GTTGCTCATGAAGGT 467881 CCCGTTCGTTGCAGG 454633 CCGTTGGCCAGTAAG 425430 TCTTCACTAGACCGT 407926 CTGGATGTCCACCAG 396183 GCCCTGCAACCACGG 387874 CTACCTCTCCCCTTG 382664 Time for index table init: 0h 1m 21s 797ms Process prefiltering step 1 of 1

k-mer similarity threshold: 0 Starting prefiltering scores calculation (step 1 of 1) Query db start 1 to 12919340 Target db start 1 to 6459670 [==

Your Environment

MMseqs Version: 13.45111. I'm working on a cluster of 240 CPU nodes; my job submission is intended to occupy all 128 cores available per node, and each node has 512 GB of memory. The nodes run CentOS 8.1 with Slurm 20.11.0 as the job resource manager, Spack v0.15 as the package build environment, Lmod 8.2.10 as the software deployment environment, and GCC 8.3.1 as the MPI compiler.

milot-mirdita commented 3 years ago

I've run into the same issue with our own SLURM setup. You probably need to use -c (--cpus-per-task) instead of --ntasks. The latter sets the CPU affinity mask so that all threads get assigned to the same core.

Try --ntasks 1 -c 128.
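
For reference, a minimal sketch of what the corrected header from the script above could look like under that suggestion (job name and partition are the same placeholders as before):

#!/bin/bash
#SBATCH --job-name=mmseqs
#SBATCH --output=%x.%j.out
#SBATCH --error=%x.%j.err
#SBATCH --partition=name
#SBATCH --nodes=1
#SBATCH --ntasks=1             # a single task...
#SBATCH --cpus-per-task=128    # ...that is allowed to use all 128 cores for its threads

mmseqs cluster DB DB_out tmp --cov-mode 1 -c 0.9 --threads 128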

milot-mirdita commented 3 years ago

Do you have more details what went wrong with the MPI build? We didn't have any recent issues with it.

Xacfran commented 3 years ago

Hi,

So last night I deleted the folder which contained the package and compiled it again without using any conda environment, and I didn't have any problems compiling; however, the issue still persists. I was doing small tests in interactive sessions using 4 cores, and the jobs were always killed with signal 9. From what I found on the internet, that happens when the process runs out of memory; could that be the case here?

Francisco
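
A side note on the signal 9 question: the kernel's OOM killer does send SIGKILL (9) when a node runs out of memory, and the prefilter log above estimates about 95G of memory for this dataset. MMseqs2 exposes a --split-memory-limit option (visible in the parameter dump above) that can cap the prefilter's memory use. A minimal sketch; the 400G value is only an illustration for a 512 GB node, not a verified setting:

mmseqs cluster DB DB_out tmp --cov-mode 1 -c 0.9 --threads 128 --split-memory-limit 400G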

milot-mirdita commented 3 years ago

Could you post the full log of the MPI run when you get the chance? Did using -c work for the other issue?

Xacfran commented 3 years ago

When I managed to run this job using mpirun, I got this error:

[cpu-23-1:104192] Process received signal
[cpu-23-1:104192] Signal: Bus error (7)
[cpu-23-1:104192] Signal code: Non-existant physical address (2)
[cpu-23-1:104192] Failing at address: 0x148bd2954aee
[cpu-23-1:104230] Process received signal
[cpu-23-1:104230] Signal: Bus error (7)
[cpu-23-1:104230] Signal code: Non-existant physical address (2)
[cpu-23-1:104230] Failing at address: 0x14758c03a9e9
[cpu-23-1:104233] Process received signal
[cpu-23-1:104233] Signal: Bus error (7)
[cpu-23-1:104233] Signal code: Non-existant physical address (2)
[cpu-23-1:104233] Failing at address: 0x151209f6d9c4

Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted.


mpirun noticed that process rank 89 with PID 0 on node cpu-23-1 exited on signal 7 (Bus error).

And this one appeared a couple more times:


Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted.

slurmstepd: error: JOB 1369194 ON cpu-25-49 CANCELLED AT 2021-05-17T19:07:03 DUE TO NODE FAILURE, SEE SLURMCTLD LOG FOR DETAILS

mpirun noticed that process rank 65 with PID 0 on node cpu-25-49 exited on signal 9 (Killed).

Now I'm trying to run the job using the script you also include in your manual:

mmseqs createdb "${QUERYFASTA}" "${QUERYDB}"
mmseqs splitdb "${QUERYDB}" "${QUERYDB}_split" --split $SPLITS

for file in $(ls "${QUERYDB}_split"*_$SPLITS); do
    mmseqs createsubdb "${file}.index" "${QUERYDB}_h" "${file}_h"
done

split=0
for file in $(ls "${QUERYDB}_split"*_$SPLITS); do
    RUNNER="mpirun -np 128 -p nocona" bsub mmseqs cluster "${file}" chiropteraDB aln${split} tmp --cov-mode 1 -c 0.9 --threads 128
    ((split++))
done

So far I don't have a result from it because the job is waiting for resources. I just added the -p nocona argument because I got an error stating that the job couldn't be submitted if I didn't specify the partition.

I don't understand where I should put the -c argument; I'm already using -c 0.9, is that what you mean?

milot-mirdita commented 3 years ago

I meant the SLURM parameter --cpus-per-task that is abbreviated to -c. Sorry for the confusion.

That seems like an extremely weird error. Is the working directory in which the command is running shared between all nodes (e.g. NFS)?
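
A quick way to check which filesystem the working directory lives on (assuming the command below is run from that directory on a compute node):

df -hT .    # the "Type" column shows nfs, lustre, ext4, etc.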

Xacfran commented 3 years ago

I'm pasting below the whole script I'm trying to run right now:

#!/bin/bash
#SBATCH --job-name=chiro_mt
#SBATCH --output=%x.%j.out
#SBATCH --error=%x.%j.err
#SBATCH --partition=nocona
#SBATCH --nodes=2
#SBATCH --ntasks=256
#SBATCH --reservation=benchmark

module --ignore-cache load gcc/10.2.0 openmpi/4.0.4

INPUTDIR=~/input
MMSEQ=~/MMseqs2/bin
SPLITS=3
QUERYFASTA=all_species.fasta
QUERYDB=DB

cd $INPUTDIR

mmseqs createdb "${QUERYFASTA}" "${QUERYDB}"
mmseqs splitdb "${QUERYDB}" "${QUERYDB}_split" --split $SPLITS

for file in $(ls "$INPUTDIR/${QUERYDB}_split"*_$SPLITS); do
    mmseqs createsubdb "${file}.index" "${QUERYDB}_h" "${file}_h"
done

split=0
for file in $(ls "$INPUTDIR/${QUERYDB}_split"*_$SPLITS); do
    RUNNER="mpirun -np 128" bsub mmseqs cluster "${file}" DB aln${split} tmp --cov-mode 1 -c 0.9 --threads 128
    ((split++))
done

So far, my understanding is that the MPI version is meant to run MMseqs2 on multiple servers, but to run it on multiple cores of a single node, will the "simple" (non-MPI) version work? I think that for the kind of job I'm doing right now, being able to run it on 128 cores on a single node will be enough. Or is the workflow the same as what I pasted above? Thanks a lot.

milot-mirdita commented 3 years ago

You should read the SLURM documentation for the difference between --ntasks and --cpus-per-task. Generally, you should run only 1 task per MPI node (or just 1 task in the simple, non-MPI case), and set --cpus-per-task to the number of cores the node has. If you run more than 1 task per machine, MMseqs2 and SLURM will interact badly. The parallelization approach we use is a hybrid OpenMP/MPI approach.
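
As a sketch of that hybrid scheme for the two-node case above (partition, module versions, and the DB/DB_out names are taken from the earlier scripts; this is not a verified configuration for that cluster):

#!/bin/bash
#SBATCH --partition=nocona
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1      # one MPI rank per node
#SBATCH --cpus-per-task=128      # each rank gets the whole node for its OpenMP threads

module --ignore-cache load gcc/10.2.0 openmpi/4.0.4

# The MPI build of MMseqs2 picks up the MPI launch command from the RUNNER variable
RUNNER="mpirun -np 2" mmseqs cluster DB DB_out tmp --cov-mode 1 -c 0.9 --threads 128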

Xacfran commented 3 years ago

Thanks for your detailed response. Right now our cluster is under maintenance, so I'll make sure to try what you suggest and close this issue if the problem is solved.