soedinglab / MMseqs2

MMseqs2: ultra fast and sensitive search and clustering suite
https://mmseqs.com
GNU General Public License v3.0

Error: Prefilter & Search step died with mmseqs 15.6f452 easy-cluster #860

Open Danderson123 opened 6 days ago

Danderson123 commented 6 days ago

Expected Behavior

mmseqs easy-cluster should finish without errors.

Current Behavior

Query database size: 19552 type: Nucleotide
Estimated memory consumption: 8G
Target database size: 9776 type: Nucleotide
tmp/3198441352783276465/clu_tmp/13016959338117486175/nucleotide_clustering.sh: line 48: 972066 Killed                  $RUNNER "$MMSEQS" prefilter "$QUERY" "$INPUT" "${TMP_PATH}/pref" ${PREFILTER_PAR}
Error: Prefilter step died
Error: Search died

Steps to Reproduce (for bugs)

Please make sure to execute the reproduction steps with newly recreated and empty tmp folders.

1) Download the FASTA at this link: https://drive.google.com/file/d/1YPNMj2gL8zNUv9aiRo7dLJawanDWCIb3/view?usp=drive_link
2) Install mmseqs2 v15.6f452
3) Run:

mmseqs easy-cluster all_sequences.fasta  mmseqs_output tmp --cluster-mode 1 --cluster-reassign 1 --threads 24 -c 0.0 --cov-mode 5 --min-seq-id 0.8

MMseqs Output (for bugs)

Please make sure to also post the complete output of MMseqs. You can use gist.github.com for large output.

Create directory tmp
easy-cluster /hps/nobackup/iqbal/dander/amira_panRG_pipeline/Escherichia_coli_panRG_c_0.8_l_0_train_AMR_alleles_removed_mmseqs2/all_sequences.fasta mmseqs_output/mmseqs_output tmp --cluster-mode 1 --cluster-reassign 1 --threads 24 -c 0.0 --cov-mode 5 --min-seq-id 0.8 

MMseqs Version:                         15.6f452
Substitution matrix                     aa:blosum62.out,nucl:nucleotide.out
Seed substitution matrix                aa:VTML80.out,nucl:nucleotide.out
Sensitivity                             4
k-mer length                            0
Target search mode                      0
k-score                                 seq:2147483647,prof:2147483647
Alphabet size                           aa:21,nucl:5
Max sequence length                     65535
Max results per query                   20
Split database                          0
Split mode                              2
Split memory limit                      0
Coverage threshold                      0
Coverage mode                           5
Compositional bias                      1
Compositional bias                      1
Diagonal scoring                        true
Exact k-mer matching                    0
Mask residues                           1
Mask residues probability               0.9
Mask lower case residues                0
Minimum diagonal score                  15
Selected taxa                       
Include identical seq. id.              false
Spaced k-mers                           1
Preload mode                            0
Pseudo count a                          substitution:1.100,context:1.400
Pseudo count b                          substitution:4.100,context:5.800
Spaced k-mer pattern                
Local temporary path                
Threads                                 24
Compressed                              0
Verbosity                               3
Add backtrace                           false
Alignment mode                          3
Alignment mode                          0
Allow wrapped scoring                   false
E-value threshold                       0.001
Seq. id. threshold                      0.8
Min alignment length                    0
Seq. id. mode                           0
Alternative alignments                  0
Max reject                              2147483647
Max accept                              2147483647
Score bias                              0
Realign hits                            false
Realign score bias                      -0.2
Realign max seqs                        2147483647
Correlation score weight                0
Gap open cost                           aa:11,nucl:5
Gap extension cost                      aa:1,nucl:2
Zdrop                                   40
Rescore mode                            0
Remove hits by seq. id. and coverage    false
Sort results                            0
Cluster mode                            1
Max connected component depth           1000
Similarity type                         2
Weight file name                    
Cluster Weight threshold                0.9
Single step clustering                  false
Cascaded clustering steps               3
Cluster reassign                        true
Remove temporary files                  true
Force restart with latest tmp           false
MPI runner                          
k-mers per sequence                     21
Scale k-mers per sequence               aa:0.000,nucl:0.200
Adjust k-mer length                     false
Shift hash                              67
Include only extendable                 false
Skip repeating k-mers                   false
Database type                           0
Shuffle input database                  true
Createdb mode                           1
Write lookup file                       0
Offset of numeric ids                   0

createdb /hps/nobackup/iqbal/dander/amira_panRG_pipeline/Escherichia_coli_panRG_c_0.8_l_0_train_AMR_alleles_removed_mmseqs2/all_sequences.fasta tmp/3198441352783276465/input --dbtype 0 --shuffle 1 --createdb-mode 1 --write-lookup 0 --id-offset 0 --compressed 0 -v 3 

Shuffle database cannot be combined with --createdb-mode 0
We recompute with --shuffle 0
Converting sequences
[95951] 0s 927ms
Time for merging to input_h: 0h 0m 0s 47ms
Time for merging to input: 0h 0m 0s 25ms
Database type: Nucleotide
Time for processing: 0h 0m 1s 143ms
Create directory tmp/3198441352783276465/clu_tmp
cluster tmp/3198441352783276465/input tmp/3198441352783276465/clu tmp/3198441352783276465/clu_tmp --max-seqs 20 -c 0 --cov-mode 5 --spaced-kmer-mode 1 --threads 24 --alignment-mode 3 -e 0.001 --min-seq-id 0.8 --cluster-mode 1 --cluster-reassign 1 --remove-tmp-files 1 

Set cluster sensitivity to -s 1.000000
Connected component clustering produces less clusters in a single step clustering.
Please use --single-step-clustering
Set cluster iterations to 1
linclust tmp/3198441352783276465/input tmp/3198441352783276465/clu_tmp/13016959338117486175/clu_redundancy tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust --cluster-mode 1 --max-iterations 1000 --similarity-type 2 --threads 24 --compressed 0 -v 3 --cluster-weight-threshold 0.9 --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' -a 0 --alignment-mode 3 --alignment-output-mode 0 --wrapped-scoring 0 -e 0.001 --min-seq-id 0.8 --min-aln-len 0 --seq-id-mode 0 --alt-ali 0 -c 0 --cov-mode 5 --max-seq-len 10000 --comp-bias-corr 0 --comp-bias-corr-scale 1 --max-rejected 2147483647 --max-accept 2147483647 --add-self-matches 0 --db-load-mode 0 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --score-bias 0 --realign 0 --realign-score-bias -0.2 --realign-max-seqs 2147483647 --corr-score-weight 0 --gap-open aa:11,nucl:5 --gap-extend aa:1,nucl:2 --zdrop 40 --alph-size aa:21,nucl:5 --kmer-per-seq 21 --spaced-kmer-mode 1 --kmer-per-seq-scale aa:0.000,nucl:0.200 --adjust-kmer-len 0 --mask 1 --mask-prob 0.9 --mask-lower-case 0 -k 0 --hash-shift 67 --split-memory-limit 0 --include-only-extendable 0 --ignore-multi-kmer 0 --rescore-mode 0 --filter-hits 0 --sort-results 0 --remove-tmp-files 1 --force-reuse 0 

kmermatcher tmp/3198441352783276465/input tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/pref --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' --alph-size aa:21,nucl:5 --min-seq-id 0.8 --kmer-per-seq 21 --spaced-kmer-mode 1 --kmer-per-seq-scale aa:0.000,nucl:0.200 --adjust-kmer-len 0 --mask 1 --mask-prob 0.9 --mask-lower-case 0 --cov-mode 5 -k 0 -c 0 --max-seq-len 10000 --hash-shift 67 --split-memory-limit 0 --include-only-extendable 0 --ignore-multi-kmer 0 --threads 24 --compressed 0 -v 3 --cluster-weight-threshold 0.9 

kmermatcher tmp/3198441352783276465/input tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/pref --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' --alph-size aa:21,nucl:5 --min-seq-id 0.8 --kmer-per-seq 21 --spaced-kmer-mode 1 --kmer-per-seq-scale aa:0.000,nucl:0.200 --adjust-kmer-len 0 --mask 1 --mask-prob 0.9 --mask-lower-case 0 --cov-mode 5 -k 0 -c 0 --max-seq-len 10000 --hash-shift 67 --split-memory-limit 0 --include-only-extendable 0 --ignore-multi-kmer 0 --threads 24 --compressed 0 -v 3 --cluster-weight-threshold 0.9 

Database size: 96025 type: Nucleotide

Generate k-mers list for 1 split
[=================================================================] 100.00% 96.03K 0s 882ms    

Adjusted k-mer length 17
Sort kmer 0h 0m 0s 257ms
Sort by rep. sequence 0h 0m 0s 133ms
Time for fill: 0h 0m 0s 107ms
Time for merging to pref: 0h 0m 0s 28ms
Time for processing: 0h 0m 1s 808ms
rescorediagonal tmp/3198441352783276465/input tmp/3198441352783276465/input tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/pref tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/pref_rescore1 --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' --rescore-mode 0 --wrapped-scoring 0 --filter-hits 0 -e 0.001 -c 0.5 -a 0 --cov-mode 5 --min-seq-id 0.8 --min-aln-len 0 --seq-id-mode 0 --add-self-matches 0 --sort-results 0 --db-load-mode 0 --threads 24 --compressed 0 -v 3 

[=================================================================] 100.00% 96.03K 0s 151ms     
Time for merging to pref_rescore1: 0h 0m 0s 241ms
Time for processing: 0h 0m 1s 33ms
clust tmp/3198441352783276465/input tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/pref_rescore1 tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/pre_clust --cluster-mode 1 --max-iterations 1000 --similarity-type 2 --threads 24 --compressed 0 -v 3 --cluster-weight-threshold 0.9 

Clustering mode: Connected Component
[=================================================================] 100.00% 96.03K 0s 106ms    
Sort entries
Find missing connections
Found 292030 new connections.
Reconstruct initial order
[=================================================================] 100.00% 96.03K 0s 75ms     
Add missing connections
[=================================================================] 100.00% 96.03K 0s 10ms     

Time for read in: 0h 0m 0s 294ms
connected component mode
Total time: 0h 0m 0s 339ms

Size of the sequence database: 96025
Size of the alignment database: 96025
Number of clusters: 10913

Writing results 0h 0m 0s 4ms
Time for merging to pre_clust: 0h 0m 0s 11ms
Time for processing: 0h 0m 0s 464ms
createsubdb tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/order_redundancy tmp/3198441352783276465/input tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/input_step_redundancy -v 3 --subdb-mode 1 

Time for merging to input_step_redundancy: 0h 0m 0s 7ms
Time for processing: 0h 0m 0s 54ms
createsubdb tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/order_redundancy tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/pref tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/pref_filter1 -v 3 --subdb-mode 1 

Time for merging to pref_filter1: 0h 0m 0s 15ms
Time for processing: 0h 0m 0s 62ms
filterdb tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/pref_filter1 tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/pref_filter2 --filter-file tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/order_redundancy --threads 24 --compressed 0 -v 3 

Filtering using file(s)
[=================================================================] 100.00% 10.91K 0s 66ms     
Time for merging to pref_filter2: 0h 0m 0s 170ms
Time for processing: 0h 0m 0s 603ms
align tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/input_step_redundancy tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/input_step_redundancy tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/pref_filter2 tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/aln --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' -a 0 --alignment-mode 3 --alignment-output-mode 0 --wrapped-scoring 0 -e 0.001 --min-seq-id 0.8 --min-aln-len 0 --seq-id-mode 0 --alt-ali 0 -c 0 --cov-mode 5 --max-seq-len 10000 --comp-bias-corr 0 --comp-bias-corr-scale 1 --max-rejected 2147483647 --max-accept 2147483647 --add-self-matches 0 --db-load-mode 0 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --score-bias 0 --realign 0 --realign-score-bias -0.2 --realign-max-seqs 2147483647 --corr-score-weight 0 --gap-open aa:11,nucl:5 --gap-extend aa:1,nucl:2 --zdrop 40 --threads 24 --compressed 0 -v 3 

Compute score, coverage and sequence identity
Query database size: 10913 type: Nucleotide
Target database size: 10913 type: Nucleotide
Calculation of alignments
[=================================================================] 100.00% 10.91K 0s 56ms     
Time for merging to aln: 0h 0m 0s 218ms
15033 alignments calculated
12475 sequence pairs passed the thresholds (0.829841 of overall calculated)
1.143132 hits per query sequence
Time for processing: 0h 0m 0s 663ms
clust tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/input_step_redundancy tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/aln tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/clust --cluster-mode 1 --max-iterations 1000 --similarity-type 2 --threads 24 --compressed 0 -v 3 --cluster-weight-threshold 0.9 

Clustering mode: Connected Component
[=================================================================] 100.00% 10.91K 0s 8ms      
Sort entries
Find missing connections
Found 1562 new connections.
Reconstruct initial order
[=================================================================] 100.00% 10.91K 0s 120ms    
Add missing connections
[=================================================================] 100.00% 10.91K 0s 0ms      

Time for read in: 0h 0m 0s 235ms
connected component mode
Total time: 0h 0m 0s 261ms

Size of the sequence database: 10913
Size of the alignment database: 10913
Number of clusters: 9776

Writing results 0h 0m 0s 1ms
Time for merging to clust: 0h 0m 0s 16ms
Time for processing: 0h 0m 0s 316ms
mergeclusters tmp/3198441352783276465/input tmp/3198441352783276465/clu_tmp/13016959338117486175/clu_redundancy tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/pre_clust tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/clust --threads 24 --compressed 0 -v 3 

Clustering step 1
[=================================================================] 100.00% 10.91K 0s 67ms     
Clustering step 2
[=================================================================] 100.00% 9.78K 0s 122ms    
Write merged clustering
[=================================================================] 100.00% 96.03K 0s 496ms    
Time for merging to clu_redundancy: 0h 0m 0s 270ms
Time for processing: 0h 0m 0s 979ms
rmdb tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/pref_filter1 -v 3 

Time for processing: 0h 0m 0s 16ms
rmdb tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/pref -v 3 

Time for processing: 0h 0m 0s 10ms
rmdb tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/pref_rescore1 -v 3 

Time for processing: 0h 0m 0s 110ms
rmdb tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/pre_clust -v 3 

Time for processing: 0h 0m 0s 29ms
rmdb tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/input_step_redundancy -v 3 

Time for processing: 0h 0m 0s 19ms
rmdb tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/input_step_redundancy_h -v 3 

Time for processing: 0h 0m 0s 9ms
rmdb tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/pref_filter2 -v 3 

Time for processing: 0h 0m 0s 77ms
rmdb tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/aln -v 3 

Time for processing: 0h 0m 0s 112ms
rmdb tmp/3198441352783276465/clu_tmp/13016959338117486175/linclust/12836794075397166753/clust -v 3 

Time for processing: 0h 0m 0s 12ms
createsubdb tmp/3198441352783276465/clu_tmp/13016959338117486175/clu_redundancy tmp/3198441352783276465/input tmp/3198441352783276465/clu_tmp/13016959338117486175/input_step_redundancy -v 3 --subdb-mode 1 

Time for merging to input_step_redundancy: 0h 0m 0s 21ms
Time for processing: 0h 0m 0s 92ms
extractframes tmp/3198441352783276465/clu_tmp/13016959338117486175/input_step_redundancy tmp/3198441352783276465/clu_tmp/13016959338117486175/query_seqs --forward-frames 1 --reverse-frames 1 --create-lookup 0 --threads 24 --compressed 0 -v 3 

[=================================================================] 100.00% 9.78K 0s 59ms     
Time for merging to query_seqs_h: 0h 0m 0s 439ms
Time for merging to query_seqs: 0h 0m 0s 494ms
Time for processing: 0h 0m 2s 117ms
prefilter tmp/3198441352783276465/clu_tmp/13016959338117486175/query_seqs tmp/3198441352783276465/clu_tmp/13016959338117486175/input_step_redundancy tmp/3198441352783276465/clu_tmp/13016959338117486175/pref --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' --seed-sub-mat 'aa:VTML80.out,nucl:nucleotide.out' -s 1 -k 15 --target-search-mode 0 --k-score seq:2147483647,prof:2147483647 --alph-size aa:21,nucl:5 --max-seq-len 10000 --max-seqs 20 --split 0 --split-mode 2 --split-memory-limit 0 -c 0 --cov-mode 5 --comp-bias-corr 0 --comp-bias-corr-scale 1 --diag-score 0 --exact-kmer-matching 1 --mask 1 --mask-prob 0.9 --mask-lower-case 0 --min-ungapped-score 60 --add-self-matches 0 --spaced-kmer-mode 1 --db-load-mode 0 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --threads 24 --compressed 0 -v 3 

Query database size: 19552 type: Nucleotide
Estimated memory consumption: 8G
Target database size: 9776 type: Nucleotide
tmp/3198441352783276465/clu_tmp/13016959338117486175/nucleotide_clustering.sh: line 48: 1648954 Killed                  $RUNNER "$MMSEQS" prefilter "$QUERY" "$INPUT" "${TMP_PATH}/pref" ${PREFILTER_PAR}
Error: Prefilter step died
Error: Search died

Context

Providing context helps us come up with a solution and improve our documentation for the future.

I am trying to cluster a number of gene sequences with an identity of 0.8 and no minimum length for the aligned portion of the genes.

Your Environment

Include as many relevant details about the environment you experienced the bug in.

Danderson123 commented 4 days ago

This issue seems to come up fairly regularly, so for anyone looking for a solution: mmseqs v14.7e284 worked on these sequences for me.

milot-mirdita commented 4 days ago

Killed means that MMseqs2 didn't have enough RAM and was killed by the OOM killer.
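One rough way to check whether an OOM kill is plausible is to compare the memory the node currently has available with the "Estimated memory consumption: 8G" line from the log. The snippet below is only a sketch under assumptions: it is Linux-only, reads `MemAvailable` from `/proc/meminfo`, and the variable names are illustrative. On a scheduler-managed cluster the job's own memory limit (not checked here) may be lower than what the node reports.

```shell
#!/bin/sh
# Estimated memory consumption printed by the prefilter step in the log above (8G).
ESTIMATED_GB=8

# MemAvailable is reported in kB in /proc/meminfo (Linux only).
# Falls back to 0 if the field is missing.
AVAILABLE_KB=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo 2>/dev/null)
AVAILABLE_GB=$(( ${AVAILABLE_KB:-0} / 1024 / 1024 ))

if [ "$AVAILABLE_GB" -lt "$ESTIMATED_GB" ]; then
    echo "Only ${AVAILABLE_GB}G available, estimate is ${ESTIMATED_GB}G: an OOM kill is plausible"
else
    echo "${AVAILABLE_GB}G available, estimate is ${ESTIMATED_GB}G"
fi
```

The estimate covers only the prefilter index, so having slightly more than 8G free does not guarantee the run will fit.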

I recommend assigning nodes exclusively to the MMseqs2 job so that another job can't steal the RAM. Alternatively, you can use the --split-memory-limit parameter to approximately set how much RAM MMseqs2 is allowed to use.

The parameter determines the chunk size used by the prefilter, so it does not map directly to total RAM use. I recommend setting it to about 80% of the RAM you want to allow MMseqs2 to use.

So, assuming you reserve 64GB of RAM, you can pass --split-memory-limit 50G to make MMseqs2 behave better alongside other software that requires a lot of RAM.
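As a sketch of how that advice could be applied to the command from the report: the snippet below derives a limit of roughly 80% of the reserved RAM and appends `--split-memory-limit` to the original `easy-cluster` call. `RESERVED_GB` and `LIMIT_GB` are illustrative variable names, and 64 is just the figure used above. Integer arithmetic gives 51G here, which is close to the suggested 50G.

```shell
#!/bin/sh
# Hypothetical value: RAM reserved for the job, in GB.
RESERVED_GB=64

# Roughly 80% of the reservation, using integer arithmetic (64 -> 51).
LIMIT_GB=$(( RESERVED_GB * 80 / 100 ))

# Same command as in the report, with --split-memory-limit added.
# Printed rather than executed; remove the leading "echo" to run it.
echo mmseqs easy-cluster all_sequences.fasta mmseqs_output tmp \
    --cluster-mode 1 --cluster-reassign 1 --threads 24 \
    -c 0.0 --cov-mode 5 --min-seq-id 0.8 \
    --split-memory-limit "${LIMIT_GB}G"
```

As with the original report, run this with a newly created, empty tmp folder so that no partial results from the killed run are reused.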