mmseqs much slower than the MMseqs2 MSA server

zhanxw commented 2 years ago

Expected Behavior

The analysis finished in minutes on the MMseqs2 MSA server using ColabFold.

Current Behavior

Local mmseqs always stalls for hours without generating any output.

Steps to Reproduce (for bugs)

Please make sure to execute the reproduction steps with newly recreated and empty tmp folders. I am using colab_search, which calls mmseqs like this: search search_results/qdb db/uniref30_2103_db search_results/res search_results/tmp --num-iterations 3 --db-load-mode 2 -a -s 8 -e 0.1 --max-seqs 10000 --split 8. The query contains 4 amino acid sequences, each 493 amino acids long.

Note: when I removed --split 8, mmseqs still halted at a certain stage.

MMseqs Output (for bugs)

search search_results/qdb db/uniref30_2103_db search_results/res search_results/tmp --num-iterations 3 --db-load-mode 2 -a -s 8 -e 0.1 --max-seqs 10000 --split 8

MMseqs Version:                         b768f48f0bd73688b6a68132159a97f7b6310f03
Substitution matrix                     aa:blosum62.out,nucl:nucleotide.out
Add backtrace                           true
Alignment mode                          2
Alignment mode                          0
Allow wrapped scoring                   false
E-value threshold                       0.1
Seq. id. threshold                      0
Min alignment length                    0
Seq. id. mode                           0
Alternative alignments                  0
Coverage threshold                      0
Coverage mode                           0
Max sequence length                     65535
Compositional bias                      1
Max reject                              2147483647
Max accept                              2147483647
Include identical seq. id.              false
Preload mode                            2
Pseudo count a                          substitution:1.100,context:1.400
Pseudo count b                          substitution:4.100,context:5.800
Score bias                              0
Realign hits                            false
Realign score bias                      -0.2
Realign max seqs                        2147483647
Correlation score weight                0
Gap open cost                           aa:11,nucl:5
Gap extension cost                      aa:1,nucl:2
Zdrop                                   40
Threads                                 72
Compressed                              0
Verbosity                               3
Seed substitution matrix                aa:VTML80.out,nucl:nucleotide.out
Sensitivity                             8
k-mer length                            0
k-score                                 seq:2147483647,prof:2147483647
Alphabet size                           aa:21,nucl:5
Max results per query                   10000
Split database                          8
Split mode                              2
Split memory limit                      0
Diagonal scoring                        true
Exact k-mer matching                    0
Mask residues                           1
Mask residues probability               0.9
Mask lower case residues                0
Minimum diagonal score                  15
Spaced k-mers                           1
Spaced k-mer pattern
Local temporary path
Rescore mode                            0
Remove hits by seq. id. and coverage    false
Sort results                            0
Mask profile                            1
Profile E-value threshold               0.1
Global sequence weighting               false
Allow deletions                         false
Filter MSA                              1
Use filter only at N seqs               0
Maximum seq. id. threshold              0.9
Minimum seq. id.                        0.0
Minimum score per column                -20
Minimum coverage                        0
Select N most diverse seqs              1000
Pseudo count mode                       0
Gap pseudo count                        10
Min codons in orf                       30
Max codons in length                    32734
Max orf gaps                            2147483647
Contig start mode                       2
Contig end mode                         2
Orf start mode                          1
Forward frames                          1,2,3
Reverse frames                          1,2,3
Translation table                       1
Translate orf                           0
Use all table starts                    false
Offset of numeric ids                   0
Create lookup                           0
Add orf stop                            false
Overlap between sequences               0
Sequence split mode                     1
Header split mode                       0
Chain overlapping alignments            0
Merge query                             1
Search type                             0
Search iterations                       3
Start sensitivity                       4
Search steps                            1
Exhaustive search mode                  false
Filter results during exhaustive search 0
Strand selection                        1
LCA search mode                         false
Disk space limit                        0
MPI runner
Force restart with latest tmp           false
Remove temporary files                  false

prefilter search_results/qdb db/uniref30_2103_db.idx search_results/tmp/12005814431969335264/pref_0 --sub-mat aa:blosum62.out,nucl:nucleotide.out --seed-sub-mat aa:VTML80.out,nucl:nucleotide.out -s 8 -k 0 --k-score seq:2147483647,prof:2147483647 --alph-size aa:21,nucl:5 --max-seq-len 65535 --max-seqs 10000 --split 8 --split-mode 2 --split-memory-limit 0 -c 0 --cov-mode 0 --comp-bias-corr 1 --diag-score 1 --exact-kmer-matching 0 --mask 1 --mask-prob 0.9 --mask-lower-case 0 --min-ungapped-score 15 --add-self-matches 0 --spaced-kmer-mode 1 --db-load-mode 2 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --threads 72 --compressed 0 -v 3

Index version: 16
Generated by:  b768f48f0bd73688b6a68132159a97f7b6310f03
ScoreMatrix:  VTML80.out
Query database size: 190 type: Aminoacid
Estimated memory consumption: 148G
Target database size: 29291635 type: Aminoacid
Process prefiltering step 1 of 1

k-mer similarity threshold: 96
Starting prefiltering scores calculation (step 1 of 1)
Query db start 1 to 190
Target db start 1 to 29291635
^CTraceback (most recent call last):                              ] 37.57% 72 eta 0s

I had to stop it because mmseqs ran for hours without making progress.

Context

I am quite puzzled about how to figure this out. The machine is on our cluster, so there is enough disk space and memory. I checked the process status: it is always in the D state (uninterruptible sleep) with 100-200% CPU usage, based on htop output. I am not sure how to speed things up at this stage.
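A process that stays in the D state is usually blocked on I/O rather than computing, so it can help to sample the state over time. The following is only a minimal sketch for Linux: the helper names `proc_state` and `sample_mmseqs_states` are made up for illustration and are not part of MMseqs2, and it assumes `/proc` and `pgrep` are available on the node.

```shell
#!/bin/sh
# Hypothetical helper (not part of MMseqs2): print the one-letter scheduler
# state of a process. "D" is uninterruptible sleep, typically waiting on
# disk or network file system I/O.
proc_state() {
    # The State line in /proc/<pid>/status looks like "State:  S (sleeping)".
    awk '/^State:/ {print $2}' "/proc/$1/status"
}

# Hypothetical helper: print the state of every process named "mmseqs"
# once per second, for the given number of samples (default 5).
sample_mmseqs_states() {
    samples="${1:-5}"
    i=0
    while [ "$i" -lt "$samples" ]; do
        # pgrep prints nothing when no process matches, so the loop is skipped.
        for pid in $(pgrep -x mmseqs); do
            printf '%s pid=%s state=%s\n' "$(date +%T)" "$pid" "$(proc_state "$pid")"
        done
        sleep 1
        i=$((i + 1))
    done
}
```

Calling `sample_mmseqs_states 10` while the search is running prints one line per process per second. If nearly every sample shows `D`, the job is most likely waiting on storage rather than on the CPU.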

Your Environment

Include as many relevant details about the environment you experienced the bug in.

Git commit used (The string after "MMseqs Version:" when you execute MMseqs without any parameters): b768f48f0bd73688b6a68132159a97f7b6310f03
Which MMseqs version was used (Statically-compiled, self-compiled, Homebrew, etc.): self-compiled
For self-compiled and Homebrew: Compiler and Cmake versions used and their invocation: gcc 6.1
Server specifications (especially CPU support for AVX2/SSE and amount of system memory): Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz, support AVX2/SSE, total 503 G memory (free -g)
Operating system and version: Red Hat Enterprise Linux Server release 7.6 (Maipo)

zhanxw commented 2 years ago

The issue is probably related to the file system. I will close it for now.

zhanxw commented 2 years ago

I changed --db-load-mode from 2 to 3, and performance improved a lot. Where can I find the documentation for the option `--db-load-mode`? I just want to understand this better.

martin-steinegger commented 2 years ago

Here you can read more about MMseqs2: https://github.com/soedinglab/MMseqs2/wiki

zhanxw commented 2 years ago

I read the wiki and User Guide. Although there are examples about --db-load-mode 2, none mentions or explains --db-load-mode 3.

AlvinLeopold commented 11 months ago

I think I ran into the same problem as you, and my HPC node is similar to yours. It kept running for almost 17 hours with no progress. When you set --db-load-mode 3 and reran it, how long did it take before you saw output?

Any answer would be helpful! Thanks!

zhanxw commented 1 day ago

> I read the wiki and User Guide. Although there are examples about --db-load-mode 2, none mentions or explains --db-load-mode 3.

This code explains: https://github.com/soedinglab/MMseqs2/blob/87e7103d289029dc3345f85ea9a4c4c6d6416e46/src/prefiltering/PrefilteringIndexReader.cpp#L385

Basically, --db-load-mode 3 is the combination of --db-load-mode 2 and vmtouch: mmseqs will mmap the database and also load the necessary data into memory.
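To make the comparison concrete, here is a rough sketch of doing the "touch" step by hand before an mmap-based search. The helper name `preload_index` is hypothetical, and the assumption is that the index files share the database prefix with an `.idx` suffix, as in the log above. `vmtouch -t` is used if it is installed; otherwise the files are simply read once, which also pulls them into the page cache.

```shell
#!/bin/sh
# Hypothetical helper (not part of MMseqs2): pull the index files of a
# database into the OS page cache before running a search.
# $1 is the database prefix, e.g. db/uniref30_2103_db
preload_index() {
    if command -v vmtouch >/dev/null 2>&1; then
        # -t touches every page of the files so they become resident in memory.
        vmtouch -t "$1".idx*
    else
        # Fallback: a sequential read also populates the page cache.
        cat "$1".idx* > /dev/null
    fi
}
```

A run might then look like `preload_index db/uniref30_2103_db` followed by the original search with --db-load-mode 2. This only helps if the node has enough free memory to keep the whole index cached, and the cache can be evicted by other jobs, which is one reason to prefer --db-load-mode 3 for one-off runs.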

zhanxw commented 1 day ago

> I think I ran into the same problem as you, and my HPC node is similar to yours. It kept running for almost 17 hours with no progress. When you set --db-load-mode 3 and reran it, how long did it take before you saw output?
>
> Any answer would be helpful! Thanks!

Hard to give a number. In my case, --db-load-mode 2 halted indefinitely, while --db-load-mode 3 at least produced results.
