soedinglab / MMseqs2

MMseqs2: ultra fast and sensitive search and clustering suite
https://mmseqs.com
MIT License

MPI Split Memory Error #284

Open altaetran opened 4 years ago

altaetran commented 4 years ago

Expected Behavior

Providing the --split-memory-limit option should reduce memory requirements.

Current Behavior

This works only with the standard compiled version of MMseqs2, not with the MPI-compiled version.

Steps to Reproduce (for bugs)

Working on a 60 GB machine with 8 CPUs (each with 2 hyperthreads):

${MMSeqs_bin}/mmseqs createDB in.fasta inDB
${MMSeqs_bin}/mmseqs cluster inDB cluDB tmp --min-seq-id 0.6 -c 0.7 -e 1e-10 --split-memory-limit 3G

Replacing ${MMSeqs_bin} with ${MMSeqs_non_MPI_bin} gives the correct behavior, without crashing.

MMseqs Output (for bugs)

mmseqs linclust inDB cluDB tmp --min-seq-id 0.6 -c 0.7 -e 1e-10 --split-memory-limit 500
linclust inDB cluDB tmp --min-seq-id 0.6 -c 0.7 -e 1e-10 --split-memory-limit 500
MMseqs Version:                         aa175d63658d9aa2e908325a6fd40e9dbb260c9a-MPI
Cluster mode                            0
Max connected component depth           1000
Similarity type                         2
Threads                                 16
Compressed                              0
Verbosity                               3
Substitution matrix                     nucl:nucleotide.out,aa:blosum62.out
Add backtrace                           false
Alignment mode                          2
Allow wrapped scoring                   false
E-value threshold                       1e-10
Seq. id. threshold                      0.6
Min alignment length                    0
Seq. id. mode                           0
Alternative alignments                  0
Coverage threshold                      0.7
Coverage mode                           0
Max sequence length                     65535
Compositional bias                      1
Realign hits                            false
Max reject                              2147483647
Max accept                              2147483647
Include identical seq. id.              false
Preload mode                            0
Pseudo count a                          1
Pseudo count b                          1.5
Score bias                              0
Gap open cost                           11
Gap extension cost                      1
Zdrop                                   40
Alphabet size                           nucl:5,aa:21
k-mers per sequence                     21
Spaced k-mers                           0
Scale k-mers per sequence               0
Adjust k-mer length                     false
Mask residues                           0
Mask lower case residues                0
k-mer length                            0
Shift hash                              67
Split memory limit                      500M
Include only extendable                 false
Skip repeating k-mers                   false
Rescore mode                            0
Remove hits by seq. id. and coverage    false
Sort results                            0
Remove temporary files                  false
Force restart with latest tmp           false
MPI runner
Set cluster mode SET COVER.
beignet-opencl-icd: no supported GPU found, this is probably the wrong opencl-icd package for this hardware
(If you have multiple ICDs installed and OpenCL works, you can ignore this message)
MPI Init
Rank: 0 Size: 1
kmermatcher inDB tmp/9757835994511295515/pref --sub-mat nucl:nucleotide.out,aa:blosum62.out --alph-size nucl:5,aa:13 --min-seq-id 0.6 --kmer-per-seq 21 --spaced-kmer-mode 0 --kmer-per-seq-scale 0 --adjust-kmer-len 0 --mask 0 --mask-lower-case 0 --cov-mode 0 -k 0 -c 0.7 --max-seq-len 65535 --hash-shift 67 --split-memory-limit 500M --include-only-extendable 0 --ignore-multi-kmer 0 --threads 16 --compressed 0 -v 3
kmermatcher inDB tmp/9757835994511295515/pref --sub-mat nucl:nucleotide.out,aa:blosum62.out --alph-size nucl:5,aa:13 --min-seq-id 0.6 --kmer-per-seq 21 --spaced-kmer-mode 0 --kmer-per-seq-scale 0 --adjust-kmer-len 0 --mask 0 --mask-lower-case 0 --cov-mode 0 -k 0 -c 0.7 --max-seq-len 65535 --hash-shift 67 --split-memory-limit 500M --include-only-extendable 0 --ignore-multi-kmer 0 --threads 16 --compressed 0 -v 3
Database size: 62947276 type: Aminoacid
Reduced amino acid alphabet: (A S T) (C) (D B N) (E Q Z) (F Y) (G) (H) (I V) (K R) (L J M) (P) (W) (X)
Not enough memory to process at once need to split
[=================================================================] 100.00% 62.95M 2m 27s 641ms
Process file into 51 parts
Can not allocate memory
Error: kmermatcher died

The non-MPI version, however, works fine. The only departure from normal behavior may be the messages at the top of this output:

Sequence 45461797 does not contain any sequence for key 62885869!
Sequence 45461798 does not contain any sequence for key 62886128!
[=================================================================] 100.00% 461.84K 2s 387ms
Sequence 45461799 does not contain any sequence for key 62888288!
Add missing connections
[=================================================================] 100.00% 45.46M 1s 608ms
Time for read in: 0h 0m 28s 571ms
Total time: 0h 0m 40s 655ms
Size of the sequence database: 45461840
Size of the alignment database: 45461840
Number of clusters: 44582196
Writing results 0h 0m 16s 390ms

Your Environment

Version with error: aa175d63658d9aa2e908325a6fd40e9dbb260c9a-MPI
Version without error: 14014cd0ec50049f796f153ea8a72164ff4b8b45

Both are self compiled on the same operating system (Debian 9 Stretch).

uname -mrs gives Linux 4.19.0-0.bpo.6-amd64 x86_64

Running on an 8-core, 60 GB machine (2 hyperthreads per core).

Thank you!

martin-steinegger commented 4 years ago

To use MMseqs2 with MPI, you need to call mmseqs2 through mpirun, but there is no need to use MPI on a single machine. MMseqs2 will use all cores by default.

altaetran commented 4 years ago

I get the same type of error when running the parallel MPI version with "RUNNER=mpirun -np 8". I only encountered this error while testing. Ideally I would like to use a cluster configuration, but I will not be able to if this error persists.
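
For reference, the two launch modes discussed in this thread can be sketched as a small dry-run script. This is only an illustration, not a fix for the error: it assumes an MPI build can be recognized by a version string ending in "-MPI" (as in the log above), and the helper names is_mpi_build and build_command are hypothetical. It prints the command line rather than running mmseqs.

```shell
#!/bin/sh
# Dry-run sketch: choose how to launch mmseqs depending on the build type.
# Assumption: MPI builds report a version string ending in "-MPI",
# as seen in the log above. Helper names are hypothetical.

# is_mpi_build VERSION -> exit status 0 if VERSION ends in "-MPI"
is_mpi_build() {
    case "$1" in
        *-MPI) return 0 ;;
        *)     return 1 ;;
    esac
}

# build_command VERSION NPROCS -> prints the command line that would be run
build_command() {
    cmd='mmseqs linclust inDB cluDB tmp --min-seq-id 0.6 -c 0.7 -e 1e-10 --split-memory-limit 500M'
    if is_mpi_build "$1"; then
        # MPI build: pass the launcher through the RUNNER environment variable
        printf 'RUNNER="mpirun -np %s" %s\n' "$2" "$cmd"
    else
        # Non-MPI build: plain invocation, threads are used by default
        printf '%s\n' "$cmd"
    fi
}

# Example with the two versions reported in this issue
build_command "aa175d63658d9aa2e908325a6fd40e9dbb260c9a-MPI" 8
build_command "14014cd0ec50049f796f153ea8a72164ff4b8b45" 8
```

On a real system the version string would come from the binary itself rather than being typed in, and the printed line would be executed instead of echoed.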