soedinglab / MMseqs2

MMseqs2: ultra fast and sensitive search and clustering suite
https://mmseqs.com
GNU General Public License v3.0

Floating point exception running mmseqs search with index #31

Closed. abiadak closed this issue 7 years ago

abiadak commented 7 years ago

Expected Behavior

Obtaining similar sequences to the queries from the target database

Current Behavior

Floating point exception at the prefilter step

Steps to Reproduce (for bugs)

The target database is the current nr (protein) database from NCBI (~120M sequences, 69 GB). Index creation runs fine:

mmseqs createindex nr

Program call:
nr 

MMseqs Version:         a81227565da4e95d233e3bcbd5c0cdc6ada1c14a
Sub Matrix              blosum62.out
K-mer size              0
Alphabet size           21
Max. sequence length    32000
Mask Residues           1
Split DB                0
Spaced Kmer             1
Threads                 64
Verbosity               3
...
Write MMSEQSFFINDEX 
Time for merging files: 0 h 0 m 0 s
Done. 

ls -lrt

-rw-r--r--. 1 root root   2773738984 may 11 14:05 nr.lookup
-rw-r--r--. 1 root root  28462541941 may 11 14:07 nr_h
-rw-r--r--. 1 root root   2967783911 may 11 14:07 nr_h.index
-rw-r--r--. 1 root root  44976760168 may 11 14:10 nr
-rw-r--r--. 1 root root   3020702058 may 11 14:10 nr.index
drwxr-xr-x. 2 root root            6 may 12 12:52 tmp
-rw-r--r--. 1 root root   3020702058 may 12 13:23 nr.sk7.mmseqsindex
-rw-r--r--. 1 root root 330684926197 may 12 13:23 nr.sk7
-rw-r--r--. 1 root root          344 may 12 13:23 nr.sk7.index

When launching the search:

mmseqs search mmseq-testDB /junk/databases/mmseqs/nr test-2-mmseqsDB tmp

Program call:
mmseq-testDB /junk/databases/mmseqs/nr test-2-mmseqsDB tmp 

MMseqs Version:                     a81227565da4e95d233e3bcbd5c0cdc6ada1c14a
Sub Matrix                          blosum62.out
Add backtrace                       false
Alignment mode                      0
E-value threshold                   0.001
Seq. Id Threshold                   0
Coverage threshold                  0
Target Coverage threshold           0
Max. sequence length                32000
Max. results per query              300
Compositional bias                  1
Query queryProfile                  false
Realign hit                         false
Max Reject                          2147483647
Max Accept                          2147483647
Include identical Seq. Id.          false
Threads                             64
Verbosity                           3
Sensitivity                         4
K-mer size                          0
K-score                             2147483647
Alphabet size                       21
Target queryProfile                 false
Offset result                       0
Split DB                            0
Split mode                          2
Diagonal Scoring                    1
Mask Residues                       1
Minimum Diagonal score              15
Spaced Kmer                         1
Profile e-value threshold           0.001
Use global sequence weighting       false
Maximum sequence identity threshold 0.9
Minimum seq. id.                    0
Minimum score per column            -20
Minimum coverage                    0
Select n most diverse seqs          1000
Pseudo count a                      1
Pseudo count b                      1.5
Number search iterations            1
Start sensitivity                   4
sensitivity step size               1
Sets the MPI runner                 
Remove Temporary Files              false

/root/tmp/blast
/root/tmp/blast
Program call:
mmseq-testDB /junk/databases/mmseqs/nr /root/tmp/blast/tmp/pref_4 --sub-mat blosum62.out -k 0 --k-score 2147483647 --alph-size 21 --max-seq-len 32000 --max-seqs 300 --offset-result 0 --split 0 --split-mode 2 -c 0 --comp-bias-corr 1 --diag-score 1 --mask 1 --min-ungapped-score 15 --spaced-kmer-mode 1 --threads 64 -v 3 -s 4 

MMseqs Version:             a81227565da4e95d233e3bcbd5c0cdc6ada1c14a
Sub Matrix                  blosum62.out
Sensitivity                 4
K-mer size                  0
K-score                     2147483647
Alphabet size               21
Max. sequence length        32000
Query queryProfile          false
Target queryProfile         false
Max. results per query      300
Offset result               0
Split DB                    0
Split mode                  2
Coverage threshold          0
Compositional bias          1
Diagonal Scoring            1
Mask Residues               1
Minimum Diagonal score      15
Include identical Seq. Id.  false
Spaced Kmer                 1
Threads                     64
Verbosity                   3

Initialising data structures...
Using 64 threads.

Use index  /junk/databases/mmseqs/nr.sk7
Index version: 774909490
KmerSize:     7
AlphabetSize: 21
Skip:         0
Split:        0
Type:         1
Spaced:       1
Query database: mmseq-testDB(size=2467)
Target database: /junk/databases/mmseqs/nr(size=121935717)
Use kmer size 7 and split 0 using split mode 0
tmp/blastp.sh: line 60: 68389 Floating point exception   $RUNNER $MMSEQS prefilter "$INPUT" "$TARGET_DB_PREF" "$TMP_PATH/pref_$SENS" $PREFILTER_PAR -s $SENS
Program call:
mmseq-testDB /junk/databases/mmseqs/nr /root/tmp/blast/tmp/pref_4 /root/tmp/blast/tmp/aln_4 --sub-mat blosum62.out --alignment-mode 0 -e 0.001 --min-seq-id 0 -c 0 --target-cov 0 --max-seq-len 32000 --max-seqs 300 --comp-bias-corr 1 --max-rejected 2147483647 --max-accept 2147483647 --threads 64 -v 3 

MMseqs Version:             a81227565da4e95d233e3bcbd5c0cdc6ada1c14a
Sub Matrix                  blosum62.out
Add backtrace               false
Alignment mode              0
E-value threshold           0.001
Seq. Id Threshold           0
Coverage threshold          0
Target Coverage threshold   0
Max. sequence length        32000
Max. results per query      300
Compositional bias          1
Query queryProfile          false
Realign hit                 false
Max Reject                  2147483647
Max Accept                  2147483647
Include identical Seq. Id.  false
Threads                     64
Verbosity                   3

Init data structures...
Compute score only.
Using 64 threads.
Could not open data file /root/tmp/blast/tmp/pref_4!
mv: cannot stat '/root/tmp/blast/tmp/aln_4': No such file or directory

Index creation and search are done on the same machine.

Your Environment

self-compiled

cmake --version
cmake version 2.8.12.2

cmake -DHAVE_MPI=0 -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=. ..

c++ --version
c++ (GCC) 6.2.1 20160916 (Red Hat 6.2.1-3)

processor   : 63
vendor_id   : GenuineIntel
cpu family  : 6
model       : 46
model name  : Intel(R) Xeon(R) CPU           X7560  @ 2.27GHz
stepping    : 6
microcode   : 0xb
cpu MHz     : 1064.000
cache size  : 24576 KB
physical id : 3
siblings    : 16
core id     : 11
cpu cores   : 8
apicid      : 119
initial apicid  : 119
fpu     : yes
fpu_exception   : yes
cpuid level : 11
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 x2apic popcnt lahf_lm ida epb dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips    : 4527.83
clflush size    : 64
cache_alignment : 64
address sizes   : 44 bits physical, 48 bits virtual
power management:

free

              total        used        free      shared  buff/cache   available
Mem:      528377212     3193792   142947764        9564   382235656   523907652
Swap:             0           0           0

milot-mirdita commented 7 years ago

Hello, unfortunately I cannot reproduce the issue. However, I also don't have NR available right now.

Could you please compile MMseqs again in debug mode: cmake -DHAVE_MPI=0 -DCMAKE_BUILD_TYPE=Debug -DCMAKE_INSTALL_PREFIX=. .. (Also paste the cmake log please).

And run only the crashing prefilter with a debugger: gdb --args PATH/TO/DEBUG/bin/mmseqs prefilter mmseq-testDB /junk/databases/mmseqs/nr /root/tmp/blast/tmp/pref_4 --sub-mat blosum62.out -k 0 --k-score 2147483647 --alph-size 21 --max-seq-len 32000 --max-seqs 300 --offset-result 0 --split 0 --split-mode 2 -c 0 --comp-bias-corr 1 --diag-score 1 --mask 1 --min-ungapped-score 15 --spaced-kmer-mode 1 --threads 64 -v 3 -s 4

Then type run to start mmseqs and when it crashes type bt for a backtrace. Please paste the backtrace here.

Edit: I am downloading the NR and will try to reproduce the issue using that too.

abiadak commented 7 years ago

Hi, Thanks for your quick response. I've attached cmake.log and make.log

This is the output while running under gdb:

Use index  /junk/databases/mmseqs/nr.sk7
Index version: 774909490
KmerSize:     7
AlphabetSize: 21
Skip:         0
Split:        0
Type:         1
Spaced:       1
Query database: mmseq-testDB(size=2467)
Target database: /junk/databases/mmseqs/nr(size=121935717)
Use kmer size 7 and split 0 using split mode 0

Program received signal SIGFPE, Arithmetic exception.
0x000000000052c00c in Prefiltering::estimateMemoryConsumption (split=0, dbSize=121935717, resSize=44976760168, maxHitsPerQuery=300, alphabetSize=21, kmerSize=7, threads=64)
    at /root/MMseqs2/src/prefiltering/Prefiltering.cpp:910
910     size_t dbSizeSplit = (dbSize) / split;

And this is the backtrace:

#0  0x000000000052c00c in Prefiltering::estimateMemoryConsumption (split=0, dbSize=121935717, resSize=44976760168, maxHitsPerQuery=300, alphabetSize=21, kmerSize=7, threads=64)
    at /root/MMseqs2/src/prefiltering/Prefiltering.cpp:910
#1  0x000000000052785c in Prefiltering::Prefiltering (this=0x857d90, queryDB="mmseq-testDB", queryDBIndex="mmseq-testDB.index", targetDB="/junk/databases/mmseqs/nr", 
    targetDBIndex="/junk/databases/mmseqs/nr.index", outDB="/root/tmp/blast/tmp/pref_4", outDBIndex="/root/tmp/blast/tmp/pref_4.index", par=...) at /root/MMseqs2/src/prefiltering/Prefiltering.cpp:154
#2  0x000000000052679e in prefilter (argc=39, argv=0x7fffffffdff8, command=...) at /root/MMseqs2/src/prefiltering/Main.cpp:38
#3  0x00000000004bf4e0 in runCommand (p=..., argc=39, argv=0x7fffffffdff8) at /root/MMseqs2/src/mmseqs.cpp:330
#4  0x00000000004bf82c in main (argc=41, argv=0x7fffffffdfe8) at /root/MMseqs2/src/mmseqs.cpp:381

make.log.txt cmake.log.txt

milot-mirdita commented 7 years ago

Okay, there is the problem: it chooses (for whatever reason) --split 0. As a workaround, you can create an index with --split 1 and then also run the search with --split 1.

I will investigate why createindex is storing the wrong split value.
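
The backtrace points at an integer division by `split` in `Prefiltering::estimateMemoryConsumption`, which faults when the stored split value is 0. Below is a minimal Python sketch of that failure mode and of one possible guard. `db_size_per_split` is a hypothetical helper, not MMseqs2 code, and falling back to a single chunk when split is 0 is an assumption made only for illustration (the warning text in this thread describes 0 as the automatic setting).

```python
def db_size_per_split(db_size: int, split: int) -> int:
    """Return the number of target sequences per chunk.

    Hypothetical helper mirroring `dbSize / split` from the backtrace.
    """
    if split < 0:
        raise ValueError("split must be non-negative")
    # Assumption for illustration: treat an unresolved split of 0 as a
    # single chunk instead of dividing by zero.
    effective_split = split if split > 0 else 1
    return db_size // effective_split


db_size = 121935717  # target database size reported in the log

# The unguarded division fails, analogous to the SIGFPE in the C++ code.
try:
    db_size // 0
except ZeroDivisionError as err:
    print("unguarded division failed:", err)

print("guarded result:", db_size_per_split(db_size, 0))
```

The actual fix in MMseqs2 may look different; the sketch only shows why a split of 0 must be resolved before it is used as a divisor.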

milot-mirdita commented 7 years ago

I see what's wrong; the issue should already be fixed in my development branch. I hope I can merge it back into the stable branch soon.

abiadak commented 7 years ago

Thanks Milot, the --split 1 workaround worked when specified at both index creation time and search time. It complains about not having enough memory, but it finishes:

Initialising data structures...
Using 64 threads.

Use index  /junk/databases/mmseqs/nr.sk7
Index version: 774909490
KmerSize:     7
AlphabetSize: 21
Skip:         0
Split:        1
Type:         1
Spaced:       1
Query database: mmseq-testDB(size=2467)
Target database: /junk/databases/mmseqs/nr(size=121935717)
Use kmer size 7 and split 1 using split mode 0
Needed memory (699399167230 byte) of total memory (541058265088 byte)
WARNING: MMseqs processes needs more main memory than available.Increase the size of --split or set it to 0 to automatic optimize target database split.
WARNING: Split has to be computed by createindex if precomputed index is used.
Substitution matrices...
Time for init: 0 h 0 m 23s

The index file is around 300 GB, so the prefilter is asking for more than double that.

milot-mirdita commented 7 years ago

Then use --split 2 or --split 3; that will bring the memory requirements down to half or a third, at a modest loss of search speed.
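
The arithmetic behind this suggestion can be sketched as follows, assuming (as the reply implies) that the memory requirement scales roughly as 1/split. The byte counts are taken from the warning in the log; `min_split` is a hypothetical helper and not part of MMseqs2.

```python
def min_split(needed_bytes: int, total_bytes: int) -> int:
    """Smallest split for which needed_bytes / split fits in total_bytes.

    Assumes memory use is inversely proportional to the split value.
    """
    if needed_bytes <= 0 or total_bytes <= 0:
        raise ValueError("byte counts must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return max(1, (needed_bytes + total_bytes - 1) // total_bytes)


needed = 699_399_167_230  # "Needed memory" from the warning
total = 541_058_265_088   # "total memory" from the warning

split = min_split(needed, total)
print("minimum split:", split)
print("estimated GB per chunk:", round(needed / split / 1e9, 1))
```

Under that assumption the smallest workable value here is 2. The estimate ignores other memory consumers on the machine, so a larger split may still be the safer choice.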

abiadak commented 7 years ago

In this case, with 512 GB of RAM and NR at its current size (~69 GB of raw sequences), --split 2 with 32 threads gives the best performance. With fewer database chunks, the index file can no longer be kept in cache while 'mmseqs search' is running, which forces reads from the file system, increases sys CPU time, and degrades performance. Using 64 threads (32 real cores plus 32 from HT) seems to put too much pressure on the available memory bandwidth and hurts performance too (in my tests, the run time goes from 4' to 4'30'').

martin-steinegger commented 7 years ago

Yes, I also made this observation. The HT cores hurt the prefilter performance of mmseqs.
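
One way to pick a thread count that excludes hyper-threads is to count physical cores from `/proc/cpuinfo`, whose `physical id` and `core id` fields appear in the environment dump earlier in this thread. The following is a minimal sketch, assuming the x86 Linux layout in which `physical id` precedes `core id` within each processor entry. `count_physical_cores` is a hypothetical helper and the sample text is made up for illustration.

```python
def count_physical_cores(cpuinfo_text: str) -> int:
    """Count unique (physical id, core id) pairs in /proc/cpuinfo-style text."""
    cores = set()
    physical_id = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between processor entries
        key, value = key.strip(), value.strip()
        if key == "physical id":
            physical_id = value
        elif key == "core id":
            cores.add((physical_id, value))
    return len(cores)


# Made-up sample: one socket, two cores, two hyper-threads per core.
sample = """\
processor : 0
physical id : 0
core id : 0

processor : 1
physical id : 0
core id : 1

processor : 2
physical id : 0
core id : 0

processor : 3
physical id : 0
core id : 1
"""
print(count_physical_cores(sample))
```

On a Linux machine the same function could be fed the contents of `/proc/cpuinfo`, and the result used as the value for `--threads`.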

Zaphod-dev commented 7 years ago

I confirm I faced the same issue, which was solved by adding an explicit --split. Can you confirm that the strategy for --split is that the size of the .sk7 index divided by the split value should fit into RAM?

Many thanks, Pascal

PS: On a compute cluster, I'm finding that using a 240 GB precomputed sk7 index on a 40 Gb/s InfiniBand network drive hardly provides any speed-up compared to "on the fly indexing", because it takes so long to cache the index on the nodes (which are naturally all requesting the same huge file at the same time...). Sadly, our cluster admin policy is to not have any permanent local disk space on the nodes, as the InfiniBand network disks are supposedly so fast :( I'm going to have to try to convince the admin of the opposite! In the meantime I'm getting much better performance on single beefed-up workstations with SSD storage.

milot-mirdita commented 7 years ago

Hi Pascal, I forgot a bit about this issue. We recently improved the set-up time of the on-the-fly indices quite substantially and we also dropped support for split precomputed indices. If you don't have enough system memory you should just use the on-the-fly index.

The original issue should also be fixed with the merge I just pushed. So the workaround should not be needed anymore.

Best regards,

Milot