Closed abiadak closed 7 years ago
Hello, sadly I cannot reproduce the issue. However, I also don't have the NR available right now.
Could you please compile MMseqs again in debug mode:
cmake -DHAVE_MPI=0 -DCMAKE_BUILD_TYPE=Debug -DCMAKE_INSTALL_PREFIX=. ..
(Also paste the cmake log please).
And run only the crashing prefilter with a debugger:
gdb --args PATH/TO/DEBUG/bin/mmseqs prefilter mmseq-testDB /junk/databases/mmseqs/nr /root/tmp/blast/tmp/pref_4 --sub-mat blosum62.out -k 0 --k-score 2147483647 --alph-size 21 --max-seq-len 32000 --max-seqs 300 --offset-result 0 --split 0 --split-mode 2 -c 0 --comp-bias-corr 1 --diag-score 1 --mask 1 --min-ungapped-score 15 --spaced-kmer-mode 1 --threads 64 -v 3 -s 4
Then type run to start mmseqs and when it crashes type bt for a backtrace. Please paste the backtrace here.
Edit: I am downloading the NR and will try to reproduce the issue using that too.
Hi, Thanks for your quick response. I've attached cmake.log and make.log
This is the output while running under gdb:
Use index /junk/databases/mmseqs/nr.sk7
Index version: 774909490
KmerSize: 7
AlphabetSize: 21
Skip: 0
Split: 0
Type: 1
Spaced: 1
Query database: mmseq-testDB(size=2467)
Target database: /junk/databases/mmseqs/nr(size=121935717)
Use kmer size 7 and split 0 using split mode 0
Program received signal SIGFPE, Arithmetic exception.
0x000000000052c00c in Prefiltering::estimateMemoryConsumption (split=0, dbSize=121935717, resSize=44976760168, maxHitsPerQuery=300, alphabetSize=21, kmerSize=7, threads=64)
at /root/MMseqs2/src/prefiltering/Prefiltering.cpp:910
910 size_t dbSizeSplit = (dbSize) / split;
And this is the backtrace:
#0 0x000000000052c00c in Prefiltering::estimateMemoryConsumption (split=0, dbSize=121935717, resSize=44976760168, maxHitsPerQuery=300, alphabetSize=21, kmerSize=7, threads=64)
at /root/MMseqs2/src/prefiltering/Prefiltering.cpp:910
#1 0x000000000052785c in Prefiltering::Prefiltering (this=0x857d90, queryDB="mmseq-testDB", queryDBIndex="mmseq-testDB.index", targetDB="/junk/databases/mmseqs/nr",
targetDBIndex="/junk/databases/mmseqs/nr.index", outDB="/root/tmp/blast/tmp/pref_4", outDBIndex="/root/tmp/blast/tmp/pref_4.index", par=...) at /root/MMseqs2/src/prefiltering/Prefiltering.cpp:154
#2 0x000000000052679e in prefilter (argc=39, argv=0x7fffffffdff8, command=...) at /root/MMseqs2/src/prefiltering/Main.cpp:38
#3 0x00000000004bf4e0 in runCommand (p=..., argc=39, argv=0x7fffffffdff8) at /root/MMseqs2/src/mmseqs.cpp:330
#4 0x00000000004bf82c in main (argc=41, argv=0x7fffffffdfe8) at /root/MMseqs2/src/mmseqs.cpp:381
Okay, there is the problem: it chooses (for whatever reason) --split 0. As a workaround, you can create an index with --split 1 and then also run the search with --split 1.
I will investigate why createindex is storing the wrong split value.
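The crash at Prefiltering.cpp:910 in the backtrace above is an integer division by a split value of 0. Here is a minimal sketch of a guarded version of that division. The helper name sequencesPerSplit is hypothetical; this is neither MMseqs2 code nor the actual fix.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical helper, not MMseqs2 code: number of target sequences per chunk.
// Integer division by zero is undefined behaviour in C++; on x86 Linux it
// usually surfaces as SIGFPE, as seen in the gdb output.
std::size_t sequencesPerSplit(std::size_t dbSize, std::size_t split) {
    if (split == 0) {
        // Fail with a readable error instead of crashing in the division.
        throw std::invalid_argument("split must be at least 1");
    }
    return dbSize / split;
}
```

Called with dbSize=121935717 and split=0, as in the backtrace, this sketch throws an exception with a readable message instead of raising an arithmetic exception.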
I see what's wrong; the issue should already be fixed in my development branch. I hope I can merge it back into the stable branch soon.
Thanks Milot, the --split 1 workaround worked when specified at both index creation time and search time. It complains about not having enough memory, but it finishes:
Initialising data structures...
Using 64 threads.
Use index /junk/databases/mmseqs/nr.sk7
Index version: 774909490
KmerSize: 7
AlphabetSize: 21
Skip: 0
Split: 1
Type: 1
Spaced: 1
Query database: mmseq-testDB(size=2467)
Target database: /junk/databases/mmseqs/nr(size=121935717)
Use kmer size 7 and split 1 using split mode 0
Needed memory (699399167230 byte) of total memory (541058265088 byte)
WARNING: MMseqs processes needs more main memory than available.Increase the size of --split or set it to 0 to automatic optimize target database split.
WARNING: Split has to be computed by createindex if precomputed index is used.
Substitution matrices...
Time for init: 0 h 0 m 23s
The index file size is around 300GB, so it's asking more than double.
Then use --split 2 or --split 3; that will bring the memory requirements down to half or a third, at a modest loss of search speed.
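As a rough illustration of that arithmetic, here is a minimal sketch that picks the smallest split whose even share of the estimated memory fits into the available memory. It assumes memory use scales as 1/split, which is only an approximation, and the helper name minimalSplit is hypothetical, not part of MMseqs2.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical helper, not MMseqs2 code: smallest split count such that
// neededBytes / split <= availableBytes, assuming memory scales as 1/split.
std::uint64_t minimalSplit(std::uint64_t neededBytes, std::uint64_t availableBytes) {
    if (availableBytes == 0) {
        throw std::invalid_argument("availableBytes must be greater than 0");
    }
    // Ceiling division without risking overflow in the numerator.
    std::uint64_t split = neededBytes / availableBytes;
    if (neededBytes % availableBytes != 0) {
        ++split;
    }
    // A database is always processed in at least one chunk.
    return split == 0 ? 1 : split;
}
```

With the numbers from the log above (699399167230 bytes needed, 541058265088 bytes available), minimalSplit returns 2. This is a lower bound only: it leaves no headroom for the file system cache or other processes.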
In this case, with 512GB of RAM and NR at its current size (~69GB of raw sequences), --split 2 with 32 threads gives the best performance. With fewer database chunks, the index file no longer stays in the cache while 'mmseqs search' is running, which forces reads from the file system, increases sys CPU time, and degrades performance. Using 64 threads (32 real cores plus 32 from HT) seems to put too much pressure on the available memory bandwidth and hurts performance too (in my tests, it goes from 4' to 4'30'').
Yes, I also made this observation. The HT cores hurt the prefilter performance of mmseqs.
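One way to act on this observation is to derive the --threads value from the number of physical cores instead of logical CPUs. The sketch below assumes 2-way hyper-threading by default, and the helper name physicalCoreThreads is hypothetical; the real number of hardware threads per core should be checked on the machine in question.

```cpp
#include <algorithm>

// Hypothetical helper: thread count that leaves hyper-threaded siblings idle.
// smtWays is the number of hardware threads per physical core (assumed 2).
unsigned physicalCoreThreads(unsigned logicalCpus, unsigned smtWays = 2) {
    if (smtWays == 0) {
        smtWays = 1;  // treat an invalid value as "no SMT"
    }
    // Always request at least one thread, even if the CPU count is unknown (0).
    return std::max(1u, logicalCpus / smtWays);
}
```

For the 64 logical CPUs discussed above this gives 32, which matches the thread count reported as fastest in this thread.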
I confirm I faced the same issue, which was solved by adding an explicit --split. Can you confirm that the strategy for --split is that the size of the .sk7 index divided by the split value should fit into RAM?
Many thanks,
Pascal
PS: On a compute cluster, I'm finding that a 240GB precomputed sk7 index on a 40Gb/s Infiniband network drive hardly provides any speed-up compared to "on the fly indexing", because it takes so long to cache the index on the nodes (which are naturally all requesting the same huge file at the same time...). Sadly, our cluster admin policy is to not have any permanent local disk space on the nodes, as the Infiniband network disks are supposedly so fast :( I'm going to have to try to convince the admin of the opposite! In the meantime, I'm getting much better performance on single beefed-up workstations with SSD storage.
Hi Pascal, I forgot a bit about this issue. We recently improved the set-up time of the on-the-fly indices quite substantially and we also dropped support for split precomputed indices. If you don't have enough system memory you should just use the on-the-fly index.
The original issue should also be fixed with the merge I just pushed. So the workaround should not be needed anymore.
Best regards,
Milot
Expected Behavior
Obtaining similar sequences to the queries from the target database
Current Behavior
Floating point exception at the prefilter step
Steps to Reproduce (for bugs)
Please make sure to execute the reproduction steps with newly recreated and empty tmp folders. The target database is current nr (protein) from the NCBI (~120M sequences, 69GB). Index creation runs ok:
mmseqs createindex nr
ls -lrt
When launching the search:
mmseqs search mmseq-testDB /junk/databases/mmseqs/nr test-2-mmseqsDB tmp
The index creation and search are done on the same machine.
MMseqs Output (for bugs)
Please make sure to also post the complete output of MMseqs. You can use gist.github.com for large output.
Context
Providing context helps us come up with a solution and improve our documentation for the future.
Your Environment
Include as many relevant details about the environment you experienced the bug in.
self-compiled
cmake --version
cmake version 2.8.12.2
cmake -DHAVE_MPI=0 -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=. ..
c++ --version
c++ (GCC) 6.2.1 20160916 (Red Hat 6.2.1-3)
free