soedinglab / MMseqs2

MMseqs2: ultra fast and sensitive search and clustering suite
https://mmseqs.com
GNU General Public License v3.0

tempfile issue #11

Closed alexander-nord closed 7 years ago

alexander-nord commented 7 years ago

I've been trying to perform some basic tests with MMseqs2 and have encountered an issue where I repeatedly get the following error message:

 Init data structures...
 Compute score only.
 Could not open data file [path_to_dir]/mmseqs_tmp/pref_4!

The "[path_to_dir]/mmseqs_tmp/" directory contains two temporary files (pref_4.index_tmp_0.0 and pref_4_tmp_0.0) along with a blastp.sh script.

I'm not using any advanced options for my search, and both input databases are (as far as I can see) formatted correctly. Maybe I'm overlooking something embarrassingly simple?

Thanks!

milot-mirdita commented 7 years ago

Could you post the whole log?

It looks like the prefilter step failed for some reason and the next step is now complaining about that.

alexander-nord commented 7 years ago

Here's the full output (filepaths slightly redacted, for readability):

$ mmseqs search Candidates_db trans_small_db Trials/mmseqs/Candidates_to_Small mmseqs_tmp/ -v 3
Program call: Candidates_db trans_small_db Trials/mmseqs/Candidates_to_Small mmseqs_tmp/ -v 3

MMseqs Version: 8bd3de3e35c91c0723517a964efc3223682c3bb5
Sub Matrix blosum62.out
Add backtrace false
Alignment mode 0
E-value threshold 0.001
Seq. Id Threshold 0
Coverage threshold 0
Max. sequence length 32000
Max. results per query 300
Compositional bias 1
Profile false
Realign hit false
Max Reject 2147483647
Max Accept 2147483647
Include identical Seq. Id. false
Threads 1
Verbosity 3
Sensitivity 4
K-mer size 0
K-score 2147483647
Alphabet size 21
Offset result 0
Split DB 0
Split mode 2
Diagonal Scoring 1
Minimum Diagonal score 15
Spaced Kmer 1
Profile e-value threshold 0.001
Use global sequence weighting false
Maximum sequence identity threshold 0.9
Minimum seq. id. 0
Minimum score per column -20
Minimum coverage 0
Select n most diverse seqs 100
Pseudo count a 1
Pseudo count b 1.5
Number search iterations 1
Start sensitivity 4
sensitivity step size 1
Sets the MPI runner

[pwd] [pwd]
Program call: Candidates_db trans_small_db [pwd]/mmseqs_tmp/pref_4 --sub-mat blosum62.out -k 0 --k-score 2147483647 --alph-size 21 --max-seq-len 32000 --max-seqs 300 --offset-result 0 --split 0 --split-mode 2 -c 0 --comp-bias-corr 1 --diag-score 1 --min-ungapped-score 15 --spaced-kmer-mode 1 --threads 1 -v 3 -s 4

MMseqs Version: 8bd3de3e35c91c0723517a964efc3223682c3bb5
Sub Matrix blosum62.out
Sensitivity 4
K-mer size 0
K-score 2147483647
Alphabet size 21
Max. sequence length 32000
Profile false
Max. results per query 300
Offset result 0
Split DB 0
Split mode 2
Coverage threshold 0
Compositional bias 1
Diagonal Scoring 1
Minimum Diagonal score 15
Include identical Seq. Id. false
Spaced Kmer 1
Threads 1
Verbosity 3

Initialising data structures...

Cound not find precomputed index. Compute index.
Query database: Candidates_db(size=6144)
Target database: trans_small_db(size=137914)
Use kmer size 6 and split 1 using split mode 0
Needed memory (1556507147 byte) of total memory (4294967296 byte)
Substitution matrices...
Time for init: 0 h 0 m 7s

Process prefiltering step 0 of 1

[pwd]/mmseqs_tmp/pref_4_tmp_0.0: File exists
Program call: Candidates_db trans_small_db [pwd]/mmseqs_tmp/pref_4 [pwd]/mmseqs_tmp/aln_4 --sub-mat blosum62.out --alignment-mode 0 -e 0.001 --min-seq-id 0 -c 0 --max-seq-len 32000 --max-seqs 300 --comp-bias-corr 1 --max-rejected 2147483647 --max-accept 2147483647 --threads 1 -v 3

MMseqs Version: 8bd3de3e35c91c0723517a964efc3223682c3bb5
Sub Matrix blosum62.out
Add backtrace false
Alignment mode 0
E-value threshold 0.001
Seq. Id Threshold 0
Coverage threshold 0
Max. sequence length 32000
Max. results per query 300
Compositional bias 1
Profile false
Realign hit false
Max Reject 2147483647
Max Accept 2147483647
Include identical Seq. Id. false
Threads 1
Verbosity 3

Init data structures...
Compute score only.
Could not open data file [pwd]/mmseqs_tmp/pref_4!
mv: rename [pwd]/mmseqs_tmp/aln_4 to Trials/mmseqs/Candidates_to_Small: No such file or directory

milot-mirdita commented 7 years ago

The prefilter results from the previous run are obscuring the real error. Could you empty the tmp folder and start again?

The relevant line in the log is: "[pwd]/mmseqs_tmp/pref_4_tmp_0.0: File exists"
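
For anyone hitting the same thing, here is a minimal sketch of emptying the tmp folder before re-running. The `reset_tmp_dir` helper is hypothetical and not part of MMseqs2; it only assumes a POSIX shell, and the directory and database names in the usage comment are the ones from the log above.

```shell
#!/bin/sh
# Hypothetical helper (not part of MMseqs2): remove and recreate a
# temporary directory so that no files from an earlier run remain.
reset_tmp_dir() {
    dir="$1"
    # Refuse an empty path or the filesystem root to avoid deleting
    # something unintended.
    case "$dir" in
        ""|"/") echo "refusing to reset '$dir'" >&2; return 1 ;;
    esac
    rm -rf -- "$dir" && mkdir -p -- "$dir"
}

# Usage with the paths from this thread (left commented out so the
# snippet can be sourced without starting a search):
# reset_tmp_dir mmseqs_tmp && \
#   mmseqs search Candidates_db trans_small_db Trials/mmseqs/Candidates_to_Small mmseqs_tmp/ -v 3
```

Removing and recreating the directory, rather than deleting individual files, also clears partial outputs such as pref_4_tmp_0.0 without having to know their names in advance.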

alexander-nord commented 7 years ago

Ah! I had assumed that the new run would overwrite the existing files. I just removed and remade the temporary directory and things are running smoothly.

Thanks for the help!

milot-mirdita commented 7 years ago

No problem. I know that the errors are not very clear. We'll try to improve that. :)