soedinglab / MMseqs2

MMseqs2: ultra fast and sensitive search and clustering suite
https://mmseqs.com
GNU General Public License v3.0

shebang line in blastp.sh #1

Closed kad-ecoli closed 8 years ago

kad-ecoli commented 8 years ago

When I run MMseqs2 to search sequences, I encounter an error in blastp.sh:

$ mmseqs createdb queryDB.fasta queryDB
$ mmseqs createdb targetDB.fasta targetDB
$ mmseqs createindex targetDB
$ mkdir -p tmp/
$ mmseqs search queryDB targetDB resultDB tmp --use-index

Program call: queryDB targetDB resultDB tmp --use-index

MMseqs Version: ef19bf40b8f6f5151d5ecbab42cfcb903facb907
Sub Matrix /home/zcx/Program/MMseqs/2.0/data/blosum62.out
Alignment mode 0
E-value threshold 0.001
Coverage threshold 0
Detect fragments false
Compositional bias 1
Seq. Id Threshold 0
Max. sequence length 32000
Max. results per query 300
Max Reject 2147483647
Include identical Seq. Id. false
Nucleotide false
Profile false
Add backtrace false
Realign hit false
Threads 32
Verbosity 3
Sensitivity 4
K-mer size 7
K-score 2147483647
Alphabet size 21
Split DB 0
Split mode 2
Search mode 2
Diagonal Scoring 1
Minimum Diagonal score 30
Spaced Kmer 1
Profile e-value threshold 0.001
Use global sequence weighting false
Maximum sequence identity threshold 0.9
Minimum seq. id. 0
Minimum score per column -20
Minimum coverage 0
Select n most diverse seqs 100
Pseudo count a 1
Pseudo count b 1.5
First sequence as respresentative false
Number search iterations 1
Start sensitivity 4
Sensitivity step size 1
Use index true
Sets the MPI runner

MMseqs Version: ef19bf40b8f6f5151d5ecbab42cfcb903facb907
Sub Matrix /home/zcx/Program/MMseqs/2.0/data/blosum62.out
Sensitivity 4
K-mer size 7
K-score 2147483647
Alphabet size 21
Max. sequence length 32000
Profile false
Nucleotide false
Max. results per query 300
Split DB 0
Split mode 2
Search mode 2
Compositional bias 1
Diagonal Scoring 1
Minimum Diagonal score 30
Include identical Seq. Id. false
Spaced Kmer 1
Threads 32
Verbosity 3

Initialising data structures... Using 32 threads.

Index version: 774909490
KmerSize: 7
AlphabetSize: 21
Skip: 0
Split: 1
Type: 1
Spaced: 1
Query database: queryDB(size=246)
Target database: targetDB.sk7(size=10000)
Needed memory (14434761936 byte) of total memory (270462795776 byte)
Substitution matrices...
Time for init: 0 h 0 m 3s

Process prefiltering step 0 of 1

Index version: 774909490 KmerSize: 7 AlphabetSize: 21 Skip: 0 Split: 1 Type: 1 Spaced: 1 Copy 1650981 Entries (9905886 byte) Setup Sizes
Read IndexTable ... Done k-mer similarity threshold: 115 k-mer match probability: 0

Starting prefiltering scores calculation (step 0 of 1) Query db start 0 to 246 Target db start 0 to 10000

736 k-mers per position. 448 DB matches per sequence. 553 Double diagonal matches per sequence. 0 Overflows . 25 sequences passed prefiltering per query sequence. Median result list size: 21 5 sequences with 0 size result lists.

Time for prefiltering scores calculation: 0 h 2 m 18s Time for merging files: 0 h 0 m 0 s

Overall time for prefiltering run: 0 h 2 m 32s

MMseqs Version: ef19bf40b8f6f5151d5ecbab42cfcb903facb907
Sub Matrix /home/zcx/Program/MMseqs/2.0/data/blosum62.out
Alignment mode 0
E-value threshold 0.001
Coverage threshold 0
Detect fragments false
Compositional bias 1
Seq. Id Threshold 0
Max. sequence length 32000
Max. results per query 300
Max Reject 2147483647
Include identical Seq. Id. false
Nucleotide false
Profile false
Add backtrace false
Realign hit false
Threads 32
Verbosity 3

Init data structures... Compute score only. Using 32 threads. Calculation of Smith-Waterman alignments. Time for merging files: 0 h 0 m 0 s

All sequences processed.

6287 alignments calculated. 6203 sequence pairs passed the thresholds (0.986639 of overall calculated). 25.2154 hits per query sequence. Time for alignments calculation: 0 h 0 m 1s

I am running MMseqs2 on Ubuntu 14.04 (trusty) x86-64. On Ubuntu and Debian, the default shell /bin/sh is dash, not bash, and dash does not support the "let" builtin. I recommend changing the first line of "blastp.sh" from "#!/bin/sh -ex" to "#!/bin/bash -ex".
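As a rough illustration of the portability issue, here is a minimal sketch that assumes the script uses "let" to increment a counter. The variable name STEP is hypothetical and not taken from blastp.sh. POSIX arithmetic expansion runs under both dash and bash, so it may be an alternative to switching the shebang to bash:

```shell
#!/bin/sh
# Hypothetical counter; the real variable names in blastp.sh may differ.
STEP=0

# bash-only form, not available in dash:
#   let STEP=STEP+1

# POSIX arithmetic expansion, which dash and bash both understand.
STEP=$((STEP + 1))
echo "step: $STEP"
```

Either approach should avoid the failure; changing the shebang is the smaller edit, while the POSIX form keeps the script independent of which shell /bin/sh points to.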

kad-ecoli commented 8 years ago

After I fixed the above-mentioned bug, my output looks like this:

$ mmseqs search queryDB targetDB resultDB tmp --use-index

Program call: queryDB targetDB resultDB tmp --use-index

MMseqs Version: ef19bf40b8f6f5151d5ecbab42cfcb903facb907
Sub Matrix /home/zcx/Program/MMseqs/2.0/data/blosum62.out
Alignment mode 0
E-value threshold 0.001
Coverage threshold 0
Detect fragments false
Compositional bias 1
Seq. Id Threshold 0
Max. sequence length 32000
Max. results per query 300
Max Reject 2147483647
Include identical Seq. Id. false
Nucleotide false
Profile false
Add backtrace false
Realign hit false
Threads 32
Verbosity 3
Sensitivity 4
K-mer size 7
K-score 2147483647
Alphabet size 21
Split DB 0
Split mode 2
Search mode 2
Diagonal Scoring 1
Minimum Diagonal score 30
Spaced Kmer 1
Profile e-value threshold 0.001
Use global sequence weighting false
Maximum sequence identity threshold 0.9
Minimum seq. id. 0
Minimum score per column -20
Minimum coverage 0
Select n most diverse seqs 100
Pseudo count a 1
Pseudo count b 1.5
First sequence as respresentative false
Number search iterations 1
Start sensitivity 4
Sensitivity step size 1
Use index true
Sets the MPI runner

MMseqs Version: ef19bf40b8f6f5151d5ecbab42cfcb903facb907
Sub Matrix /home/zcx/Program/MMseqs/2.0/data/blosum62.out
Sensitivity 4
K-mer size 7
K-score 2147483647
Alphabet size 21
Max. sequence length 32000
Profile false
Nucleotide false
Max. results per query 300
Split DB 0
Split mode 2
Search mode 2
Compositional bias 1
Diagonal Scoring 1
Minimum Diagonal score 30
Include identical Seq. Id. false
Spaced Kmer 1
Threads 32
Verbosity 3

Initialising data structures... Using 32 threads.

Index version: 774909490
KmerSize: 7
AlphabetSize: 21
Skip: 0
Split: 1
Type: 1
Spaced: 1
Query database: queryDB(size=246)
Target database: targetDB.sk7(size=10000)
Needed memory (14434761936 byte) of total memory (270462795776 byte)
Substitution matrices...
Time for init: 0 h 0 m 3s

Process prefiltering step 0 of 1

Index version: 774909490 KmerSize: 7 AlphabetSize: 21 Skip: 0 Split: 1 Type: 1 Spaced: 1 Copy 1650981 Entries (9905886 byte) Setup Sizes
Read IndexTable ... Done k-mer similarity threshold: 115 k-mer match probability: 0

Starting prefiltering scores calculation (step 0 of 1) Query db start 0 to 246 Target db start 0 to 10000

736 k-mers per position. 448 DB matches per sequence. 553 Double diagonal matches per sequence. 0 Overflows . 25 sequences passed prefiltering per query sequence. Median result list size: 21 5 sequences with 0 size result lists.

Time for prefiltering scores calculation: 0 h 2 m 8s Time for merging files: 0 h 0 m 0 s

Overall time for prefiltering run: 0 h 2 m 22s

MMseqs Version: ef19bf40b8f6f5151d5ecbab42cfcb903facb907
Sub Matrix /home/zcx/Program/MMseqs/2.0/data/blosum62.out
Alignment mode 0
E-value threshold 0.001
Coverage threshold 0
Detect fragments false
Compositional bias 1
Seq. Id Threshold 0
Max. sequence length 32000
Max. results per query 300
Max Reject 2147483647
Include identical Seq. Id. false
Nucleotide false
Profile false
Add backtrace false
Realign hit false
Threads 32
Verbosity 3

Init data structures... Compute score only. Using 32 threads. Calculation of Smith-Waterman alignments. Time for merging files: 0 h 0 m 0 s

All sequences processed.

6287 alignments calculated. 6203 sequence pairs passed the thresholds (0.986639 of overall calculated). 25.2154 hits per query sequence. Time for alignments calculation: 0 h 0 m 1s

Why was there a 'Could not move result to resultDB' error message even though the "resultDB" and "resultDB.index" files were correctly generated?

martin-steinegger commented 8 years ago

Thank you for analyzing the problem. This helps me a lot. I changed the shebang line to bash.

Your run worked just fine. The output is "debug" output, which occurs because of the -x parameter in the shebang line "#!/bin/bash -ex".
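For readers unfamiliar with these flags, here is a small sketch of what -x and -e do in general. It assumes a POSIX shell is available as "sh" and uses toy commands rather than anything from blastp.sh:

```shell
#!/bin/sh
# -x: the shell echoes each command to stderr (prefixed with "+ ") before running it.
# Capture only stderr so the trace can be inspected on its own.
TRACE=$(sh -x -c 'echo hello' 2>&1 >/dev/null)
echo "trace output: $TRACE"

# -e: the shell exits at the first command that fails.
if sh -e -c 'false; echo "not reached"'; then
    STATUS=0
else
    STATUS=$?
fi
echo "exit status with -e: $STATUS"
```

Because the trace goes to stderr alongside real error messages, traced command text can look like an error even when the run succeeded.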

I removed this flag and updated the tar file. You can download it here http://github.com/soedinglab/mmseqs2/raw/master/mmseqs-static.tar.gz

martin-steinegger commented 8 years ago

No feedback. I assume it's fixed now.