steineggerlab / foldseek

Foldseek enables fast and sensitive comparisons of large structure sets.
https://foldseek.com
GNU General Public License v3.0

convertalis issue on `--cluster-search 1` #179

Closed YoshitakaMo closed 9 months ago

YoshitakaMo commented 10 months ago

Current Behavior

After downloading afdb50 to my computer, I ran a cluster search for a PDB file, job.pdb, with the flags --alignment-type 2 --max-seqs 1000 -e 10 -s 9.5 --prefilter-mode 1 --cluster-search 1 and got an error at the final foldseek convertalis step. The stdout log is:

easy-search job.pdb /mnt/foldseek/afdb50 result.html tmp --alignment-type 2 --max-seqs 1000 -e 10 -s 9.5 --prefilter-mode 1 --cluster-search 1 --tmscore-threshold 0.3 --format-mode 3 

MMseqs Version:                 96be67cfedf1491b3280c169714eabf207dbf796
Seq. id. threshold              0
Coverage threshold              0
Coverage mode                   0
Max reject                      2147483647
Max accept                      2147483647
... ...

Time for processing: 0h 0m 0s 35ms
Removing temporary files
rmdb tmp/11281840674382596303/search_tmp/4215546465989568500/strualn_expanded -v 3 

Time for processing: 0h 0m 0s 30ms
rmdb tmp/11281840674382596303/search_tmp/4215546465989568500/pref -v 3 

Time for processing: 0h 0m 0s 4ms
convertalis tmp/11281840674382596303/query /mnt/foldseek/afdb50_seq tmp/11281840674382596303/result result.html --sub-mat 'aa:3di.out,nucl:3di.out' --format-mode 3 --format-output query,target,fident,alnlen,mismatch,gapopen,qstart,qend,tstart,tend,evalue,bits --translation-table 1 --gap-open aa:10,nucl:10 --gap-extend aa:1,nucl:1 --db-output 0 --db-load-mode 0 --search-type 0 --threads 32 --compressed 0 -v 3 

Error: Convert Alignments died

The error occurs only when I add --cluster-search 1. Without that option, the search finishes without the error:

easy-search job.pdb /mnt/foldseek/afdb50 result2.html tmp --alignment-type 2 --max-seqs 1000 -e 10 -s 9.5 --prefilter-mode 1 --tmscore-threshold 0.3 --format-mode 3 

MMseqs Version:                 96be67cfedf1491b3280c169714eabf207dbf796
Seq. id. threshold              0

...

Time for processing: 0h 0m 0s 86ms
Removing temporary files
rmdb tmp/8649837317013178876/search_tmp/7511585914143165062/pref -v 3 

Time for processing: 0h 0m 0s 3ms
convertalis tmp/8649837317013178876/query /mnt/foldseek/afdb50 tmp/8649837317013178876/result result2.html --sub-mat 'aa:3di.out,nucl:3di.out' --format-mode 3 --format-output query,target,fident,alnlen,mismatch,gapopen,qstart,qend,tstart,tend,evalue,bits --translation-table 1 --gap-open aa:10,nucl:10 --gap-extend aa:1,nucl:1 --db-output 0 --db-load-mode 0 --search-type 0 --threads 32 --compressed 0 -v 3 

[=================================================================] 1 0s 0ms
Time for merging to result2.html: 0h 0m 0s 3ms
Time for processing: 0h 0m 8s 471ms
rmdb tmp/8649837317013178876/result -v 3 

Time for processing: 0h 0m 0s 13ms
rmdb tmp/8649837317013178876/query -v 3 

Time for processing: 0h 0m 0s 3ms
rmdb tmp/8649837317013178876/query_h -v 3 

Time for processing: 0h 0m 0s 2ms
rmdb tmp/8649837317013178876/query_ca -v 3 

Time for processing: 0h 0m 0s 2ms
rmdb tmp/8649837317013178876/query_ss -v 3 

Time for processing: 0h 0m 0s 3ms

I guess the cause is that, with this option, foldseek convertalis is called with a _seq suffix appended to the database name afdb50. Indeed, I could obtain the result file by rerunning the command without the suffix after the error:

- foldseek convertalis tmp/11281840674382596303/query /mnt/foldseek/afdb50_seq tmp/11281840674382596303/result result.html --sub-mat 'aa:3di.out,nucl:3di.out' --format-mode 3 --format-output 
+ foldseek convertalis tmp/11281840674382596303/query /mnt/foldseek/afdb50 tmp/11281840674382596303/result result.html --sub-mat 'aa:3di.out,nucl:3di.out' --format-mode 3 --format-output 
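The manual rerun above can be scripted. The following is a minimal Python sketch, not part of foldseek: it takes the failing `convertalis` line as printed in the log, removes the `_seq` suffix from the target database (the second positional argument), and prints a command to run by hand. The helper names `strip_seq_suffix` and `rewrite_convertalis` are hypothetical, the logged command is abbreviated here, and the sketch assumes the temporary query and result databases still exist after the failure, as they did in this case.

```python
import shlex
from typing import List


def strip_seq_suffix(db_path: str) -> str:
    """Return db_path without a trailing '_seq', if present."""
    suffix = "_seq"
    return db_path[: -len(suffix)] if db_path.endswith(suffix) else db_path


def rewrite_convertalis(logged_cmd: str) -> List[str]:
    """Build an argv for 'foldseek convertalis' from a logged 'convertalis ...' line,
    replacing the target database with its name minus the '_seq' suffix."""
    args = shlex.split(logged_cmd)
    if len(args) < 5 or args[0] != "convertalis":
        raise ValueError("expected a full line starting with 'convertalis'")
    # Positional layout seen in the log: query DB, target DB, result DB, output file.
    args[2] = strip_seq_suffix(args[2])
    return ["foldseek"] + args


# Abbreviated copy of the failing line from the log above.
logged = (
    "convertalis tmp/11281840674382596303/query /mnt/foldseek/afdb50_seq "
    "tmp/11281840674382596303/result result.html "
    "--sub-mat 'aa:3di.out,nucl:3di.out' --format-mode 3"
)
argv = rewrite_convertalis(logged)
print(shlex.join(argv))  # shell-quoted command, ready to paste
```

Printing the command instead of executing it keeps the sketch safe to run anywhere; the remaining flags from the log can be appended unchanged.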

I hope this issue will be fixed.

Expected Behavior

foldseek easy-search ${input} afdb50 result_file tmp --prefilter-mode 1 --cluster-search 1 should complete without the error.

Your Environment

martin-steinegger commented 10 months ago

I cannot reproduce this issue. I tried the same parameters with afdb50, but it did not crash. Do you have the full error message?

YoshitakaMo commented 10 months ago

Thank you, Martin. Here is the error message:

easy-search job.pdb /mnt/foldseek/afdb50 result3.html tmp --alignment-type 2 --max-seqs 1000 -e 10 -s 9.5 --prefilter-mode 1 --cluster-search 1 --tmscore-threshold 0.3 --format-mode 3 --format-output query,target,fident,alnlen,mismatch,gapopen,qstart,qend,tstart,tend,evalue,bits,taxid,taxname,taxlineage 

MMseqs Version:                 96be67cfedf1491b3280c169714eabf207dbf796
Seq. id. threshold              0
Coverage threshold              0
Coverage mode                   0
Max reject                      2147483647
Max accept                      2147483647
Add backtrace                   false
TMscore threshold               0.3
TMalign hit order               0
TMalign fast                    1
Preload mode                    0
Threads                         32
Verbosity                       3
LDDT threshold                  0
Sort by structure bit score     1
Alignment type                  2
Substitution matrix             aa:3di.out,nucl:3di.out
Alignment mode                  3
Alignment mode                  0
E-value threshold               10
Min alignment length            0
Seq. id. mode                   0
Alternative alignments          0
Max sequence length             65535
Compositional bias              1
Compositional bias              1
Gap open cost                   aa:10,nucl:10
Gap extension cost              aa:1,nucl:1
Compressed                      0
Seed substitution matrix        aa:3di.out,nucl:3di.out
Sensitivity                     9.5
k-mer length                    6
Target search mode              0
k-score                         seq:2147483647,prof:2147483647
Max results per query           1000
Split database                  0
Split mode                      2
Split memory limit              0
Diagonal scoring                true
Exact k-mer matching            0
Mask residues                   0
Mask residues probability       0.99995
Mask lower case residues        1
Minimum diagonal score          30
Selected taxa                   
Spaced k-mers                   1
Spaced k-mer pattern            
Local temporary path            
Exhaustive search mode          false
Prefilter mode                  1
Search iterations               1
Remove temporary files          true
MPI runner                      
Force restart with latest tmp   false
Cluster search                  1
Chain name mode                 0
Write mapping file              0
Mask b-factor threshold         0
Coord store mode                2
Write lookup file               1
Tar Inclusion Regex             .*
Tar Exclusion Regex             ^$
File Inclusion Regex            .*
File Exclusion Regex            ^$
Alignment format                3
Format alignment output         query,target,fident,alnlen,mismatch,gapopen,qstart,qend,tstart,tend,evalue,bits,taxid,taxname,taxlineage
Database output                 false
Greedy best hits                false

Alignment backtraces will be computed, since they were requested by output format.
createdb job.pdb tmp/5153076403984737512/query --chain-name-mode 0 --write-mapping 0 --mask-bfactor-threshold 0 --coord-store-mode 2 --write-lookup 1 --tar-include '.*' --tar-exclude '^$' --file-include '.*' --file-exclude '^$' --threads 32 -v 3 

Output file: tmp/5153076403984737512/query
[=================================================================] 1 0s 19ms
Time for merging to query_ss: 0h 0m 0s 167ms
Time for merging to query_h: 0h 0m 0s 106ms
Time for merging to query_ca: 0h 0m 0s 99ms
Time for merging to query: 0h 0m 0s 100ms
Ignore 0 out of 1.
Too short: 0, incorrect: 0, not proteins: 0.
Time for processing: 0h 0m 0s 916ms
Create directory tmp/5153076403984737512/search_tmp
search tmp/5153076403984737512/query /mnt/foldseek/afdb50 tmp/5153076403984737512/result tmp/5153076403984737512/search_tmp -a 1 --tmscore-threshold 0.3 --alignment-type 2 --alignment-mode 3 -e 10 --comp-bias-corr 1 --gap-open aa:10,nucl:10 --gap-extend aa:1,nucl:1 -s 9.5 -k 6 --max-seqs 1000 --mask 0 --mask-prob 0.99995 --prefilter-mode 1 --remove-tmp-files 1 --cluster-search 1 

ungappedprefilter tmp/5153076403984737512/query_ss /mnt/foldseek/afdb50_ss tmp/5153076403984737512/search_tmp/11009748205448795967/pref --sub-mat 'aa:3di.out,nucl:3di.out' -c 0 -e 1.79769e+308 --cov-mode 0 --comp-bias-corr 1 --comp-bias-corr-scale 0.15 --min-ungapped-score 30 --max-seqs 1000 --db-load-mode 0 --threads 32 --compressed 0 -v 3 

[=================================================================] 1 5s 131ms
Time for merging to pref: 0h 0m 0s 2ms
Time for processing: 0h 0m 9s 740ms
structurealign tmp/5153076403984737512/query /mnt/foldseek/afdb50 tmp/5153076403984737512/search_tmp/11009748205448795967/pref tmp/5153076403984737512/search_tmp/11009748205448795967/strualn --tmscore-threshold 0.3 --lddt-threshold 0 --sort-by-structure-bits 1 --alignment-type 2 --sub-mat 'aa:3di.out,nucl:3di.out' -a 1 --alignment-mode 3 --alignment-output-mode 0 --wrapped-scoring 0 -e 10 --min-seq-id 0 --min-aln-len 0 --seq-id-mode 0 --alt-ali 0 -c 0 --cov-mode 0 --max-seq-len 65535 --comp-bias-corr 1 --comp-bias-corr-scale 0.5 --max-rejected 2147483647 --max-accept 2147483647 --add-self-matches 0 --db-load-mode 0 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --score-bias 0 --realign 0 --realign-score-bias -0.2 --realign-max-seqs 2147483647 --corr-score-weight 0 --gap-open aa:10,nucl:10 --gap-extend aa:1,nucl:1 --zdrop 40 --threads 32 --compressed 0 -v 3 

[=================================================================] 1 3s 567ms
Time for merging to strualn: 0h 0m 0s 229ms
Time for processing: 0h 0m 27s 1ms
mergeresultsbyset tmp/5153076403984737512/search_tmp/11009748205448795967/strualn /mnt/foldseek/afdb50 tmp/5153076403984737512/search_tmp/11009748205448795967/strualn_expanded --threads 32 --compressed 0 -v 3 

Time for merging to strualn_expanded: 0h 0m 0s 92ms
Time for processing: 0h 0m 2s 347ms
setextendeddbtype tmp/5153076403984737512/search_tmp/11009748205448795967/strualn_expanded --extended-dbtype 2 

Time for processing: 0h 0m 0s 3ms
structurealign tmp/5153076403984737512/query /mnt/foldseek/afdb50 tmp/5153076403984737512/search_tmp/11009748205448795967/strualn_expanded tmp/5153076403984737512/search_tmp/11009748205448795967/aln --tmscore-threshold 0.3 --lddt-threshold 0 --sort-by-structure-bits 1 --alignment-type 2 --sub-mat 'aa:3di.out,nucl:3di.out' -a 1 --alignment-mode 3 --alignment-output-mode 0 --wrapped-scoring 0 -e 10 --min-seq-id 0 --min-aln-len 0 --seq-id-mode 0 --alt-ali 0 -c 0 --cov-mode 0 --max-seq-len 65535 --comp-bias-corr 1 --comp-bias-corr-scale 0.5 --max-rejected 2147483647 --max-accept 2147483647 --add-self-matches 0 --db-load-mode 0 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --score-bias 0 --realign 0 --realign-score-bias -0.2 --realign-max-seqs 2147483647 --corr-score-weight 0 --gap-open aa:10,nucl:10 --gap-extend aa:1,nucl:1 --zdrop 40 --threads 32 --compressed 0 -v 3 

[=================================================================] 1 1s 688ms
Time for merging to aln: 0h 0m 0s 514ms
Time for processing: 0h 1m 41s 176ms
mvdb tmp/5153076403984737512/search_tmp/11009748205448795967/aln tmp/5153076403984737512/result -v 3 

Time for processing: 0h 0m 0s 284ms
Removing temporary files
rmdb tmp/5153076403984737512/search_tmp/11009748205448795967/strualn_expanded -v 3 

Time for processing: 0h 0m 0s 99ms
rmdb tmp/5153076403984737512/search_tmp/11009748205448795967/pref -v 3 

Time for processing: 0h 0m 0s 32ms
convertalis tmp/5153076403984737512/query /mnt/foldseek/afdb50_seq tmp/5153076403984737512/result result3.html --sub-mat 'aa:3di.out,nucl:3di.out' --format-mode 3 --format-output query,target,fident,alnlen,mismatch,gapopen,qstart,qend,tstart,tend,evalue,bits,taxid,taxname,taxlineage --translation-table 1 --gap-open aa:10,nucl:10 --gap-extend aa:1,nucl:1 --db-output 0 --db-load-mode 0 --search-type 0 --threads 32 --compressed 0 -v 3 

Error: Convert Alignments died

The /mnt/foldseek directory has these files (only related to afdb50):

-rw-r--r-- 1 root root  14187874580 Aug 18 14:47 afdb50
-rw-r--r-- 1 root root   1315395424 Aug 18 14:48 afdb50.index
-rw-r--r-- 1 root root            4 Aug 18 14:48 afdb50.dbtype
-rw-r--r-- 1 root root  55242286017 Aug 18 14:49 afdb50_seq.1
-rw-r--r-- 1 root root            4 Aug 18 14:51 afdb50_seq.dbtype
-rw-r--r-- 1 root root   5410026102 Aug 18 14:51 afdb50_seq.index
lrwxrwxrwx 1 root root            6 Aug 18 14:51 afdb50_seq.0 -> afdb50
-rw-r--r-- 1 root root   2956810604 Aug 18 14:51 afdb50_h
-rw-r--r-- 1 root root   1240528477 Aug 18 14:52 afdb50_h.index
-rw-r--r-- 1 root root            4 Aug 18 14:52 afdb50_h.dbtype
-rw-r--r-- 1 root root   9292316169 Aug 18 14:52 afdb50_seq_h.1
-rw-r--r-- 1 root root            4 Aug 18 14:53 afdb50_seq_h.dbtype
-rw-r--r-- 1 root root   5061775645 Aug 18 14:54 afdb50_seq_h.index
lrwxrwxrwx 1 root root            8 Aug 18 14:54 afdb50_seq_h.0 -> afdb50_h
-rw-r--r-- 1 root root  14187874580 Aug 18 14:56 afdb50_ss
-rw-r--r-- 1 root root   1315386104 Aug 18 14:57 afdb50_ss.index
-rw-r--r-- 1 root root            4 Aug 18 14:57 afdb50_ss.dbtype
-rw-r--r-- 1 root root  55242286017 Aug 18 14:58 afdb50_seq_ss.1
-rw-r--r-- 1 root root            4 Aug 18 14:59 afdb50_seq_ss.dbtype
-rw-r--r-- 1 root root   5410016782 Aug 18 15:00 afdb50_seq_ss.index
lrwxrwxrwx 1 root root            9 Aug 18 15:00 afdb50_seq_ss.0 -> afdb50_ss
-rw-r--r-- 1 root root  84912584040 Aug 18 15:14 afdb50_ca
-rw-r--r-- 1 root root   1391679690 Aug 18 15:14 afdb50_ca.index
-rw-r--r-- 1 root root            4 Aug 18 15:14 afdb50_ca.dbtype
-rw-r--r-- 1 root root 330809644226 Aug 18 15:22 afdb50_seq_ca.1
-rw-r--r-- 1 root root            4 Aug 18 15:24 afdb50_seq_ca.dbtype
-rw-r--r-- 1 root root   5775608431 Aug 18 15:24 afdb50_seq_ca.index
lrwxrwxrwx 1 root root            9 Aug 18 15:24 afdb50_seq_ca.0 -> afdb50_ca
-rw-r--r-- 1 root root   2089393040 Aug 18 15:24 afdb50_clu
-rw-r--r-- 1 root root   1234143489 Aug 18 15:24 afdb50_clu.index
-rw-r--r-- 1 root root            4 Aug 18 15:24 afdb50_clu.dbtype
-rw-r--r-- 1 root root   7943602272 Aug 18 15:25 afdb50.lookup
-rw-r--r-- 1 root root   1717470637 Aug 18 15:25 afdb50_mapping
-rw-r--r-- 1 root root    683101917 Aug 18 15:25 afdb50_taxonomy
lrwxrwxrwx 1 root root           15 Aug 18 15:25 afdb50_seq_taxonomy -> afdb50_taxonomy
lrwxrwxrwx 1 root root           14 Aug 18 15:25 afdb50_seq_mapping -> afdb50_mapping
lrwxrwxrwx 1 root root           13 Aug 18 15:25 afdb50_seq.lookup -> afdb50.lookup
-rw-r--r-- 1 root root            3 Sep  6 13:49 afdb50.version
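To see at a glance how the `_seq` entries in a listing like this relate to the base database, the symlinks can be pulled out of the `ls -l` text. This is a small Python sketch under the assumption that file names contain no spaces; `symlink_map` is a hypothetical helper, and the sample lines are copied from the listing above.

```python
from typing import Dict


def symlink_map(ls_output: str) -> Dict[str, str]:
    """Map symlink name -> target for every symlink line in 'ls -l' output."""
    links = {}
    for line in ls_output.splitlines():
        fields = line.split()
        # Symlink lines have a mode string starting with 'l' and end in: name -> target
        if fields and fields[0].startswith("l") and "->" in fields:
            arrow = fields.index("->")
            links[fields[arrow - 1]] = fields[arrow + 1]
    return links


listing = """\
-rw-r--r-- 1 root root  14187874580 Aug 18 14:47 afdb50
-rw-r--r-- 1 root root  55242286017 Aug 18 14:49 afdb50_seq.1
lrwxrwxrwx 1 root root            6 Aug 18 14:51 afdb50_seq.0 -> afdb50
lrwxrwxrwx 1 root root           15 Aug 18 15:25 afdb50_seq_taxonomy -> afdb50_taxonomy
lrwxrwxrwx 1 root root           13 Aug 18 15:25 afdb50_seq.lookup -> afdb50.lookup
"""
links = symlink_map(listing)
for name, target in sorted(links.items()):
    print(f"{name} -> {target}")
```

On the full listing this shows that several `afdb50_seq*` entries are links back to the corresponding `afdb50*` files, while the `.1` files are separate data.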

When I ran the same job without --cluster-search 1, the results3.html file was generated as expected.

martin-steinegger commented 10 months ago

Thank you Yoshi! Is the search working if you use the default format-mode?

YoshitakaMo commented 10 months ago

When I ran the same job with --cluster-search 1 but without --format-mode 3 (HTML), it finished successfully! Thank you, Martin. Still, it would be nice if the cluster search also worked with --format-mode 3.

easy-search job.pdb /mnt/foldseek/afdb50 result3.m8 tmp --alignment-type 2 --max-seqs 1000 -e 10 -s 9.5 --prefilter-mode 1 --tmscore-threshold 0.3 --cluster-search 1

MMseqs Version:                 96be67cfedf1491b3280c169714eabf207dbf796
Seq. id. threshold              0
Coverage threshold              0
Coverage mode                   0
Max reject                      2147483647
Max accept                      2147483647

....
....

Time for processing: 0h 0m 0s 4ms
convertalis tmp/5091324274358008566/query /mnt/foldseek/afdb50_seq tmp/5091324274358008566/result result3.m8 --sub-mat 'aa:3di.out,nucl:3di.out' --format-mode 0 --format-output query,target,fident,alnlen,mismatch,gapopen,qstart,qend,tstart,tend,evalue,bits --translation-table 1 --gap-open aa:10,nucl:10 --gap-extend aa:1,nucl:1 --db-output 0 --db-load-mode 0 --search-type 0 --threads 32 --compressed 0 -v 3

[=================================================================] 1 0s 0ms
Time for merging to result3.html: 0h 0m 0s 3ms
Time for processing: 0h 0m 13s 707ms
rmdb tmp/5091324274358008566/result -v 3

Time for processing: 0h 0m 0s 11ms
rmdb tmp/5091324274358008566/query -v 3

Time for processing: 0h 0m 0s 2ms
rmdb tmp/5091324274358008566/query_h -v 3

Time for processing: 0h 0m 0s 2ms
rmdb tmp/5091324274358008566/query_ca -v 3

Time for processing: 0h 0m 0s 2ms
rmdb tmp/5091324274358008566/query_ss -v 3

Time for processing: 0h 0m 0s 2ms
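On an affected build, the workaround shown in this log (dropping `--format-mode 3`) could be automated as a retry. The following Python sketch is only an illustration: `without_format_mode` and `run_with_fallback` are hypothetical helpers, it assumes `foldseek` is on the PATH, and it assumes foldseek exits with a non-zero status when convertalis dies, which this thread does not confirm.

```python
import subprocess
from typing import List


def without_format_mode(args: List[str]) -> List[str]:
    """Drop '--format-mode <value>' so the default output format is used."""
    cleaned = []
    skip_next = False
    for arg in args:
        if skip_next:
            skip_next = False  # this is the value that followed --format-mode
            continue
        if arg == "--format-mode":
            skip_next = True
            continue
        cleaned.append(arg)
    return cleaned


def run_with_fallback(args: List[str], runner=subprocess.run):
    """Run 'foldseek <args>'; on a non-zero exit, retry once without --format-mode."""
    first = runner(["foldseek"] + args)
    if first.returncode == 0 or "--format-mode" not in args:
        return first
    return runner(["foldseek"] + without_format_mode(args))


# Arguments abbreviated from the commands in this thread.
args = [
    "easy-search", "job.pdb", "/mnt/foldseek/afdb50", "result3.html", "tmp",
    "--prefilter-mode", "1", "--cluster-search", "1", "--format-mode", "3",
]
print(without_format_mode(args))
```

The retry keeps the original output path, so a file named `result3.html` would then hold tabular output; renaming it (for example to `result3.m8`, as done above) avoids confusion.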

martin-steinegger commented 10 months ago

Thank you so much. This is fixed now.