Open pisle0 opened 4 weeks ago
Could you please post the whole log?
Sure, here is the whole log:
Create directory ./results/foldseek_0/tmp
easy-search ./db/db_0 ../UniProt ./results/foldseek_0/foldseek_0.m8 ./results/foldseek_0/tmp --format-mode 4 --format-output query,target,evalue,gapopen,pident,fident,nident,qstart,qend,qlen,tstart,tend,tlen,alnlen,bits,mismatch,qcov,tcov,qset,qsetid,tset,tsetid,lddt,qtmscore,ttmscore,alntmscore,prob -e 1e-5 --threads 95
MMseqs Version: 928984bfa3c7c3c98ca58b557d965965038f7e0b
Seq. id. threshold 0
Coverage threshold 0
Coverage mode 0
Max reject 2147483647
Max accept 2147483647
Add backtrace false
TMscore threshold 0
TMalign hit order 0
TMalign fast 1
Preload mode 0
Threads 95
Verbosity 3
LDDT threshold 0
Sort by structure bit score 1
Alignment type 2
Exact TMscore 0
Substitution matrix aa:3di.out,nucl:3di.out
Alignment mode 3
Alignment mode 0
E-value threshold 1e-05
Min alignment length 0
Seq. id. mode 0
Alternative alignments 0
Max sequence length 65535
Compositional bias 1
Compositional bias 1
Gap open cost aa:10,nucl:10
Gap extension cost aa:1,nucl:1
Compressed 0
Seed substitution matrix aa:3di.out,nucl:3di.out
Sensitivity 9.5
k-mer length 6
Target search mode 0
k-score seq:2147483647,prof:2147483647
Max results per query 1000
Split database 0
Split mode 2
Split memory limit 0
Diagonal scoring true
Exact k-mer matching 0
Mask residues 0
Mask residues probability 0.99995
Mask lower case residues 1
Minimum diagonal score 30
Selected taxa
Spaced k-mers 1
Spaced k-mer pattern
Local temporary path
Exhaustive search mode false
Prefilter mode 0
Search iterations 1
Remove temporary files true
MPI runner
Force restart with latest tmp false
Cluster search 0
Path to ProstT5
Chain name mode 0
Write mapping file 0
Mask b-factor threshold 0
Coord store mode 2
Write lookup file 1
Input format 0
File Inclusion Regex .*
File Exclusion Regex ^$
Alignment format 4
Format alignment output query,target,evalue,gapopen,pident,fident,nident,qstart,qend,qlen,tstart,tend,tlen,alnlen,bits,mismatch,qcov,tcov,qset,qsetid,tset,tsetid,lddt,qtmscore,ttmscore,alntmscore,prob
Database output false
Greedy best hits false
Alignment backtraces will be computed, since they were requested by output format.
Create directory ./results/foldseek_0/tmp/6098992401104940622/search_tmp
search ./db/db_0 ../UniProt ./results/foldseek_0/tmp/6098992401104940622/result ./results/foldseek_0/tmp/6098992401104940622/search_tmp -a 1 --threads 95 --alignment-mode 3 -e 1e-05 --comp-bias-corr 1 --gap-open aa:10,nucl:10 --gap-extend aa:1,nucl:1 -s 9.5 -k 6 --mask 0 --mask-prob 0.99995 --remove-tmp-files 1
prefilter ./db/db_0_ss ../UniProt_ss ./results/foldseek_0/tmp/6098992401104940622/search_tmp/9988097761208953728/pref --sub-mat 'aa:3di.out,nucl:3di.out' --seed-sub-mat 'aa:3di.out,nucl:3di.out' -s 9.5 -k 6 --target-search-mode 0 --k-score seq:2147483647,prof:2147483647 --alph-size aa:21,nucl:5 --max-seq-len 65535 --max-seqs 1000 --split 0 --split-mode 2 --split-memory-limit 0 -c 0 --cov-mode 0 --comp-bias-corr 1 --comp-bias-corr-scale 0.15 --diag-score 1 --exact-kmer-matching 0 --mask 0 --mask-prob 0.99995 --mask-lower-case 1 --min-ungapped-score 30 --add-self-matches 0 --spaced-kmer-mode 1 --db-load-mode 0 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --threads 95 --compressed 0 -v 3
Query database size: 2340 type: Aminoacid
Target split mode. Searching through 11 splits
Estimated memory consumption: 139G
Target database size: 214683829 type: Aminoacid
Process prefiltering step 1 of 11
Index table k-mer threshold: 78 at k-mer size 6
Index table: counting k-mers
[=================================================================] 19.56M 10s 743ms
Index table: Masked residues: 0
Index table: fill
[=================================================================] 19.56M 17s 198ms
Index statistics
Entries: 5380832714
DB size: 31277 MB
Avg k-mer size: 84.075511
Top 10 k-mers
LVLVVV 7419030
SVSVVV 6859272
VVSVVV 5411307
SVVVVV 5262829
VVVSVS 3880468
VSVVVV 3812658
DDVVVV 3438600
VLVLLV 3170092
VVVVLV 3121737
VSVSVV 2987097
Time for index table init: 0h 1m 3s 536ms
k-mer similarity threshold: 78
Starting prefiltering scores calculation (step 1 of 11)
Query db start 1 to 2340
Target db start 1 to 19555737
[=================================================================] 2.34K 30m 3s 933ms
3322.179038 k-mers per position
2210082916 DB matches per sequence
2304 overflows
127 sequences passed prefiltering per query sequence
128 median result list length
4 sequences with 0 size result lists
Time for merging to pref_tmp_0: 0h 0m 0s 4ms
Time for merging to pref_tmp_0_tmp: 0h 0m 0s 7ms
Process prefiltering step 2 of 11
Index table k-mer threshold: 78 at k-mer size 6
Index table: counting k-mers
[=================================================================] 19.40M 9s 429ms
Index table: Masked residues: 0
Index table: fill
[=================================================================] 19.40M 16s 168ms
Index statistics
Entries: 5347628169
DB size: 31087 MB
Avg k-mer size: 83.556690
Top 10 k-mers
LVLVVV 7362357
SVSVVV 6816351
VVSVVV 5405207
SVVVVV 5255422
LVVVVV 4885042
VVVSVS 3864839
VSVVVV 3826851
DDVVVV 3480728
VLVLLV 3157731
VVVVLV 3134996
Time for index table init: 0h 1m 1s 398ms
k-mer similarity threshold: 78
Starting prefiltering scores calculation (step 2 of 11)
Query db start 1 to 2340
Target db start 19555738 to 38959624
[=================================================================] 2.34K 29m 58s 797ms
3322.179038 k-mers per position
2211639591 DB matches per sequence
2305 overflows
127 sequences passed prefiltering per query sequence
128 median result list length
4 sequences with 0 size result lists
Time for merging to pref_tmp_1: 0h 0m 0s 2ms
Time for merging to pref_tmp_1_tmp: 0h 0m 0s 7ms
Process prefiltering step 3 of 11
Index table k-mer threshold: 78 at k-mer size 6
Index table: counting k-mers
[=================================================================] 19.42M 9s 340ms
Index table: Masked residues: 0
Index table: fill
[=================================================================] 19.42M 16s 703ms
Index statistics
Entries: 5345430013
DB size: 31075 MB
Avg k-mer size: 83.522344
Top 10 k-mers
LVLVVV 7339400
SVSVVV 6796668
VVSVVV 5392562
SVVVVV 5251138
LVVVVV 4872815
VVVSVS 3857049
VSVVVV 3827082
DDVVVV 3487809
VLVLLV 3147702
VVVVLV 3136874
Time for index table init: 0h 1m 1s 621ms
k-mer similarity threshold: 78
Starting prefiltering scores calculation (step 3 of 11)
Query db start 1 to 2340
Target db start 38959625 to 58376168
[=================================================================] 2.34K 30m 9s 735ms
3322.179038 k-mers per position
2219785938 DB matches per sequence
2305 overflows
127 sequences passed prefiltering per query sequence
128 median result list length
4 sequences with 0 size result lists
Time for merging to pref_tmp_2: 0h 0m 0s 2ms
Time for merging to pref_tmp_2_tmp: 0h 0m 0s 7ms
Process prefiltering step 4 of 11
Index table k-mer threshold: 78 at k-mer size 6
Index table: counting k-mers
[=================================================================] 19.50M 9s 488ms
Index table: Masked residues: 0
Index table: fill
[=================================================================] 19.50M 15s 734ms
Index statistics
Entries: 5366001521
DB size: 31192 MB
Avg k-mer size: 83.843774
Top 10 k-mers
LVLVVV 7387244
SVSVVV 6833776
VVSVVV 5404097
SVVVVV 5247653
VVVSVS 3877507
VSVVVV 3809981
DDVVVV 3453950
VLVLLV 3162578
VVVVLV 3123304
VSVSVV 2977791
Time for index table init: 0h 1m 1s 74ms
k-mer similarity threshold: 78
Starting prefiltering scores calculation (step 4 of 11)
Query db start 1 to 2340
Target db start 58376169 to 77880679
[=================================================================] 2.34K 30m 3s 901ms
3322.179038 k-mers per position
2213016830 DB matches per sequence
2304 overflows
127 sequences passed prefiltering per query sequence
128 median result list length
4 sequences with 0 size result lists
Time for merging to pref_tmp_3: 0h 0m 0s 4ms
Time for merging to pref_tmp_3_tmp: 0h 0m 0s 7ms
Process prefiltering step 5 of 11
Index table k-mer threshold: 78 at k-mer size 6
Index table: counting k-mers
[=================================================================] 19.60M 9s 389ms
Index table: Masked residues: 0
Index table: fill
[=================================================================] 19.60M 16s 352ms
Index statistics
Entries: 5367642730
DB size: 31202 MB
Avg k-mer size: 83.869418
Top 10 k-mers
LVLVVV 7408638
SVSVVV 6842541
VVSVVV 5410304
SVVVVV 5267791
VVVSVS 3879765
VSVVVV 3831500
NVSVVV 3791072
DDVVVV 3462511
VLVLLV 3174025
VVVVLV 3144011
Time for index table init: 0h 1m 1s 441ms
k-mer similarity threshold: 78
Starting prefiltering scores calculation (step 5 of 11)
Query db start 1 to 2340
Target db start 77880680 to 97476863
[=================================================================] 2.34K 30m 10s 983ms
3322.179038 k-mers per position
2217079267 DB matches per sequence
2304 overflows
127 sequences passed prefiltering per query sequence
128 median result list length
4 sequences with 0 size result lists
Time for merging to pref_tmp_4: 0h 0m 0s 3ms
Time for merging to pref_tmp_4_tmp: 0h 0m 0s 7ms
Process prefiltering step 6 of 11
Index table k-mer threshold: 78 at k-mer size 6
Index table: counting k-mers
[=================================================================] 19.51M 9s 451ms
Index table: Masked residues: 0
Index table: fill
[=================================================================] 19.51M 15s 659ms
Index statistics
Entries: 5355201023
DB size: 31130 MB
Avg k-mer size: 83.675016
Top 10 k-mers
LVLVVV 7372030
SVSVVV 6811966
VVSVVV 5399688
SVVVVV 5256717
VVVSVS 3866126
VSVVVV 3826900
VVVVVS 3499301
DDVVVV 3498709
VLVLLV 3164470
VVVVLV 3139129
Time for index table init: 0h 1m 0s 656ms
k-mer similarity threshold: 78
Starting prefiltering scores calculation (step 6 of 11)
Query db start 1 to 2340
Target db start 97476864 to 116986050
[=================================================================] 2.34K 30m 15s 483ms
3322.179038 k-mers per position
2214779072 DB matches per sequence
2305 overflows
127 sequences passed prefiltering per query sequence
128 median result list length
4 sequences with 0 size result lists
Time for merging to pref_tmp_5: 0h 0m 0s 3ms
Time for merging to pref_tmp_5_tmp: 0h 0m 0s 7ms
Process prefiltering step 7 of 11
Index table k-mer threshold: 78 at k-mer size 6
Index table: counting k-mers
[=================================================================] 19.59M 9s 397ms
Index table: Masked residues: 0
Index table: fill
[=================================================================] 19.59M 16s 434ms
Index statistics
Entries: 5366327648
DB size: 31194 MB
Avg k-mer size: 83.848870
Top 10 k-mers
LVLVVV 7403264
SVSVVV 6846618
VVSVVV 5419233
SVVVVV 5269344
VVVSVS 3886329
VSVVVV 3829408
NVSVVV 3788360
DDVVVV 3469383
VLVLLV 3166475
VVVVLV 3147738
Time for index table init: 0h 1m 1s 675ms
k-mer similarity threshold: 78
Starting prefiltering scores calculation (step 7 of 11)
Query db start 1 to 2340
Target db start 116986051 to 136573459
[=================================================================] 2.34K 30m 3s 196ms
3322.179038 k-mers per position
2214895418 DB matches per sequence
2304 overflows
127 sequences passed prefiltering per query sequence
128 median result list length
4 sequences with 0 size result lists
Time for merging to pref_tmp_6: 0h 0m 0s 1ms
Time for merging to pref_tmp_6_tmp: 0h 0m 0s 6ms
Process prefiltering step 8 of 11
Index table k-mer threshold: 78 at k-mer size 6
Index table: counting k-mers
[=================================================================] 19.41M 9s 406ms
Index table: Masked residues: 0
Index table: fill
[=================================================================] 19.41M 15s 677ms
Index statistics
Entries: 5345802812
DB size: 31077 MB
Avg k-mer size: 83.528169
Top 10 k-mers
LVLVVV 7333385
SVSVVV 6795473
VVSVVV 5378647
SVVVVV 5240783
LVVVVV 4866087
VVVVPP 3861707
VVVSVS 3841153
VSVVVV 3821507
DDVVVV 3491345
VVVVLV 3128082
Time for index table init: 0h 1m 0s 588ms
k-mer similarity threshold: 78
Starting prefiltering scores calculation (step 8 of 11)
Query db start 1 to 2340
Target db start 136573460 to 155981499
[=================================================================] 2.34K 30m 5s 448ms
3322.179038 k-mers per position
2220573568 DB matches per sequence
2305 overflows
127 sequences passed prefiltering per query sequence
128 median result list length
4 sequences with 0 size result lists
Time for merging to pref_tmp_7: 0h 0m 0s 2ms
Time for merging to pref_tmp_7_tmp: 0h 0m 0s 7ms
Process prefiltering step 9 of 11
Index table k-mer threshold: 78 at k-mer size 6
Index table: counting k-mers
[=================================================================] 19.48M 9s 670ms
Index table: Masked residues: 0
Index table: fill
[=================================================================] 19.48M 15s 581ms
Index statistics
Entries: 5367214664
DB size: 31199 MB
Avg k-mer size: 83.862729
Top 10 k-mers
LVLVVV 7389205
SVSVVV 6824085
VVSVVV 5383258
SVVVVV 5250820
LVVVVV 4881895
VVVSVS 3865754
VSVVVV 3814827
DDVVVV 3461080
VLVLLV 3163401
VVVVLV 3132693
Time for index table init: 0h 1m 0s 568ms
k-mer similarity threshold: 78
Starting prefiltering scores calculation (step 9 of 11)
Query db start 1 to 2340
Target db start 155981500 to 175456597
[=================================================================] 2.34K 30m 17s 545ms
3322.179038 k-mers per position
2220547956 DB matches per sequence
2305 overflows
127 sequences passed prefiltering per query sequence
128 median result list length
4 sequences with 0 size result lists
Time for merging to pref_tmp_8: 0h 0m 0s 3ms
Time for merging to pref_tmp_8_tmp: 0h 0m 0s 7ms
Process prefiltering step 10 of 11
Index table k-mer threshold: 78 at k-mer size 6
Index table: counting k-mers
[=================================================================] 19.56M 9s 379ms
Index table: Masked residues: 0
Index table: fill
[=================================================================] 19.56M 16s 252ms
Index statistics
Entries: 5387033877
DB size: 31313 MB
Avg k-mer size: 84.172404
Top 10 k-mers
LVLVVV 7420945
SVSVVV 6850362
VVSVVV 5409998
SVVVVV 5242817
VVVSVS 3880147
VSVVVV 3802596
DDVVVV 3422754
VLVLLV 3161209
VVVVLV 3120638
VSVSVV 2978610
Time for index table init: 0h 1m 1s 181ms
k-mer similarity threshold: 78
Starting prefiltering scores calculation (step 10 of 11)
Query db start 1 to 2340
Target db start 175456598 to 195012177
[=================================================================] 2.34K 30m 9s 562ms
3322.179038 k-mers per position
2212344872 DB matches per sequence
2304 overflows
127 sequences passed prefiltering per query sequence
128 median result list length
4 sequences with 0 size result lists
Time for merging to pref_tmp_9: 0h 0m 0s 4ms
Time for merging to pref_tmp_9_tmp: 0h 0m 0s 7ms
Process prefiltering step 11 of 11
Index table k-mer threshold: 78 at k-mer size 6
Index table: counting k-mers
[=================================================================] 19.67M 9s 495ms
Index table: Masked residues: 0
Index table: fill
[=================================================================] 19.67M 16s 393ms
Index statistics
Entries: 5368817694
DB size: 31208 MB
Avg k-mer size: 83.887776
Top 10 k-mers
LVLVVV 7413874
SVSVVV 6844630
VVSVVV 5418602
SVVVVV 5273216
VVVSVS 3879932
VSVVVV 3831910
VVVVVS 3505573
DDVVVV 3486660
VLVLLV 3167080
VVVVLV 3152161
Time for index table init: 0h 1m 1s 753ms
k-mer similarity threshold: 78
Starting prefiltering scores calculation (step 11 of 11)
Query db start 1 to 2340
Target db start 195012178 to 214683829
[=================================================================] 2.34K 30m 7s 536ms
3322.179038 k-mers per position
2214850539 DB matches per sequence
2304 overflows
127 sequences passed prefiltering per query sequence
128 median result list length
4 sequences with 0 size result lists
Time for merging to pref_tmp_10: 0h 0m 0s 3ms
Time for merging to pref_tmp_10_tmp: 0h 0m 0s 6ms
Merging 11 target splits to pref
Preparing offsets for merging: 0h 0m 0s 1ms
[=================================================================] 2.34K 0s 69ms
Time for merging to pref: 0h 0m 0s 2ms
Time for merging target splits: 0h 0m 0s 125ms
Time for merging to pref_tmp: 0h 0m 0s 33ms
Time for processing: 6h 11m 7s 951ms
structurealign ./db/db_0 ../UniProt ./results/foldseek_0/tmp/6098992401104940622/search_tmp/9988097761208953728/pref ./results/foldseek_0/tmp/6098992401104940622/search_tmp/9988097761208953728/strualn --tmscore-threshold 0 --lddt-threshold 0 --sort-by-structure-bits 1 --alignment-type 2 --exact-tmscore 0 --sub-mat 'aa:3di.out,nucl:3di.out' -a 1 --alignment-mode 3 --alignment-output-mode 0 --wrapped-scoring 0 -e 1e-05 --min-seq-id 0 --min-aln-len 0 --seq-id-mode 0 --alt-ali 0 -c 0 --cov-mode 0 --max-seq-len 65535 --comp-bias-corr 1 --comp-bias-corr-scale 0.5 --max-rejected 2147483647 --max-accept 2147483647 --add-self-matches 0 --db-load-mode 0 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --score-bias 0 --realign 0 --realign-score-bias -0.2 --realign-max-seqs 2147483647 --corr-score-weight 0 --gap-open aa:10,nucl:10 --gap-extend aa:1,nucl:1 --zdrop 40 --threads 95 --compressed 0 -v 3
Can not touch 415722228266 into main memory
[=================================================================] 2.34K 2m 12s 776ms
Time for merging to strualn: 0h 0m 0s 9ms
Time for processing: 0h 7m 4s 177ms
mvdb ./results/foldseek_0/tmp/6098992401104940622/search_tmp/9988097761208953728/strualn ./results/foldseek_0/tmp/6098992401104940622/search_tmp/9988097761208953728/aln
Time for processing: 0h 0m 0s 1ms
mvdb ./results/foldseek_0/tmp/6098992401104940622/search_tmp/9988097761208953728/aln ./results/foldseek_0/tmp/6098992401104940622/result -v 3
Time for processing: 0h 0m 0s 3ms
Removing temporary files
rmdb ./results/foldseek_0/tmp/6098992401104940622/search_tmp/9988097761208953728/pref -v 3
Time for processing: 0h 0m 0s 4ms
Error: Convert Alignments died
Does it crash without --format-mode 4?
Hi, I tried the following:
1. Without --format-mode 4 but with --format-output kept: same error, Error: Convert Alignments died.
2. Without --format-mode 4 and without --format-output: convertalis runs, but the message Can not touch 415722228266 into main memory still persists.
3. With --format-mode 4 but without --format-output: same as 2.
It looks like one of the extra columns requested in the output must be causing the crash. Do you have a rough idea of which one(s) may cause this issue?
Hi, I want to follow up to see if you have any insight into this. Could this be an incompatibility with a database built using a previous version of Foldseek? Thank you in advance.
I currently have very limited time to look into this. My best guess right now is that one of these fields is causing the crash. I would guess that it might be one of the set fields, since these are less well tested:
qset,qsetid,tset,tsetid,lddt,qtmscore,ttmscore,alntmscore,prob
The databases should remain compatible between versions, so I don't think this is the issue.
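One way to narrow down the suspect field is to re-run the output step with the known-safe columns plus one suspect column at a time. The sketch below only builds the candidate --format-output values; it does not call Foldseek. The column names are copied from the log and the comment above, while the helper name one_at_a_time is made up for illustration. It also assumes the baseline columns on their own do not crash, which should be checked first.

```python
# Full column list, copied from the --format-output value in the log above.
ALL_COLUMNS = (
    "query,target,evalue,gapopen,pident,fident,nident,qstart,qend,qlen,"
    "tstart,tend,tlen,alnlen,bits,mismatch,qcov,tcov,"
    "qset,qsetid,tset,tsetid,lddt,qtmscore,ttmscore,alntmscore,prob"
).split(",")

# Fields named as possible culprits in the comment above.
SUSPECTS = "qset,qsetid,tset,tsetid,lddt,qtmscore,ttmscore,alntmscore,prob".split(",")


def one_at_a_time(all_columns, suspects):
    """Hypothetical helper: map each suspect column to a --format-output
    value made of the non-suspect baseline columns plus that one suspect."""
    baseline = [c for c in all_columns if c not in suspects]
    return {s: ",".join(baseline + [s]) for s in suspects}


candidates = one_at_a_time(ALL_COLUMNS, SUSPECTS)
for suspect, fmt in candidates.items():
    # Each printed value can be pasted after --format-output in a test run.
    print(f"{suspect}: --format-output {fmt}")
```

Running each candidate against a small query subset rather than the full 2340-query job would keep the turnaround short, since the prefilter alone took over six hours in the log.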
I am trying to run a foldseek easy-search job with the UniProt database built from foldseek databases Alphafold/UniProt. In two attempts on different machines (with up to 256 GB of memory), the job completes prefilter and structurealign, but immediately after structurealign it throws an error while trying to allocate 415722228266 bytes of memory (Can not touch 415722228266 into main memory) and quits with Error: Convert Alignments died.
I want to ask whether this is purely a memory availability issue, and whether there is a way to apply something like --split-memory-limit at this stage, as in the prefilter step. Thank you
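For scale, here is a quick conversion of the byte count in the message to GiB, compared against the largest machine mentioned. It assumes "256GB" means 256 GiB of RAM. It only shows that the requested size exceeds physical memory; it does not establish that this is what kills convertalis, since the thread reports the same message in runs where convertalis completes.

```python
# Size taken from the log line "Can not touch 415722228266 into main memory".
requested_bytes = 415_722_228_266

# Largest machine in the report; assumption: "256GB" means 256 GiB.
available_bytes = 256 * 1024**3


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 1024**3


requested_gib = to_gib(requested_bytes)
exceeds_ram = requested_bytes > available_bytes

print(f"requested: {requested_gib:.1f} GiB")
print(f"available: {to_gib(available_bytes):.0f} GiB")
print(f"exceeds RAM: {exceeds_ram}")
```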