Open samirelanduk opened 1 year ago
I also get the "Invalid database read for database data file" error from expandaln when called by colabfold_search.
(I originally posted this on Issue 64 before I realized that that Issue was closed.)
Invalid database read for database data file=/home/username/project/my_local_DB/target_DB.idx, database index=/home/username/project/my_local_DB/target_DB.idx.index
getData: local id (4294967295) >= db size (22)
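A side note on the numbers in this message: 4294967295 equals 2^32 - 1, the largest unsigned 32-bit value. That it acts as a "key not found" sentinel here, rather than a real entry id, is an assumption and has not been checked against the MMseqs source. A minimal Python sketch that parses the getData line and flags this case (the name `parse_getdata_error` is hypothetical):

```python
import re

UINT32_MAX = 2**32 - 1  # 4294967295

# Matches lines like: getData: local id (4294967295) >= db size (22)
GETDATA_RE = re.compile(r"getData: local id \((\d+)\) >= db size \((\d+)\)")


def parse_getdata_error(line):
    """Return (local_id, db_size, looks_like_sentinel), or None if no match."""
    match = GETDATA_RE.search(line)
    if match is None:
        return None
    local_id, db_size = int(match.group(1)), int(match.group(2))
    # Assumption: an id equal to UINT32_MAX means a lookup failed upstream.
    return local_id, db_size, local_id == UINT32_MAX


if __name__ == "__main__":
    print(parse_getdata_error("getData: local id (4294967295) >= db size (22)"))
```

If the flag is set, the out-of-range id is more likely the result of a failed key lookup than of a truncated data file, though this would need confirming with the maintainers.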
I created target_DB from target.fasta, which has 142 records in it:
pwd
# /home/username/project/my_local_DB
mmseqs createdb target.fasta target_DB
mmseqs createindex target_DB tmp_createindex --threads 96
indexdb target_DB target_DB --seed-sub-mat 'aa:VTML80.out,nucl:nucleotide.out' -k 0 --alph-size aa:21,nucl:5 --comp-bias-corr 1 --comp-bias-corr-scale 1 --max-seq-len 65535 --max-seqs 300 --mask 1 --mask-prob 0.9 --mask-lower-case 0 --spaced-kmer-mode 1 -s 7.5 --k-score seq:0,prof:0 --check-compatible 0 --search-type 0 --split 0 --split-memory-limit 0 -v 3 --threads 96
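One way to double-check that createdb picked up every record is to compare the number of FASTA headers with the number of lines in target_DB.index. This sketch assumes the usual MMseqs2 layout in which the .index file holds one line per entry; the helper names are made up for illustration:

```python
from pathlib import Path


def count_fasta_records(fasta_path):
    """Count header lines (starting with '>') in a FASTA file."""
    with open(fasta_path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def count_index_entries(index_path):
    """Count non-empty lines in an MMseqs2 .index file (assumed one per entry)."""
    with open(index_path) as handle:
        return sum(1 for line in handle if line.strip())


if __name__ == "__main__":
    # File names taken from the commands above; adjust to your own paths.
    fasta, index = Path("target.fasta"), Path("target_DB.index")
    if fasta.exists() and index.exists():
        print(count_fasta_records(fasta), count_index_entries(index))
```

For the database described here, both counts would be expected to be 142 if createdb worked as intended.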
Then I ran colabfold_search. Output is below.
CUDA_VISIBLE_DEVICES='0' colabfold_search \
  -s '8' \
  --db1 'target_DB' \
  --use-templates '0' \
  --db2 '' \
  --use-env '0' \
  --db3 '' \
  --filter '1' \
  --mmseqs 'mmseqs' \
  --expand-eval '1.7e+308' \
  --align-eval '10' \
  --diff '3000' \
  --qsc '-20.0' \
  --max-accept '1000000' \
  --db-load-mode '2' \
  --threads '96' \
  query.fasta \
  /home/username/project/my_local_DB \
  result_query_20230412_142303
createdb result_query_20230412_142303/query.fas result_query_20230412_142303/qdb --shuffle 0
search result_query_20230412_142303/qdb /home/username/project/my_local_DB/target_DB result_query_20230412_142303/res result_query_20230412_142303/tmp --threads 96 --num-iterations 3 --db-load-mode 2 -a -s 8 -e 0.1 --max-seqs 10000
prefilter result_query_20230412_142303/qdb /home/username/project/my_local_DB/target_DB.idx result_query_20230412_142303/tmp/18292001434761310910/pref_0 --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' --seed-sub-mat 'aa:VTML80.out,nucl:nucleotide.out' -s 8 -k 0 --k-score seq:2147483647,prof:2147483647 --alph-size aa:21,nucl:5 --max-seq-len 65535 --max-seqs 10000 --split 0 --split-mode 2 --split-memory-limit 0 -c 0 --cov-mode 0 --comp-bias-corr 1 --comp-bias-corr-scale 1 --diag-score 1 --exact-kmer-matching 0 --mask 1 --mask-prob 0.9 --mask-lower-case 0 --min-ungapped-score 15 --add-self-matches 0 --spaced-kmer-mode 1 --db-load-mode 2 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --threads 96 --compressed 0 -v 3
align result_query_20230412_142303/qdb /home/username/project/my_local_DB/target_DB.idx result_query_20230412_142303/tmp/18292001434761310910/pref_0 result_query_20230412_142303/tmp/18292001434761310910/aln_0 --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' -a 1 --alignment-mode 2 --alignment-output-mode 0 --wrapped-scoring 0 -e 0.1 --min-seq-id 0 --min-aln-len 0 --seq-id-mode 0 --alt-ali 0 -c 0 --cov-mode 0 --max-seq-len 65535 --comp-bias-corr 1 --comp-bias-corr-scale 1 --max-rejected 2147483647 --max-accept 2147483647 --add-self-matches 0 --db-load-mode 2 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --score-bias 0 --realign 1 --realign-score-bias -0.2 --realign-max-seqs 2147483647 --corr-score-weight 0 --gap-open aa:11,nucl:5 --gap-extend aa:1,nucl:2 --zdrop 40 --threads 96 --compressed 0 -v 3
result2profile result_query_20230412_142303/qdb /home/username/project/my_local_DB/target_DB.idx result_query_20230412_142303/tmp/18292001434761310910/aln_0 result_query_20230412_142303/tmp/18292001434761310910/profile_0 --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' -e 0.1 --mask-profile 1 --e-profile 0.1 --comp-bias-corr 1 --comp-bias-corr-scale 1 --wg 0 --allow-deletion 0 --filter-msa 1 --filter-min-enable 0 --max-seq-id 0.9 --qid '0.0' --qsc -20 --cov 0 --diff 1000 --pseudo-cnt-mode 0 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --db-load-mode 2 --gap-open aa:11,nucl:5 --gap-extend aa:1,nucl:2 --gap-pc 10 --threads 96 --compressed 0 -v 3
subtractdbs result_query_20230412_142303/tmp/18292001434761310910/pref_tmp_1 result_query_20230412_142303/tmp/18292001434761310910/aln_0 result_query_20230412_142303/tmp/18292001434761310910/pref_1 --threads 96 --e-profile 0.1 -e 0.1 --compressed 0 -v 3
subtractdbs result_query_20230412_142303/tmp/18292001434761310910/pref_tmp_1 result_query_20230412_142303/tmp/18292001434761310910/aln_0 result_query_20230412_142303/tmp/18292001434761310910/pref_1 --threads 96 --e-profile 0.1 -e 0.1 --compressed 0 -v 3
align result_query_20230412_142303/tmp/18292001434761310910/profile_0 /home/username/project/my_local_DB/target_DB.idx result_query_20230412_142303/tmp/18292001434761310910/pref_1 result_query_20230412_142303/tmp/18292001434761310910/aln_tmp_1 --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' -a 1 --alignment-mode 2 --alignment-output-mode 0 --wrapped-scoring 0 -e 0.1 --min-seq-id 0 --min-aln-len 0 --seq-id-mode 0 --alt-ali 0 -c 0 --cov-mode 0 --max-seq-len 65535 --comp-bias-corr 1 --comp-bias-corr-scale 1 --max-rejected 2147483647 --max-accept 2147483647 --add-self-matches 0 --db-load-mode 2 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --score-bias 0 --realign 0 --realign-score-bias -0.2 --realign-max-seqs 2147483647 --corr-score-weight 0 --gap-open aa:11,nucl:5 --gap-extend aa:1,nucl:2 --zdrop 40 --threads 96 --compressed 0 -v 3
mergedbs result_query_20230412_142303/tmp/18292001434761310910/profile_0 result_query_20230412_142303/tmp/18292001434761310910/aln_1 result_query_20230412_142303/tmp/18292001434761310910/aln_0 result_query_20230412_142303/tmp/18292001434761310910/aln_tmp_1
rmdb result_query_20230412_142303/tmp/18292001434761310910/aln_0
rmdb result_query_20230412_142303/tmp/18292001434761310910/aln_tmp_1
result2profile result_query_20230412_142303/tmp/18292001434761310910/profile_0 /home/username/project/my_local_DB/target_DB.idx result_query_20230412_142303/tmp/18292001434761310910/aln_1 result_query_20230412_142303/tmp/18292001434761310910/profile_1 --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' -e 0.1 --mask-profile 1 --e-profile 0.1 --comp-bias-corr 1 --comp-bias-corr-scale 1 --wg 0 --allow-deletion 0 --filter-msa 1 --filter-min-enable 0 --max-seq-id 0.9 --qid '0.0' --qsc -20 --cov 0 --diff 1000 --pseudo-cnt-mode 0 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --db-load-mode 2 --gap-open aa:11,nucl:5 --gap-extend aa:1,nucl:2 --gap-pc 10 --threads 96 --compressed 0 -v 3
prefilter result_query_20230412_142303/tmp/18292001434761310910/profile_1 /home/username/project/my_local_DB/target_DB.idx result_query_20230412_142303/tmp/18292001434761310910/pref_tmp_2 --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' --seed-sub-mat 'aa:VTML80.out,nucl:nucleotide.out' -s 8 -k 0 --k-score seq:2147483647,prof:2147483647 --alph-size aa:21,nucl:5 --max-seq-len 65535 --max-seqs 10000 --split 0 --split-mode 2 --split-memory-limit 0 -c 0 --cov-mode 0 --comp-bias-corr 1 --comp-bias-corr-scale 1 --diag-score 1 --exact-kmer-matching 0 --mask 1 --mask-prob 0.9 --mask-lower-case 0 --min-ungapped-score 15 --add-self-matches 0 --spaced-kmer-mode 1 --db-load-mode 2 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --threads 96 --compressed 0 -v 3
subtractdbs result_query_20230412_142303/tmp/18292001434761310910/pref_tmp_2 result_query_20230412_142303/tmp/18292001434761310910/aln_1 result_query_20230412_142303/tmp/18292001434761310910/pref_2 --threads 96 --e-profile 0.1 -e 0.1 --compressed 0 -v 3
subtractdbs result_query_20230412_142303/tmp/18292001434761310910/pref_tmp_2 result_query_20230412_142303/tmp/18292001434761310910/aln_1 result_query_20230412_142303/tmp/18292001434761310910/pref_2 --threads 96 --e-profile 0.1 -e 0.1 --compressed 0 -v 3
align result_query_20230412_142303/tmp/18292001434761310910/profile_1 /home/username/project/my_local_DB/target_DB.idx result_query_20230412_142303/tmp/18292001434761310910/pref_2 result_query_20230412_142303/tmp/18292001434761310910/aln_tmp_2 --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' -a 1 --alignment-mode 2 --alignment-output-mode 0 --wrapped-scoring 0 -e 0.1 --min-seq-id 0 --min-aln-len 0 --seq-id-mode 0 --alt-ali 0 -c 0 --cov-mode 0 --max-seq-len 65535 --comp-bias-corr 1 --comp-bias-corr-scale 1 --max-rejected 2147483647 --max-accept 2147483647 --add-self-matches 0 --db-load-mode 2 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --score-bias 0 --realign 0 --realign-score-bias -0.2 --realign-max-seqs 2147483647 --corr-score-weight 0 --gap-open aa:11,nucl:5 --gap-extend aa:1,nucl:2 --zdrop 40 --threads 96 --compressed 0 -v 3
mergedbs result_query_20230412_142303/tmp/18292001434761310910/profile_1 result_query_20230412_142303/res result_query_20230412_142303/tmp/18292001434761310910/aln_1 result_query_20230412_142303/tmp/18292001434761310910/aln_tmp_2
rmdb result_query_20230412_142303/tmp/18292001434761310910/aln_1
rmdb result_query_20230412_142303/tmp/18292001434761310910/aln_tmp_2
expandaln result_query_20230412_142303/qdb /home/username/project/my_local_DB/target_DB.idx result_query_20230412_142303/res /home/username/project/my_local_DB/target_DB.idx result_query_20230412_142303/res_exp --db-load-mode 2 --threads 96 --expansion-mode 0 -e 1.7976931348623157e+308 --expand-filter-clusters 1 --max-seq-id 0.95
MMseqs Version: 67949d702dbfc6e5d54fdd0f14a9ab6740f11c32
Expansion mode 0
Substitution matrix aa:blosum62.out,nucl:nucleotide.out
Gap open cost aa:11,nucl:5
Gap extension cost aa:1,nucl:2
Max sequence length 65535
Score bias 0
Compositional bias 1
Compositional bias 1
E-value threshold 1.79769e+308
Seq. id. threshold 0
Coverage threshold 0
Coverage mode 0
Pseudo count mode 0
Pseudo count a substitution:1.100,context:1.400
Pseudo count b substitution:4.100,context:5.800
Expand filter clusters 1
Use filter only at N seqs 0
Maximum seq. id. threshold 0.95
Minimum seq. id. 0.0
Minimum score per column -20
Minimum coverage 0
Select N most diverse seqs 1000
Preload mode 2
Compressed 0
Threads 96
Verbosity 3
Index version: 16
Generated by: 67949d702dbfc6e5d54fdd0f14a9ab6740f11c32
ScoreMatrix: VTML80.out
Index version: 16
Generated by: 67949d702dbfc6e5d54fdd0f14a9ab6740f11c32
ScoreMatrix: VTML80.out
Invalid database read for database data file=/home/username/project/my_local_DB/target_DB.idx, database index=/home/username/project/my_local_DB/target_DB.idx.index
getData: local id (4294967295) >= db size (22)
Traceback (most recent call last):
File "/home/username/project/colabfold_batch/colabfold-conda/bin/colabfold_search", line 8, in <module>
sys.exit(main())
File "/home/username/project/colabfold_batch/colabfold-conda/lib/python3.7/site-packages/colabfold/mmseqs/search.py", line 444, in main
threads=args.threads,
File "/home/username/project/colabfold_batch/colabfold-conda/lib/python3.7/site-packages/colabfold/mmseqs/search.py", line 86, in mmseqs_search_monomer
run_mmseqs(mmseqs, ["expandaln", base.joinpath("qdb"), dbbase.joinpath(f"{uniref_db}{dbSuffix1}"), base.joinpath("res"), dbbase.joinpath(f"{uniref_db}{dbSuffix2}"), base.joinpath("res_exp"), "--db-load-mode", str(db_load_mode), "--threads", str(threads)] + expand_param)
File "/home/username/project/colabfold_batch/colabfold-conda/lib/python3.7/site-packages/colabfold/mmseqs/search.py", line 23, in run_mmseqs
subprocess.check_call([mmseqs] + params)
File "/home/username/project/colabfold_batch/colabfold-conda/lib/python3.7/subprocess.py", line 363, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '[PosixPath('mmseqs'), 'expandaln', PosixPath('result_query_20230412_142303/qdb'), PosixPath('/home/username/project/my_local_DB/target_DB.idx'), PosixPath('result_query_20230412_142303/res'), PosixPath('/home/username/project/my_local_DB/target_DB.idx'), PosixPath('result_query_20230412_142303/res_exp'), '--db-load-mode', '2', '--threads', '96', '--expansion-mode', '0', '-e', '1.7976931348623157e+308', '--expand-filter-clusters', '1', '--max-seq-id', '0.95']' returned non-zero exit status 1.
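The traceback shows colabfold's run_mmseqs calling subprocess.check_call, so the Python exception only reports the exit status, while the actual MMseqs message is printed separately above it. When scripting around this, a wrapper along the following lines keeps the two together. This is a sketch, not colabfold's code, and `run_and_capture` is a hypothetical name:

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd; on failure raise RuntimeError that includes the tool's output."""
    result = subprocess.run(
        [str(part) for part in cmd],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so error lines are not lost
        universal_newlines=True,
    )
    if result.returncode != 0:
        # Keep only the last few lines, where the error message usually sits.
        tail = "\n".join(result.stdout.splitlines()[-5:])
        raise RuntimeError(f"{cmd[0]} exited with {result.returncode}:\n{tail}")
    return result.stdout


if __name__ == "__main__":
    # Stand-in command so the example runs without MMseqs installed.
    print(run_and_capture([sys.executable, "-c", "print('ok')"]))
```

With a wrapper like this, the "Invalid database read" lines would appear inside the exception text instead of only in the surrounding log.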
target_DB is a brand new database; I have not added or deleted records since its creation.
I am working on a Lambda server running Ubuntu:
Linux xyz-lambda02 5.4.0-144-generic #161-Ubuntu SMP Fri Feb 3 14:49:04 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Please let me know if I can help with debugging.
Thank you. And thanks for mmseqs.
I got the same error, but in a different place: I ran a local ColabFold API server, and the error message is:
Invalid database read for database data file=/data/colabFold/MsaServer/databases/uniref30_2202_db.idx, database index=/data/colabFold/MsaServer/databases/uniref30_2202_db.idx.index
getData: local id (4294967295) >= db size (22)
Thanks
The expandaln command fails to properly read the index, producing an 'Invalid database read for database data' error.
Expected Behavior
The command should run without error messages.
Current Behavior
The command fails instantly with the following error message:
Steps to Reproduce (for bugs)
MMseqs Output (for bugs)
createdb:
createindex:
expandaln:
Context
I am attempting to recreate the functionality in https://github.com/soedinglab/MMseqs2-App/blob/master/backend/worker.go
Your Environment