Closed simone-pignotti closed 5 years ago
Thanks for trying the MMseqs2 webserver!
I've triggered rebuilding the docker images based on the latest MMseqs2 image. The issue should be fixed there. I can do more detailed testing in a couple days though.
Thank you for the quick fix, I'll test it on my instance and will let you know if the problem persists.
I tried the latest mmseqs-app-* docker images (running docker-compose pull and docker-compose up --build), and I got a new error (see full log for details).
mmseqs-web-worker_1 | Unrecognized parameter --index-type
mmseqs-web-worker_1 | Did you mean "--search-type"?
mmseqs-web-worker_1 | 2019/03/05 15:21:17 Execution Error: exit status 1
It is triggered by the createindex command, but I can't find the specific command in the log.
dockerup.log
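For anyone else hunting for the failing command in a long docker-compose log, here is a minimal Python sketch. It assumes the log format seen in the worker output further down this thread: every line carries a "mmseqs-web-worker_1 | " style prefix, and each command is printed on the line after a "Program call:" marker. The function names are made up for illustration and are not part of MMseqs2 or the web server.

```python
import re

# Matches the docker-compose service prefix, e.g. "mmseqs-web-worker_1 | ".
PREFIX = re.compile(r"^\S+\s*\|\s?")

# Substrings that mark a failure in the logs quoted in this thread.
ERROR_MARKERS = ("Execution Error", "Unrecognized parameter", "Error:")


def strip_prefix(log_text):
    """Remove the service prefix from every line of a docker-compose log."""
    return [PREFIX.sub("", line).rstrip() for line in log_text.splitlines()]


def program_calls(log_text):
    """Return the command line printed after each 'Program call:' marker."""
    lines = strip_prefix(log_text)
    calls = []
    for i, line in enumerate(lines):
        if line == "Program call:" and i + 1 < len(lines):
            calls.append(lines[i + 1])
    return calls


def error_lines(log_text):
    """Return every line that contains one of the known error markers."""
    return [line for line in strip_prefix(log_text)
            if any(marker in line for marker in ERROR_MARKERS)]
```

The last entry returned by program_calls is usually the command that failed, since the worker stops at the first error.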
I will look at the problem in detail in the evening. I know what's wrong, but it's a somewhat larger fix.
Update: The docker images should work again. I split the search field in the .params file into two separate fields, index and search: one for the indexing step and one for the actual search.
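To illustrate how the two fields end up on the command line, here is a rough Python sketch. It is not the web server's actual code. The fixed createindex arguments are copied from the worker logs in this thread; the function names and the dictionary shape are assumptions made for illustration.

```python
import shlex


def build_createindex(db_path, params, tmp_dir="/tmp"):
    """Assemble a createindex call and append the optional 'index' field.

    The fixed arguments mirror the call printed in the worker log; 'params'
    is assumed to be the parsed .params content as a plain dict.
    """
    cmd = ["mmseqs", "createindex", db_path, tmp_dir,
           "--remove-tmp-files", "true", "--check-compatible", "true"]
    cmd += shlex.split(params.get("index", ""))
    return cmd


def extra_search_args(params):
    """Return the extra arguments from the 'search' field as a list."""
    return shlex.split(params.get("search", ""))
```

With params = {"index": "--search-type 2"}, build_createindex returns the logged createindex call with --search-type 2 appended; an empty or missing field adds nothing.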
There is one more minor issue: MMseqs2 is currently not cleaning up correctly after itself and leaves an unnecessary number of files behind for each job. I'll fix that as soon as I can.
I keep getting errors with the updated image:
mmseqs-web-worker_1 | 2019/03/11 12:41:25 MMseqs2 worker
mmseqs-web-worker_1 | Program call:
mmseqs-web-worker_1 | createdb /opt/mmseqs-web/databases/test.fasta /opt/mmseqs-web/databases/test
mmseqs-web-worker_1 |
mmseqs-web-worker_1 | MMseqs Version: efbd8d3b2f808c43c4e1629d8e74eb72cc8e92ba
mmseqs-web-worker_1 | Max. sequence length 65535
mmseqs-web-worker_1 | Split Seq. by len true
mmseqs-web-worker_1 | Database type 0
mmseqs-web-worker_1 | Do not shuffle input database true
mmseqs-web-worker_1 | Offset of numeric ids 0
mmseqs-web-worker_1 | Compressed 0
mmseqs-web-worker_1 | Verbosity 3
mmseqs-web-worker_1 |
mmseqs-web-worker_1 | Assuming DNA database, forcing parameter --dont-split-seq-by-len true
mmseqs-web-worker_1 | .......Time for merging files: 0h 0m 0s 139ms
mmseqs-web-worker_1 | Time for merging files: 0h 0m 0s 613ms
mmseqs-web-worker_1 | Time for merging files: 0h 0m 0s 10ms
mmseqs-web-worker_1 | Time for processing: 0h 0m 3s 889ms
mmseqs-web-worker_1 | Program call:
mmseqs-web-worker_1 | createindex /opt/mmseqs-web/databases/test /tmp --remove-tmp-files true --check-compatible true
mmseqs-web-worker_1 |
mmseqs-web-worker_1 | MMseqs Version: efbd8d3b2f808c43c4e1629d8e74eb72cc8e92ba
mmseqs-web-worker_1 | Seed Substitution Matrix PAM30.out
mmseqs-web-worker_1 | K-mer size 0
mmseqs-web-worker_1 | Alphabet size 21
mmseqs-web-worker_1 | Compositional bias 1
mmseqs-web-worker_1 | Max. sequence length 65535
mmseqs-web-worker_1 | Mask Residues 1
mmseqs-web-worker_1 | Spaced Kmer 1
mmseqs-web-worker_1 | Spaced k-mer pattern
mmseqs-web-worker_1 | Sensitivity 7.5
mmseqs-web-worker_1 | K-score 0
mmseqs-web-worker_1 | Check Compatible true
mmseqs-web-worker_1 | Search type 0
mmseqs-web-worker_1 | Split DB 0
mmseqs-web-worker_1 | Split Memory Limit 0
mmseqs-web-worker_1 | Threads 24
mmseqs-web-worker_1 | Verbosity 3
mmseqs-web-worker_1 | Min codons in orf 30
mmseqs-web-worker_1 | Max codons in length 98202
mmseqs-web-worker_1 | Max orf gaps 2147483647
mmseqs-web-worker_1 | Contig start mode 2
mmseqs-web-worker_1 | Contig end mode 2
mmseqs-web-worker_1 | Orf start mode 1
mmseqs-web-worker_1 | Forward Frames 1,2,3
mmseqs-web-worker_1 | Reverse Frames 1,2,3
mmseqs-web-worker_1 | Translation Table 1
mmseqs-web-worker_1 | Use all table starts false
mmseqs-web-worker_1 | Offset of numeric ids 0
mmseqs-web-worker_1 | Compressed 0
mmseqs-web-worker_1 | Add Orf Stop false
mmseqs-web-worker_1 | Overlap between sequences 0
mmseqs-web-worker_1 | Strand selection 1
mmseqs-web-worker_1 | Remove Temporary Files true
mmseqs-web-worker_1 |
mmseqs-web-worker_1 | Time for processing: 0h 0m 0s 0ms
mmseqs-web-worker_1 | Database /opt/mmseqs-web/databases/test is a nucleotide database.
mmseqs-web-worker_1 | Please provide the parameter --search-type 2 (translated) or 3 (nucleotide)
mmseqs-web-worker_1 | 2019/03/11 12:41:29 Execution Error: exit status 1
Adding "index":"--search-type 3" to the params dictionary results in:
mmseqs-web-worker_1 | Program call:
mmseqs-web-worker_1 | createdb /opt/mmseqs-web/databases/test.fasta /opt/mmseqs-web/databases/test
mmseqs-web-worker_1 |
mmseqs-web-worker_1 | MMseqs Version: efbd8d3b2f808c43c4e1629d8e74eb72cc8e92ba
mmseqs-web-worker_1 | Max. sequence length 65535
mmseqs-web-worker_1 | Split Seq. by len true
mmseqs-web-worker_1 | Database type 0
mmseqs-web-worker_1 | Do not shuffle input database true
mmseqs-web-worker_1 | Offset of numeric ids 0
mmseqs-web-worker_1 | Compressed 0
mmseqs-web-worker_1 | Verbosity 3
mmseqs-web-worker_1 |
mmseqs-web-worker_1 | Assuming DNA database, forcing parameter --dont-split-seq-by-len true
mmseqs-web-worker_1 | .......Time for merging files: 0h 0m 0s 204ms
mmseqs-web-worker_1 | Time for merging files: 0h 0m 0s 563ms
mmseqs-web-worker_1 | Time for merging files: 0h 0m 0s 10ms
mmseqs-web-worker_1 | Time for processing: 0h 0m 3s 906ms
mmseqs-web-worker_1 | Program call:
mmseqs-web-worker_1 | createindex /opt/mmseqs-web/databases/test /tmp --remove-tmp-files true --check-compatible true --search-type 3
mmseqs-web-worker_1 |
mmseqs-web-worker_1 | MMseqs Version: efbd8d3b2f808c43c4e1629d8e74eb72cc8e92ba
mmseqs-web-worker_1 | Seed Substitution Matrix PAM30.out
mmseqs-web-worker_1 | K-mer size 0
mmseqs-web-worker_1 | Alphabet size 21
mmseqs-web-worker_1 | Compositional bias 1
mmseqs-web-worker_1 | Max. sequence length 65535
mmseqs-web-worker_1 | Mask Residues 1
mmseqs-web-worker_1 | Spaced Kmer 1
mmseqs-web-worker_1 | Spaced k-mer pattern
mmseqs-web-worker_1 | Sensitivity 7.5
mmseqs-web-worker_1 | K-score 0
mmseqs-web-worker_1 | Check Compatible true
mmseqs-web-worker_1 | Search type 3
mmseqs-web-worker_1 | Split DB 0
mmseqs-web-worker_1 | Split Memory Limit 0
mmseqs-web-worker_1 | Threads 24
mmseqs-web-worker_1 | Verbosity 3
mmseqs-web-worker_1 | Min codons in orf 30
mmseqs-web-worker_1 | Max codons in length 98202
mmseqs-web-worker_1 | Max orf gaps 2147483647
mmseqs-web-worker_1 | Contig start mode 2
mmseqs-web-worker_1 | Contig end mode 2
mmseqs-web-worker_1 | Orf start mode 1
mmseqs-web-worker_1 | Forward Frames 1,2,3
mmseqs-web-worker_1 | Reverse Frames 1,2,3
mmseqs-web-worker_1 | Translation Table 1
mmseqs-web-worker_1 | Use all table starts false
mmseqs-web-worker_1 | Offset of numeric ids 0
mmseqs-web-worker_1 | Compressed 0
mmseqs-web-worker_1 | Add Orf Stop false
mmseqs-web-worker_1 | Overlap between sequences 0
mmseqs-web-worker_1 | Strand selection 1
mmseqs-web-worker_1 | Remove Temporary Files true
mmseqs-web-worker_1 |
mmseqs-web-worker_1 | Program call:
mmseqs-web-worker_1 | splitsequence /opt/mmseqs-web/databases/test /tmp/9366001766242878652/nucl_split_seq --max-seq-len 65535 --sequence-overlap 0 --threads 24 --compressed 0 -v 3
mmseqs-web-worker_1 |
mmseqs-web-worker_1 | MMseqs Version: efbd8d3b2f808c43c4e1629d8e74eb72cc8e92ba
mmseqs-web-worker_1 | Max. sequence length 65535
mmseqs-web-worker_1 | Overlap between sequences 0
mmseqs-web-worker_1 | Threads 24
mmseqs-web-worker_1 | Compressed 0
mmseqs-web-worker_1 | Verbosity 3
mmseqs-web-worker_1 |
mmseqs-web-worker_1 | .......Time for merging files: 0h 0m 0s 21ms
mmseqs-web-worker_1 | Time for merging files: 0h 0m 0s 316ms
mmseqs-web-worker_1 | Time for processing: 0h 0m 0s 648ms
mmseqs-web-worker_1 | Program call:
mmseqs-web-worker_1 | extractframes /tmp/9366001766242878652/nucl_split_seq /tmp/9366001766242878652/nucl_split_seq_rev --forward-frames 1 --threads 24 --compressed 0 -v 3
mmseqs-web-worker_1 |
mmseqs-web-worker_1 | MMseqs Version: efbd8d3b2f808c43c4e1629d8e74eb72cc8e92ba
mmseqs-web-worker_1 | Forward Frames 1
mmseqs-web-worker_1 | Reverse Frames 1,2,3
mmseqs-web-worker_1 | Threads 24
mmseqs-web-worker_1 | Compressed 0
mmseqs-web-worker_1 | Verbosity 3
mmseqs-web-worker_1 |
mmseqs-web-worker_1 | .......Time for merging files: 0h 0m 0s 22ms
mmseqs-web-worker_1 | Time for merging files: 0h 0m 0s 315ms
mmseqs-web-worker_1 | Time for processing: 0h 0m 0s 686ms
mmseqs-web-worker_1 | Program call:
mmseqs-web-worker_1 | indexdb /tmp/9366001766242878652/nucl_split_seq_rev.dbtype /opt/mmseqs-web/databases/test --seed-sub-mat PAM30.out -k 0 --alph-size 21 --comp-bias-corr 1 --max-seq-len 65535 --mask 1 --spaced-kmer-mode 1 -s 7.5 --k-score 0 --check-compatible 1 --search-type 3 --split 0 --split-memory-limit 0 --threads 24 -v 3
mmseqs-web-worker_1 |
mmseqs-web-worker_1 | MMseqs Version: efbd8d3b2f808c43c4e1629d8e74eb72cc8e92ba
mmseqs-web-worker_1 | Seed Substitution Matrix PAM30.out
mmseqs-web-worker_1 | K-mer size 0
mmseqs-web-worker_1 | Alphabet size 21
mmseqs-web-worker_1 | Compositional bias 1
mmseqs-web-worker_1 | Max. sequence length 65535
mmseqs-web-worker_1 | Mask Residues 1
mmseqs-web-worker_1 | Spaced Kmer 1
mmseqs-web-worker_1 | Spaced k-mer pattern
mmseqs-web-worker_1 | Sensitivity 7.5
mmseqs-web-worker_1 | K-score 0
mmseqs-web-worker_1 | Check Compatible true
mmseqs-web-worker_1 | Search type 3
mmseqs-web-worker_1 | Split DB 0
mmseqs-web-worker_1 | Split Memory Limit 0
mmseqs-web-worker_1 | Threads 24
mmseqs-web-worker_1 | Verbosity 3
mmseqs-web-worker_1 |
mmseqs-web-worker_1 | Could not open index file /tmp/9366001766242878652/nucl_split_seq_rev.dbtype.index!
mmseqs-web-worker_1 | Error: indexdb died
mmseqs-web-worker_1 | 2019/03/11 12:46:21 Execution Error: exit status 1
UPDATE: using "index":"--search-type 2" in params works fine (forget the previous edit, my bad)
I tried the latest soedinglab/mmseqs2 docker image and ran into the same issue when running:
docker run -v `pwd`:`pwd` -w `pwd` soedinglab/mmseqs2 mmseqs createindex databases/test /tmp --remove-tmp-files true --check-compatible true --search-type 3
Therefore this has nothing to do with the web server. Should I open a new issue on the main repo?
Yes, please do that. Sorry for the slow support. We have a deadline approaching soon :/
No problem, it's done. Thank you again! UPDATE: see soedinglab/MMseqs2#175
Hi, the issue with mmseqs has been solved, and I have tested nucleotide indexing and searching successfully. Could you please re-trigger the backend build? Version 9 should be good enough, but I believe there was another commit with bug fixes for the nucleotide search after that, so maybe latest is better (or a previous commit including at least 0e3fbac011481fd6291b92a0b48adce98fc0f007, d9a44e89721ae3348246997bf1f009671ff58a83, d1e25ae7f4e921c041022d93b69f16fc324339f9).
Thank you!
I tested the mmseqs executables from the latest mmseqs2 docker image with the web app (by copying them manually into the web app docker image using docker cp) and I can confirm that it works now!
EDIT: it works only after setting the index and search params to --search-type 3 [--strand 2] (strand is only needed if you want to search both strands, which I believe should be made the default for nucleotide search). I think setting only index would work, as search is normally set to auto and should detect the type from the index, but I haven't tested it. This is an example of #2, and it would be nice to have a full example of a working nucleotide index in addition to the params specification.
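As a small illustration of the two fields described above, the following Python sketch builds the index and search values for a nucleotide database. Only the flags --search-type 3 and --strand 2 come from this thread; the helper name is hypothetical, and the rest of the params file layout is not covered here.

```python
import json


def nucleotide_params(both_strands=False):
    """Build the 'index' and 'search' fields for a nucleotide database.

    Both fields get the same flags, as described in the comment above.
    Whether setting 'index' alone is enough is untested there, so this
    sketch sets both.
    """
    flags = "--search-type 3"
    if both_strands:
        flags += " --strand 2"
    return {"index": flags, "search": flags}


if __name__ == "__main__":
    # Print the fields as JSON so they can be pasted into a params file.
    print(json.dumps(nucleotide_params(both_strands=True), indent=2))
```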
Sorry for not rebuilding the containers sooner, once Docker Hub finishes you should have MMseqs2 r9 in them.
Also a caveat: The nucleotide-nucleotide search is still in development by @martin-steinegger. There is no manuscript or exhaustive benchmarks yet.
No problem, that's great. I will keep an eye on the nt-nt search development :) I guess this issue can be closed, if you agree.
Okay :) we’d be happy for feedback if you run into any issues with the nt-nt search.
Sure! I haven't yet, but I must say I only use it occasionally. Thanks again.
I've found this thread very useful in debugging problems with nucleotide searches. Thanks @simone-pignotti and @milot-mirdita.
I have now managed to get DNA and RNA databases up and running in mmseqs-app. However, the search doesn't work, and it looks purely like an API problem. According to the server logs, mmseqs itself seems to have run without complaint.
This is my DNA and RNA databases config:
{
  "databases": [
    {
      "name": "PDB DNA sequence (seqres)",
      "version": "2020-02-18",
      "path": "pdb_dna_sequence",
      "default": false,
      "order": 1,
      "index": "--search-type 3",
      "search": "--max-seqs 2000 --search-type 3"
    },
    {
      "name": "PDB RNA sequence (seqres)",
      "version": "2020-02-18",
      "path": "pdb_rna_sequence",
      "default": false,
      "order": 2,
      "index": "--search-type 3",
      "search": "--max-seqs 2000 --search-type 3"
    }
  ]
}
And what I get from the API ticket endpoint (e.g. api/ticket/LwDKtIlhXr4oSa7w-zTpgwxHMXikC-FInHXmvg) is:
{
  "id": "LwDKtIlhXr4oSa7w-zTpgwxHMXikC-FInHXmvg",
  "status": "COMPLETE"
}
So all looks OK there. However, from the result endpoint (e.g. api/result/LwDKtIlhXr4oSa7w-zTpgwxHMXikC-FInHXmvg/0) I get this (with a 400 HTTP return code):
record on line 3: wrong number of fields
Any ideas what can be wrong?
Please execute the following command from within the docker-compose directory and upload the output.
head jobs/LwDKtIlhXr4oSa7w-zTpgwxHMXikC-FInHXmvg/alis_*
Can you please answer in a new issue so we don't spam simone's email with notifications?
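The message "record on line 3: wrong number of fields" has the same wording as the field-count error from Go's encoding/csv reader, so one plausible cause is a result line with a different number of columns than the first line. The Python sketch below checks for that. It assumes the alis_* files are tab-separated text, which is not confirmed in this thread, and the function name is made up for illustration.

```python
def mismatched_lines(text, delimiter="\t"):
    """Return (line_number, field_count) for lines that disagree with line 1.

    The first non-empty line sets the expected number of fields; every later
    non-empty line with a different count is reported.
    """
    expected = None
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line:
            continue  # skip blank lines
        count = len(line.split(delimiter))
        if expected is None:
            expected = count
        elif count != expected:
            bad.append((number, count))
    return bad
```

Reading one of the files under jobs/<ticket id>/ and passing its content to mismatched_lines would show whether line 3 is really shorter or longer than the lines before it.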
I am using the docker version of the webserver, and I have successfully built and searched the example uniclust DB and another custom protein DB. I have also built a DNA DB, but searching it produces an error. Both the building and the searching logs are attached, and the issue seems to be a wrong path:
I believe the splitsequence command is supposed to take mydb_nt and not mydb_nt.idx as argument, therefore it should be an easy fix. I hope this is useful, let me know if you can't reproduce the issue. Simone createdb.log search.log