soedinglab / MMseqs2-App

MMseqs2 app to run on your workstation or servers
https://search.foldseek.com
GNU General Public License v3.0

jobs stuck in PENDING status with local mmseqs-web API #85

Open reyjul opened 11 months ago

reyjul commented 11 months ago

Hello,

I'm trying to make the mmseqs-web API work but I'm encountering several issues.

This is the Dockerfile I used to build the API:

FROM --platform=linux/amd64 golang:latest as builder
ARG TARGETARCH

WORKDIR /opt/build
ADD backend .
RUN GOOS=linux GOARCH=$TARGETARCH go build -o mmseqs-web

ADD https://mmseqs.com/latest/mmseqs-linux-avx2.tar.gz  .

ADD https://mmseqs.com/foldseek/foldseek-linux-avx2.tar.gz  .

ADD https://raw.githubusercontent.com/soedinglab/MMseqs2/678c82ac44f1178bf9a3d49bfab9d7eed3f17fbc/util/mmseqs_wrapper.sh binaries/mmseqs
ADD https://raw.githubusercontent.com/steineggerlab/foldseek/0a68e16214a6db745cee783128ccba8546ea5dc9/util/foldseek_wrapper.sh binaries/foldseek

RUN mkdir binaries; \
    if [ "$TARGETARCH" = "arm64" ]; then \
      for i in mmseqs foldseek; do \
        if [ -e "${i}-linux-arm64.tar.gz" ]; then \
          cat ${i}-linux-arm64.tar.gz | tar -xzvf- ${i}/bin/${i}; \
          mv ${i}/bin/${i} binaries/${i}; \
        fi; \
      done; \
    else \
      for i in mmseqs foldseek; do \
        for j in sse2 sse41 avx2; do \
          if [ -e "${i}-linux-${j}.tar.gz" ]; then \
            cat ${i}-linux-${j}.tar.gz | tar -xzvf- ${i}/bin/${i}; \
            mv ${i}/bin/${i} binaries/${i}_${j}; \
          fi; \
        done; \
      done; \
    fi;

RUN chmod -R +x binaries

FROM debian:stable-slim
LABEL maintainer="Milot Mirdita <milot@mirdita.de>"

RUN apt-get update && apt-get install -y ca-certificates wget aria2 && rm -rf /var/lib/apt/lists/*
COPY --from=builder /opt/build/mmseqs-web /opt/build/binaries/* /usr/local/bin/

ENTRYPOINT ["/usr/local/bin/mmseqs-web"]

I then installed the databases and created the indexes the usual way:

mmseqs databases UniRef50 UniRef50 tmp --remove-tmp-files
mmseqs createindex UniRef50 tmp --split 1

and added the .params files alongside the databases in the same directory (/local/banks):

{
  "name": "UniRef50",
  "path": "UniRef50",
  "version": "",
  "default": true,
  "order": 0,
  "index": "",
  "search": "",
  "status": "COMPLETE"
}

This is how I launch the API:

singularity exec --env MMSEQS_NUM_THREADS=2 --bind /local/banks:/local/banks /shared/software/singularity/images/mmseqs2-app-v7-8e1704f-rpbs.sif /usr/local/bin/mmseqs-web -local -config config.json -app mmseqs

This is the content of the config.json file:

{
    "app": "mmseqs",
    "verbose": true,
    "server" : {
        "address"    : "0.0.0.0:3000",
        "dbmanagment": false,
        "cors"       : true
    },
    "worker": {
        "gracefulexit" : true
    },
    "paths" : {
        "databases"    : "/local/banks/",
        "results"      : "/shared/home/rey/colabfold",
        "temporary"    : "/tmp",
        "colabfold"    : {
            "uniref"        : "/local/banks/UniRef50"
        },
        "mmseqs"       : "/usr/local/bin/mmseqs",
        "foldseek"     : "/usr/local/bin/foldseek"
    },
    "redis" : {
        "network"  : "tcp",
        "address"  : "mmseqs-web-redis:6379",
        "password" : "",
        "index"    : 0
    },
    "mail" : {
        "type"      : "null",
        "sender"    : "mail@example.org",
        "templates" : {
            "success" : {
                "subject" : "Done -- %s",
                "body"    : "Dear User,\nThe results of your submitted job are available now at https://search.mmseqs.com/queue/%s .\n"
            },
            "timeout" : {
                "subject" : "Timeout -- %s",
                "body"    : "Dear User,\nYour submitted job timed out. More details are available at https://search.mmseqs.com/queue/%s .\nPlease adjust the job and submit it again.\n"
            },
            "error"   : {
                "subject" : "Error -- %s",
                "body"    : "Dear User,\nYour submitted job failed. More details are available at https://search.mmseqs.com/queue/%s .\nPlease submit your job later again.\n"
            }
        }
    }
}

I get a response with curl, which seems to indicate that the API is running and listening on the correct port (3000):

curl -X GET http://10.0.1.246:3000/databases
{"databases":[{"name":"UniRef50","version":"","path":"UniRef50","default":true,"order":0,"taxonomy":false,"full_header":false,"index":"","search":"","status":"COMPLETE"},{"name":"UniRef30","version":"2103","path":"UniRef30","default":false,"order":1,"taxonomy":false,"full_header":false,"index":"","search":"","status":"COMPLETE"}]}

On a side note, I can't list databases if the status in the params file is different from COMPLETE.
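
A quick way to spot such entries before starting the API is to scan the .params files directly. This is only a sketch: it assumes the files are plain JSON with the "name" and "status" fields shown above and that they sit in one directory; the helper name databases_not_complete is made up for illustration and is not part of mmseqs-web.

```python
import json
from pathlib import Path


def databases_not_complete(directory):
    """Map database name -> status for each .params file whose status is not COMPLETE.

    Hypothetical helper: it only reads the JSON files and never contacts the API.
    """
    flagged = {}
    for path in sorted(Path(directory).glob("*.params")):
        with path.open() as handle:
            params = json.load(handle)
        if params.get("status") != "COMPLETE":
            # Fall back to the file name if the "name" field is missing.
            flagged[params.get("name", path.stem)] = params.get("status")
    return flagged
```

Calling databases_not_complete("/local/banks") should return an empty dict when every params file is marked COMPLETE.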

If I try to submit a sequence with python:

>>> from requests import get, post
>>> ticket = post('http://10.0.1.246:3000/ticket', {
...             'q' : '>FASTA\nMPKIIEAIYENGVFKPLQKVDLKEGE\n',
...             'database[]' : ["UniRef50"],
...             'mode' : 'all',
...         }).json()
>>> ticket
{'id': 'A5n_NyrysSRtH7tNN6uuYdS6LFkv2bhK3Z94IA', 'status': 'PENDING'}

The directory containing the job is created correctly. But then nothing happens; the job stays in the PENDING state forever.

Querying the job status a few hours later shows no change either:

>>> status = get('http://10.0.1.246:3000/ticket/' + ticket['id']).json()
>>> status
{'id': 'A5n_NyrysSRtH7tNN6uuYdS6LFkv2bhK3Z94IA', 'status': 'PENDING'}
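
For anyone reproducing this, a bounded polling loop makes the stuck state easier to detect than repeated manual calls. The sketch below assumes only what is shown above: the GET /ticket/<id> endpoint and a JSON body with a "status" field. The function names, the timeout and the interval are hypothetical choices, and since only the PENDING value appears in this thread, the loop simply waits for anything else.

```python
import json
import time
from urllib.request import urlopen

API = "http://10.0.1.246:3000"  # address taken from the calls above


def fetch_status(ticket_id):
    """GET /ticket/<id> and return the decoded JSON body."""
    with urlopen(f"{API}/ticket/{ticket_id}") as response:
        return json.load(response)


def wait_for_ticket(ticket_id, fetch=fetch_status, timeout=600.0, interval=5.0):
    """Poll until the status is no longer PENDING, or raise TimeoutError.

    `fetch` is injectable so the loop can be exercised without a running server.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch(ticket_id)
        if status["status"] != "PENDING":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"ticket {ticket_id} still PENDING after {timeout}s")
        time.sleep(interval)
```

With the ticket from the session above, wait_for_ticket(ticket['id']) would raise TimeoutError after ten minutes instead of leaving the job to be checked by hand.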

Any ideas or advice are welcome.

milot-mirdita commented 11 months ago

Please try adding -local.workers 1 to the command line call.

There should be a local section in the config file:

"local" : {
        "workers"  : 1
}

I guess that when this section is missing, the worker count gets initialized to 0 by default and no worker is started?
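
A config file can be checked for this before launching. The sketch below only reads the JSON; the 0 fallback mirrors the guess above and is not confirmed mmseqs-web behavior, and the helper names are hypothetical.

```python
import json


def load_config(path):
    """Read a config.json file into a dict."""
    with open(path) as handle:
        return json.load(handle)


def local_workers(config):
    """Return the configured local worker count.

    Falls back to 0 when the "local" section or its "workers" key is absent,
    which is the assumed (unverified) default discussed above.
    """
    return int(config.get("local", {}).get("workers", 0))
```

If local_workers(load_config("config.json")) returns 0, either add the "local" section shown above or pass -local.workers 1 on the command line.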

If this was not the issue, can you post the output that the singularity process generated?

reyjul commented 11 months ago

Great, that works.

I also had to modify this line in the Dockerfile to avoid "Permission denied" issues:

RUN chmod -R +rx binaries

And the .params files have to be writable by all, otherwise I get this message from the API:

Execution Error: open /local/banks/UniRef50.params: permission denied
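
To find the affected files up front, the permission bits can be inspected directly. This is a sketch that takes "writable by all" to mean the other-write bit is set; whether the server really needs that, or only write access for the user running it, is an assumption. The helper name is hypothetical.

```python
import stat
from pathlib import Path


def params_not_world_writable(directory):
    """Return the names of .params files in `directory` that lack the other-write bit."""
    missing = []
    for path in sorted(Path(directory).glob("*.params")):
        mode = path.stat().st_mode
        if not mode & stat.S_IWOTH:
            missing.append(path.name)
    return missing
```

Any file name returned by params_not_world_writable("/local/banks") is a candidate for the "permission denied" error above.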

Thanks a lot.