EBI-Metagenomics / genomes-catalogue-pipeline

MGnify genome analysis pipeline

Container image is missing the `ps` command #77

Open lam-c opened 9 months ago

lam-c commented 9 months ago

Hi, I caught an error in Process: GAP:MASH_TO_NWK.

# .command.err
Command error:
   Command 'ps' required by nextflow to collect task metrics cannot be found

# .command.run
nxf_launch() {
    set +u; env - PATH="$PATH" ${TMP:+SINGULARITYENV_TMP="$TMP"} \
${TMPDIR:+SINGULARITYENV_TMPDIR="$TMPDIR"} \
${NXF_TASK_WORKDIR:+SINGULARITYENV_NXF_TASK_WORKDIR="$NXF_TASK_WORKDIR"} \
SINGULARITYENV_NXF_DEBUG="${NXF_DEBUG:=0}" \
singularity exec --no-home --pid -B /home/user /home/user/singularity-images/quay.io-microbiome-informatics-genomes-pipeline.mash2nwk-v1.img /bin/bash -c "cd $PWD; eval $(nxf_container_env); /bin/bash /home/user/temp/work/1b/7f283388b4cb40d93918bcf18e0b97/.command.run nxf_trace"
}

I think the genomes-pipeline.mash2nwk container lacks ps, a tool Nextflow needs to collect task metrics. A possible solution is to install ps in the mash2nwk Dockerfile (as below). Since Docker is not installed on the system I work on, I had some trouble converting the Dockerfile into a Singularity image. I would be very grateful if you could publish an updated container on quay.io.

# containers/mash2nwk/Dockerfile
FROM r-base:4.1.0

LABEL software="mash2nwk"
LABEL software.version="1.0.0"
LABEL description="Generate Mash distance tree of conspecific genomes"
LABEL website="https://github.com/EBI-Metagenomics/genomes-pipeline"
LABEL license="GPLv3"

# Line added, to make sure ps is available
RUN apt-get update && apt-get install -y procps g++ && apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

RUN install2.r \
        reshape2 \
        fastcluster \
        optparse \
        data.table \
        ape

RUN mkdir /tools
COPY mash2nwk1.R /tools
RUN chmod a+x /tools/*
ENV PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/tools

# Workdir
RUN mkdir /data
WORKDIR /data

# Entrypoint
CMD ["Rscript", "/tools/mash2nwk1.R"]
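
Before rebuilding, it can help to confirm which tools an image is actually missing. The following is a minimal sketch, not part of the pipeline: `missing_tools` is a hypothetical helper name, and only `ps` is confirmed as required by the error message above. Any other tool names passed to it are assumptions.

```shell
# Print each named command that is NOT found on PATH, one per line.
# Prints nothing when every command is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# To run the same check inside an image (IMAGE.img is a placeholder):
#   singularity exec IMAGE.img sh -c 'command -v ps || echo "ps is missing"'
```

Running the check inside the container rather than on the host matters here, because the host usually has procps installed even when the image does not.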

mberacochea commented 9 months ago

@lam-c thanks for letting us know. We will fix the container and publish a new one, we will also review all the containers as this could also affect other ones.

lam-c commented 9 months ago

Thank you for looking into this. In the meantime, I built the Singularity image locally, and it worked.

# mash2nwk.def 
bootstrap: docker
from: quay.io/microbiome-informatics/genomes-pipeline.mash2nwk:v1

%post
    apt-get update
    apt-get install -y procps
    rm -rf /var/lib/apt/lists/*

# command to build the image
$ singularity build --fakeroot quay.io-microbiome-informatics-genomes-pipeline.mash2nwk-v1.img mash2nwk.def
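
The output file name in the build command matters, because the locally built image has to replace the one the pipeline already looks for. Judging only from the image path in `.command.run` above, the cached file name appears to be the container URI with `/` and `:` replaced by `-`, plus an `.img` suffix. The sketch below encodes that inferred rule; `image_cache_name` is a hypothetical helper, and the rule itself is an assumption drawn from this thread, not from the Nextflow documentation.

```shell
# Derive the cached image file name from a container URI.
# Inferred rule (assumption): drop an optional docker:// prefix,
# replace every "/" and ":" with "-", then append ".img".
image_cache_name() {
    uri=${1#docker://}
    printf '%s.img\n' "$(printf '%s' "$uri" | sed 's|[/:]|-|g')"
}

# Example:
#   image_cache_name quay.io/microbiome-informatics/genomes-pipeline.mash2nwk:v1
```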

But soon after this, I hit another error, raised in the GAP:AMRFINDER_PLUS process (as below). It seems that the database downloaded from NCBI (ftp://ftp.ncbi.nlm.nih.gov/pathogen/Antimicrobial_resistance/AMRFinderPlus/database/3.11/2023-02-23.1) lacks some AMRProt files (e.g. AMRProt.pdb, ...). I updated the database as amrfinder suggested, and then it ran as expected.

Running: amrfinder --plus -n MGYG000299305.fna -p MGYG000299305.faa -g MGYG000299305.gff -d /home/user/databases/AMRFinderPlus_2023-02-23.1 -a prokka --output MGYG000299305_amrfinderplus.tsv --threads 1
  Software directory: '/usr/local/bin/'
  Software version: 3.11.4
  WARNING: This was compiled for running under bioconda, but $CONDA_PREFIX was not found
  Reverting to hard coded directory: /opt/conda/conda-bld/ncbi-amrfinderplus_1678396575214/_build_env/share/amrfinderplus/data/latest
  Database directory: '/home/user/databases/AMRFinderPlus_2023-02-23.1'

  *** ERROR ***
  The BLAST database for AMRProt was not found. Use amrfinder -u to download and prepare database for AMRFinderPlus

  HOSTNAME: ?
  SHELL: ?
  PWD: /home/user/temp/work/27/2ab8a76f7a43c8690d4a4ba0ecf058
  PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/user/.nextflow/assets/EBI-Metagenomics/genomes-pipeline/bin
  Progam name:  amrfinder
  Command line: amrfinder --plus -n MGYG000299305.fna -p MGYG000299305.faa -g MGYG000299305.gff -d /home/user/databases/AMRFinderPlus_2023-02-23.1 -a prokka --output MGYG000299305_amrfinderplus.tsv --threads 1
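
A quick pre-flight check on the database directory can catch this before the pipeline starts. This is a rough sketch under stated assumptions: `check_amrprot_db` is a hypothetical helper, and the only file name taken from this thread is `AMRProt.pdb`. Any other extension passed to it is a guess about what the BLAST database contains.

```shell
# Report which AMRProt.<ext> files are absent from a database directory.
# Usage: check_amrprot_db DB_DIR [ext ...]
# With no extensions given, only "pdb" is checked (the one named in the report).
check_amrprot_db() {
    db_dir=$1
    shift
    [ "$#" -gt 0 ] || set -- pdb
    rc=0
    for ext in "$@"; do
        if [ ! -e "$db_dir/AMRProt.$ext" ]; then
            printf 'missing: %s/AMRProt.%s\n' "$db_dir" "$ext"
            rc=1
        fi
    done
    return "$rc"
}

# Example:
#   check_amrprot_db /home/user/databases/AMRFinderPlus_2023-02-23.1
```

If files are reported missing, the error message itself points to `amrfinder -u` as the way to download and prepare the database.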

KateSakharova commented 9 months ago

Hi @lam-c, the container was modified in PR. I pushed it as version 1.1 (quay.io/microbiome-informatics/genomes-pipeline.mash2nwk:v1.1). Please modify your pipeline installation here. Let me know if you have other problems! Kate

amardeepranu commented 9 months ago

@KateSakharova I ran into the same error. After updating to v1.1, I get a different error, an architecture mismatch:

ERROR ~ Error executing process > 'GAP:MASH_TO_NWK (1)'

Caused by:
  Process `GAP:MASH_TO_NWK (1)` terminated with an error exit status (1)

Command executed:

  mash2nwk1.R -m MGYG000000018_mash.tsv

  mv trees/mashtree.nwk MGYG000000018_mash.nwk

Command exit status:
  1

Command output:
  (empty)

Command error:
  WARNING: The requested image's platform (linux/arm64) does not match the detected host platform (linux/amd64/v4) and no specific platform was requested
  exec /bin/bash: exec format error
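
The warning above compares the image platform with the host platform, and the same comparison can be made by hand before running the pipeline. A minimal sketch follows. `to_docker_arch` and `arch_matches` are hypothetical helper names, and the mapping covers only the two architectures that appear in this thread.

```shell
# Map `uname -m` output to the architecture names Docker reports.
to_docker_arch() {
    case "$1" in
        x86_64|amd64)  echo amd64 ;;
        aarch64|arm64) echo arm64 ;;
        *)             echo "$1" ;;   # unknown values pass through unchanged
    esac
}

# Succeed only if the image architecture equals the host architecture.
# Usage: arch_matches IMAGE_ARCH [HOST_MACHINE]   (HOST_MACHINE defaults to `uname -m`)
arch_matches() {
    image_arch=$1
    host_arch=$(to_docker_arch "${2:-$(uname -m)}")
    [ "$image_arch" = "$host_arch" ]
}

# Example (IMAGE is a placeholder for a locally pulled image):
#   arch_matches "$(docker image inspect --format '{{.Architecture}}' IMAGE)"
```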

amardeepranu commented 9 months ago

I also had the same issues with amrfinder as @lam-c.

amardeepranu commented 9 months ago

@lam-c what was the amrfinder_plus_db path you used? I still can't get it running, even after downloading AMRFinderPlus and updating the db.

lam-c commented 9 months ago

@amardeepranu I updated the database with amrfinder -u, and then passed a custom config to the pipeline to bind the amrfinder_plus_db path into the container.

# custom config
process {
    withName: AMRFINDER_PLUS {
        containerOptions = "--bind ${params.amrfinder_plus_db}"
    }
}

# command to run the pipeline
$ nextflow run EBI-Metagenomics/genomes-pipeline -c custom.config ...
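
Note that `--bind` is the Singularity spelling of a mount option, while others in this thread run the pipeline with Docker. As a rough sketch, the helper below prints the option string for either engine. `bind_option` is a hypothetical name, and whether Docker users need the extra mount at all is an assumption, not something confirmed in this thread.

```shell
# Print the containerOptions value that mounts DIR at the same path inside
# the container. Usage: bind_option ENGINE DIR
bind_option() {
    engine=$1
    dir=$2
    case "$engine" in
        singularity|apptainer) printf '%s %s\n' '--bind' "$dir" ;;
        docker)                printf '%s %s:%s\n' '-v' "$dir" "$dir" ;;
        *) echo "unknown engine: $engine" >&2; return 1 ;;
    esac
}

# Example:
#   bind_option docker /home/user/databases/AMRFinderPlus_2023-02-23.1
```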

lam-c commented 9 months ago

> Hi @lam-c, container was modified in PR. I pushed it with version 1.1 (quay.io/microbiome-informatics/genomes-pipeline.mash2nwk:v1.1). Please, modify your pipeline installation here. Let me know if you have other problems! Kate

Many thanks ~

akques commented 9 months ago

@KateSakharova I ran into the same error as @amardeepranu after updating mash2nwk to v1.1, using the Docker container:

exec format error

mberacochea commented 9 months ago

Hi @amardeepranu and @akques

The problem was that the container was built on a Mac with an M2 chip (a different architecture). I just replaced the image on quay with a version built for "linux/amd64", so it should work now. Please remove the image and pull it again.

I will run the whole pipeline with Nextflow's "-with-report" flag to check that all the containers work with it. I'll post an update in the next few days.

Cheers
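
For Docker users, the "remove the image and pull it again" step can be sketched as below. `refresh_image` is a hypothetical helper, and the `DRY_RUN` switch is an addition for illustration that prints the commands instead of running them. Singularity users would presumably need to delete the cached `.img` file instead, which is an assumption based on the image path shown earlier in the thread.

```shell
# Remove a locally cached Docker image and pull it again for linux/amd64.
# Set DRY_RUN=1 to print the commands instead of executing them.
refresh_image() {
    image=$1
    run=${DRY_RUN:+echo}
    # Removal may fail if the image was never pulled; that is not an error here.
    $run docker rmi "$image" || true
    $run docker pull --platform linux/amd64 "$image"
}

# Example:
#   DRY_RUN=1 refresh_image quay.io/microbiome-informatics/genomes-pipeline.mash2nwk:v1.1
```

Requesting the platform explicitly on pull makes a mismatch fail early at pull time rather than as an `exec format error` inside a pipeline task.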