oschwengers / bakta

Rapid & standardized annotation of bacterial genomes, MAGs & plasmids
GNU General Public License v3.0

[ERROR: Pyrodigal could not be executed! Please make sure Pyrodigal is installed and executable or skip requiring workflow steps via via '--skip-cds'.](https://confluence.ncbi.nlm.nih.gov/pages/viewpage.action?pageId=354897706) #239

Closed azat-badretdin closed 1 year ago

azat-badretdin commented 1 year ago

Describe the bug: pyrodigal does not work:

[ERROR: Pyrodigal could not be executed! Please make sure Pyrodigal is installed and executable or skip requiring workflow steps via via '--skip-cds'.](https://confluence.ncbi.nlm.nih.gov/pages/viewpage.action?pageId=354897706)


azat-badretdin commented 1 year ago

More info:

I see that pyrodigal is installed as a tiny Python wrapper around the pyrodigal.cli module under /bin, and the command

pyrodigal --help

works just fine

oschwengers commented 1 year ago

Hi @azat-badretdin, unfortunately I cannot reproduce this error:

$ mamba create -p ./conda-test-env bakta=1.8.2
$ mamba activate conda-test-env/
$ bakta --version
bakta 1.8.2
$ bakta --debug --db databases/bakta/db-v5.0-light/ --output test-bakta-pyrodigal/ GCF_000008865.2.fna.gz
Bakta v1.8.2
Options and arguments:
    input: ..../GCF_000008865.2.fna.gz
    db: .../databases/bakta/db-v5.0-light, version 5.0, light
    output: .../test-bakta-pyrodigal

Could you provide the exact commands (including the ones used to create the conda env)?

azat-badretdin commented 1 year ago

I am not using mamba; we do not have it installed organization-wide yet.

conda create -p ./conda-test-env -c conda-forge -c bioconda bakta=1.8.2
conda activate ./conda-test-env
+ export PATH=/usr/bin:/netmnt/vast01/gp/ThirdParty/tRNAscan-SE/2.0.12/bin/:/netmnt/vast01/gp/home/badrazat/jira/2023/July/PGAP-7485-Try-Bakta/conda-test-env/bin:/home/badrazat/perl5/bin:/opt/fcron/bin:/usr/local/shellcheck/0.7.2/bin:/usr/local/prokka/1.14.5/bin:/usr/local/anaconda/3-2021.05/condabin:/usr/local/anaconda/3-2021.05/bin:/usr/local/diamond/2.0.15/bin:/usr/local/rg/13.0.0/bin:/usr/local/visidata/2.6/bin:/usr/local/prodigal/2.6.3/bin:/usr/local/pplacer/1.1a18/bin:/usr/local/singularity/2.4.5/bin:/usr/local/maven/3.3.9/bin:/opt/python-all/bin:/usr/local/llvm/13.0.0/bin:/usr/local/perl/5.16.3/bin:/opt/perl/5.16.3/bin:/usr/local/uclust/1.2.22/bin:/usr/local/repeatmasker/4.0.7/bin:/usr/local/trf/409/bin:/usr/local/doxygen/1.9.0/bin:/usr/local/xmldiff/2.3/bin:/usr/local/R/4.0.1/bin:/usr/local/ccache/4.4/bin:/usr/local/netcdf/4.3.3/bin:/usr/local/subversion/1.10.6/bin:/usr/local/svnmucc/1.5.7/bin:/usr/local/ninja/1.10.2/bin:/usr/local/nedit/5.5/bin:/netopt/ncbi_tools64/bin:/am/ncbiapdata/bin:/usr/local/joe/3.7/bin:/usr/local/git/2.38.3/bin:/usr/local/ddd/3.3.12/bin:/usr/local/ctags/5.8/bin:/opt/ncbi/gcc/7.3.0/bin:/usr/local/kcachegrind/0.7.4/bin:/usr/local/infernal/1.1.1/bin:/usr/local/gnuplot/5.0.1/bin:/usr/local/fasttree/2.1.4/bin:/usr/local/samtools/1.14/bin:/usr/local/xemacs/21.4.22/bin:/usr/local/grace/5.1.22/bin:/usr/local/gsl/1.12/bin:/usr/local/ViennaRNA/2.6.3/bin:/usr/local/unafold/3.9a/bin:/usr/local/mfold_util/4.6/bin:/netopt/genbank/subtool/bin:/usr/local/vim/8.2-static-python39/bin:/opt/panfs/bin:/netmnt/gridengine/current/bin/lx-amd64:/usr/local/muscle/3.8.31/bin:/net/snowman/vol/projects/trace_software/vdb/linux/release/x86_64/bin:/usr/local/hmmer/3.1b2/bin:/usr/local/toolworks/totalview.2019.1.4/bin:/usr/local/mafft/7.475/bin:/usr/local/clustalx/1.83/bin:/opt/sybase/clients/current/bin:/opt/sybase/utils/bin:/usr/local/emacs/27.1/bin:/usr/local/valgrind/3.20/bin:/usr/local/cmake/3.21.2/bin:/usr/local/graphviz/2.40.1/bin:/usr/local/gdb/10.2/bin:/usr/local/atom/1.4.1/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/sybase/clients/current/bin:/opt/sybase/utils/bin:/opt/puppetlabs/bin:/opt/dell/srvadmin/bin:/home/badrazat/bin
+ PATH=/usr/bin:/netmnt/vast01/gp/ThirdParty/tRNAscan-SE/2.0.12/bin/:/netmnt/vast01/gp/home/badrazat/jira/2023/July/PGAP-7485-Try-Bakta/conda-test-env/bin:/home/badrazat/perl5/bin:/opt/fcron/bin:/usr/local/shellcheck/0.7.2/bin:/usr/local/prokka/1.14.5/bin:/usr/local/anaconda/3-2021.05/condabin:/usr/local/anaconda/3-2021.05/bin:/usr/local/diamond/2.0.15/bin:/usr/local/rg/13.0.0/bin:/usr/local/visidata/2.6/bin:/usr/local/prodigal/2.6.3/bin:/usr/local/pplacer/1.1a18/bin:/usr/local/singularity/2.4.5/bin:/usr/local/maven/3.3.9/bin:/opt/python-all/bin:/usr/local/llvm/13.0.0/bin:/usr/local/perl/5.16.3/bin:/opt/perl/5.16.3/bin:/usr/local/uclust/1.2.22/bin:/usr/local/repeatmasker/4.0.7/bin:/usr/local/trf/409/bin:/usr/local/doxygen/1.9.0/bin:/usr/local/xmldiff/2.3/bin:/usr/local/R/4.0.1/bin:/usr/local/ccache/4.4/bin:/usr/local/netcdf/4.3.3/bin:/usr/local/subversion/1.10.6/bin:/usr/local/svnmucc/1.5.7/bin:/usr/local/ninja/1.10.2/bin:/usr/local/nedit/5.5/bin:/netopt/ncbi_tools64/bin:/am/ncbiapdata/bin:/usr/local/joe/3.7/bin:/usr/local/git/2.38.3/bin:/usr/local/ddd/3.3.12/bin:/usr/local/ctags/5.8/bin:/opt/ncbi/gcc/7.3.0/bin:/usr/local/kcachegrind/0.7.4/bin:/usr/local/infernal/1.1.1/bin:/usr/local/gnuplot/5.0.1/bin:/usr/local/fasttree/2.1.4/bin:/usr/local/samtools/1.14/bin:/usr/local/xemacs/21.4.22/bin:/usr/local/grace/5.1.22/bin:/usr/local/gsl/1.12/bin:/usr/local/ViennaRNA/2.6.3/bin:/usr/local/unafold/3.9a/bin:/usr/local/mfold_util/4.6/bin:/netopt/genbank/subtool/bin:/usr/local/vim/8.2-static-python39/bin:/opt/panfs/bin:/netmnt/gridengine/current/bin/lx-amd64:/usr/local/muscle/3.8.31/bin:/net/snowman/vol/projects/trace_software/vdb/linux/release/x86_64/bin:/usr/local/hmmer/3.1b2/bin:/usr/local/toolworks/totalview.2019.1.4/bin:/usr/local/mafft/7.475/bin:/usr/local/clustalx/1.83/bin:/opt/sybase/clients/current/bin:/opt/sybase/utils/bin:/usr/local/emacs/27.1/bin:/usr/local/valgrind/3.20/bin:/usr/local/cmake/3.21.2/bin:/usr/local/graphviz/2.40.1/bin:/usr/local/gdb/10.2/bin:/usr/local/atom/1.4.1/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/sybase/clients/current/bin:/opt/sybase/utils/bin:/opt/puppetlabs/bin:/opt/dell/srvadmin/bin:/home/badrazat/bin

+ pyrodigal --help
usage: pyrodigal [-a trans_file] [-c] [-d nuc_file] [-f output_type] [-g tr_table] -i input_file [-m] [-n] [-o output_file] [-p mode] [-s start_file]
                 [-t training_file] [-j jobs] [-h] [-V] [--min-gene MIN_GENE] [--min-edge-gene MIN_EDGE_GENE] [--max-overlap MAX_OVERLAP]

options:
  -a trans_file         Write protein translations to the selected file.
  -c                    Closed ends. Do not allow genes to run off edges.
  -d nuc_file           Write nucleotide sequences of genes to the selected file.
  -f output_type        Select output format.
  -g tr_table           Specify a translation table to use.
  -i input_file         Specify FASTA input file.
  -m                    Treat runs of N as masked sequence; don't build genes across them.
  -n                    Bypass Shine-Dalgarno trainer and force a full motif scan.
  -o output_file        Specify output file.
  -p mode               Select procedure.
  -s start_file         Write all potential genes (with scores) to the selected file.
  -t training_file      Write a training file (if none exists); otherwise, read and use the specified training file.
  -j jobs, --jobs jobs  The number of threads to use if input contains multiple sequences.
  -h, --help            Show this help message and exit.
  -V, --version         Show version number and exit.
  --min-gene MIN_GENE   The minimum gene length.
  --min-edge-gene MIN_EDGE_GENE
                        The minimum edge gene length.
  --max-overlap MAX_OVERLAP
                        The maximum number of nucleotides that can overlap between two genes on the same strand. This must be lower or equal to the minimum
                        gene length.

+ bakta --db /netmnt/vast01/gp/home/badrazat/jira/2023/July/PGAP-7485-Try-Bakta/db-light --verbose --debug /tmp/test-assm.sh.4eFzA6UADOK/240958.fasta
ERROR: Pyrodigal could not be executed! Please make sure Pyrodigal is installed and executable or skip requiring workflow steps via via '--skip-cds'.

Note: I had to prepend /usr/bin:/netmnt/vast01/gp/ThirdParty/tRNAscan-SE/2.0.12/bin/: to PATH so that I could use my own Perl and my own tRNAscan-SE installation.

azat-badretdin commented 1 year ago

If I do not hack PATH, I get:

$ tRNAscan-SE --help
perl: symbol lookup error: /home/badrazat/perl5/lib/perl5/x86_64-linux-thread-multi/auto/List/Util/Util.so: undefined symbol: Perl_xs_version_bootcheck

It looks like some of my Perl settings override the Perl setup inside the Bakta installation?
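
One way to check this suspicion: the failing Util.so sits under /home/badrazat/perl5/lib/perl5, which is where a per-user Perl library setup usually points Perl through environment variables such as PERL5LIB. The sketch below prints whichever of the common Perl override variables are set in the current shell. The function name `perl_env_report` is made up for illustration, and this is only a diagnostic under that assumption, not a confirmed cause.

```shell
# Print every commonly used Perl override variable that is set and non-empty.
# Hypothetical helper for illustration; not part of Bakta, Conda, or Perl.
perl_env_report() {
    for name in PERL5LIB PERL_LOCAL_LIB_ROOT PERL5OPT PERL_MB_OPT PERL_MM_OPT; do
        # Indirect lookup: copy the value of the variable called $name into $value.
        eval "value=\${$name-}"
        if [ -n "$value" ]; then
            printf '%s=%s\n' "$name" "$value"
        fi
    done
}

perl_env_report
```

If PERL5LIB shows up in the output, rerunning the command in a subshell with it unset, for example `(unset PERL5LIB; tRNAscan-SE --help)`, would show whether the symbol lookup error goes away.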

azat-badretdin commented 1 year ago

I think the root problem here is the Perl problem, not this one?

Would you like me to resolve this and open a new issue about Perl?

oschwengers commented 1 year ago

I'm not sure what exactly the root cause of these errors is. But I have the feeling that there are some basic issues within your environment and/or conda setup that are not related to Bakta. Also, combining a Conda environment with a custom PATH variable is certainly not a good idea, because this can cause severe side effects. And since tRNAscan-SE is a requirement of Bakta, Conda should install it automatically anyway.

Have you tried the Docker/Podman containers? Maybe this is a better option in your case?
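
To see whether a customized PATH is shadowing tools that Conda installed, it can help to list every copy of a command along PATH in lookup order. A minimal sketch, assuming a POSIX shell; `list_matches` is a hypothetical helper name, not part of Bakta or Conda:

```shell
# List every executable named $1 along a colon-separated search path
# ($2, defaulting to $PATH), in lookup order. The first line printed is
# the copy the shell would run. The body runs in a subshell so the
# IFS and globbing changes do not leak into the calling shell.
list_matches() (
    cmd=$1
    IFS=:
    set -f   # do not glob-expand PATH entries while splitting
    for dir in ${2:-$PATH}; do
        # An empty PATH entry means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$cmd" ] && [ -x "$dir/$cmd" ]; then
            printf '%s\n' "$dir/$cmd"
        fi
    done
)
```

For example, `list_matches perl` or `list_matches tRNAscan-SE`: if the first line printed is not under the environment's bin directory, an earlier PATH entry is taking precedence over the copy Conda installed.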

azat-badretdin commented 1 year ago

> But I have the feeling that there are some basic issues within your environment and/or conda setup that are not related to Bakta

Agreed.

> Have you tried the Docker/Podman containers? Maybe this is a better option in your case?

That is one of the options, yes. My plan is to go all in on this problem by using our relatively fresh AWS pets, which are less cluttered with user settings (this would be like the Docker/Podman solution on steroids).

csernab commented 1 year ago

I get the same error when upgrading bakta to version 1.8.2.

ERROR: Pyrodigal could not be executed! Please make sure Pyrodigal is installed and executable or skip requiring workflow steps via via '--skip-cds'.

oschwengers commented 1 year ago

Hi @csernab, if you've installed Bakta via Conda, could you please just update the Bakta installation within your Conda environment? We've recently fixed the Pyrodigal dependency specification for the Bakta conda package.
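
After updating, one way to confirm what the environment actually contains is to ask each tool for its version. A small sketch: `report_version` is a hypothetical helper, and it relies only on the `--version` flags that appear earlier in this thread for bakta and pyrodigal. Which version combinations work together is not stated here, so the output is meant for comparison, not as a pass/fail check.

```shell
# Run "<tool> --version" and print the first line of its output.
# Returns non-zero if the tool is missing or exits with an error.
# Hypothetical helper for illustration only.
report_version() {
    if out=$("$1" --version 2>&1); then
        printf '%s: %s\n' "$1" "$(printf '%s\n' "$out" | head -n 1)"
    else
        printf '%s: could not be executed\n' "$1"
        return 1
    fi
}
```

Inside the activated environment, `report_version bakta; report_version pyrodigal` gives two lines that are easy to paste into a bug report.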

azat-badretdin commented 1 year ago

@csernab also: for me, the best idea was to work around this by using the Bakta Docker container on a clean cloud host.

When the user environment is complicated, as in my case, Conda can stumble on things.

oschwengers commented 1 year ago

@azat-badretdin @csernab I guess this error was caused by broken or out-of-date Conda environments and not directly by Bakta itself. Hence, if this is no longer an active problem, I'll close this for now. Please do not hesitate to re-open it if needed.