oschwengers / bakta

Rapid & standardized annotation of bacterial genomes, MAGs & plasmids
GNU General Public License v3.0

Downgrading dependencies is needed when installing bakta from conda #283

Closed LeonCharlesTranchevent closed 1 month ago

LeonCharlesTranchevent commented 3 months ago

Issue

I installed bakta v1.8.2 via conda. The installation succeeded, but bakta does not run: it fails with the error message shown below. I compared that environment with an older environment (same bakta version) that runs fine and found that the failure might be related to some dependencies.
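The comparison between a working and a failing environment can be sketched in Python. This is a minimal illustration, not the procedure actually used in this report: it assumes both environments were dumped with `conda list --export` (one `name=version=build` entry per line), and the build strings in the sample data are hypothetical placeholders. Only the diamond and ncbi-amrfinderplus versions come from this thread.

```python
def parse_conda_export(text):
    """Parse `conda list --export` output into a {package: version} dict."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        parts = line.split("=")
        if len(parts) >= 2:
            packages[parts[0]] = parts[1]
    return packages


def diff_environments(working, broken):
    """Return {package: (working_version, broken_version)} for every mismatch."""
    diffs = {}
    for name in sorted(set(working) | set(broken)):
        old, new = working.get(name), broken.get(name)
        if old != new:
            diffs[name] = (old, new)
    return diffs


# Sample data: versions taken from this thread, build strings are hypothetical.
WORKING = """# platform: linux-64
bakta=1.8.2=build0
diamond=2.1.8=build0
ncbi-amrfinderplus=3.11.26=build0
"""
BROKEN = """# platform: linux-64
bakta=1.8.2=build0
diamond=2.1.9=build0
ncbi-amrfinderplus=3.12.8=build0
"""

for name, (old, new) in diff_environments(
    parse_conda_export(WORKING), parse_conda_export(BROKEN)
).items():
    print(f"{name}: working={old} broken={new}")
```

Packages present in only one environment show up with `None` on the other side, which helps spot dependencies that were added or dropped between installs.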

Command, log and error message

$ bakta --db /nfs/data/db/baktadb_v5/db ~/test.fna --meta --keep-contig-headers --skip-trna --skip-tmrna --skip-rrna --skip-ncrna --skip-ncrna-region --skip-crispr --skip-gap --skip-ori --threads 2 --output ~/ --prefix test_bakta --force 

parse genome sequences... 
        imported: 1 
        filtered & revised: 1 
        chromosomes: 1 

start annotation... 
skip tRNA prediction... 
skip tmRNA prediction... 
skip rRNA prediction... 
skip ncRNA prediction... 
skip ncRNA region prediction... 
skip CRISPR array prediction... 
predict & annotate CDSs... 
        predicted: 4365 
        discarded spurious: 8 
        revised translational exceptions: 0 
        detected IPSs: 4264 
Traceback (most recent call last): 
  File "/nfs/conda/envs/testtoremove/bin/bakta", line 10, in <module> 
    sys.exit(main()) 
  File "/nfs/conda/envs/testtoremove/lib/python3.10/site-packages/bakta/main.py", line 253, in main 
    cdss_psc, cdss_pscc, cdss_not_found = psc.search(cdss_not_found) 
  File "/nfs/conda/envs/testtoremove/lib/python3.10/site-packages/bakta/psc.py", line 64, in search 
    raise Exception(f'diamond error! error code: {proc.returncode}') 
Exception: diamond error! error code: -11 
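
A note on reading the error code: with Python's `subprocess` module on POSIX systems, a negative `returncode` of `-N` means the child process was terminated by signal `N`. The value `-11` therefore suggests that diamond crashed with a segmentation fault rather than exiting with an error of its own. A minimal sketch of that decoding follows; `describe_returncode` is a hypothetical helper and is not part of bakta.

```python
import signal


def describe_returncode(returncode):
    """Turn a subprocess return code into a short human-readable description."""
    if returncode < 0:
        # Negative values mean "terminated by signal -returncode" on POSIX.
        signum = -returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = f"unknown signal {signum}"
        return f"killed by {name}"
    if returncode == 0:
        return "exited successfully"
    return f"exited with status {returncode}"


print(describe_returncode(-11))
```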

Solution

After some trial and error, I downgraded diamond from v2.1.9 to v2.1.8 and ncbi-amrfinderplus from v3.12.8 to v3.11.26, after which bakta v1.8.2 ran successfully. I did not test the ncbi-amrfinderplus versions in between.

Note

The error was reproduced with different input files, so the file content does not appear to matter much.

The problem also seems to extend to the latest version (v1.9.3 as of last week), although I did not fully test that version and therefore did not identify which downgrades are required.

mmcguffi commented 3 months ago

I also ran into this with Bakta 1.6.1 via conda.

Building on what @LeonCharlesTranchevent said (thanks for documenting!), I also had to pin biopython. Here is a pinned YAML environment file that worked for me:

name: annotation
channels:
  - conda-forge
  - bioconda
  - defaults
dependencies:
  - bakta=1.6.1
  - diamond=2.1.8
  - ncbi-amrfinderplus=3.11.26
  - biopython=1.81

I also had to update AMRFinderPlus's internal database; here is the command for convenience:

amrfinder_update \
    --force_update \
    --database ${database_loc}/db/amrfinderplus-db

And of course, thank you @oschwengers for creating such a great tool!

oschwengers commented 3 months ago

Hi and thanks for reporting both issues and potential solutions!

There is an issue in Diamond v2.1.9; downgrading to v2.1.8 should do the trick until there is an upstream patch for Diamond.
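
Until a patched release is available, a pipeline could guard against the affected version before starting a run. The sketch below only parses a version string. It assumes the string comes from the output of `diamond version` and looks like `diamond version 2.1.9`, so adjust the parsing if your build prints something else. `KNOWN_BAD` lists only the version reported in this thread, and both helper names are hypothetical.

```python
import re

# Only v2.1.9 is reported as affected in this thread; extend as needed.
KNOWN_BAD = {(2, 1, 9)}


def parse_version(text):
    """Extract the first MAJOR.MINOR.PATCH triple found in the text."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(version_output):
    """Return True if the reported Diamond version is in KNOWN_BAD."""
    return parse_version(version_output) in KNOWN_BAD


for sample in ("diamond version 2.1.8", "diamond version 2.1.9"):
    status = "affected" if is_affected(sample) else "ok"
    print(f"{sample}: {status}")
```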

Regarding AMRFinderPlus: its database schema was recently updated. The best you can do is to update both AMRFinderPlus and its database, as @mmcguffi suggested.

@mmcguffi Is there a specific issue with BioPython? If so, could you please open a dedicated issue for that? I wasn't aware of that.

Again, thanks for reporting and documenting!