BioContainers / containers

Bioinformatics containers
http://biocontainers.pro
Apache License 2.0

prokka error #170

Closed brooksph closed 6 years ago

brooksph commented 7 years ago

Hi! This may be a prokka error, but I can't get the container to run on Ubuntu 16.04. I'm getting the following error:

[05:00:26] Looking for 'blastp' - found /usr/local/bin/blastp
blastp: error while loading shared libraries: libbz2.so.1: cannot open shared object file: No such file or directory
[05:00:26] Could not determine version of blastp - please install version 2.2 or higher

prvst commented 7 years ago

Hey @brooksph, can you tell me the name of the container you are using? (The command you used to download the container will do too.) We currently work with 2 different registries, so that can help us pinpoint your problem.

brooksph commented 7 years ago

Thanks for the quick response. I'm using the prokka container, and I downloaded it with the following command:

docker pull quay.io/biocontainers/prokka:1.12--1

This is the full command and error response.

docker run -v /home/ubuntu/data:/data -it quay.io/biocontainers/prokka:1.12--1 prokka /data/podar_metaG_sub_10/final.contigs.fa --outdir /data/prokka_annotation --prefix podar_metaG_sub_10 
[00:03:36] This is prokka 1.12
[00:03:36] Written by Torsten Seemann <torsten.seemann@gmail.com>
[00:03:36] Homepage is https://github.com/tseemann/prokka
[00:03:36] Local time is Mon Aug  7 00:03:36 2017
[00:03:36] You are not telling me who you are!
[00:03:36] Operating system is linux
[00:03:36] You have BioPerl 1.006924
[00:03:36] System has 2 cores.
[00:03:36] Option --cpu asked for 8 cores, but system only has 2
[00:03:36] Will use maximum of 2 cores.
[00:03:36] Annotating as >>> Bacteria <<<
[00:03:36] Generating locus_tag from '/data/podar_metaG_sub_10/final.contigs.fa' contents.
[00:03:37] Setting --locustag APGGHMMN from MD5 a9001667e0f9471de40bd1b64666ec71
[00:03:37] Re-using existing --outdir /data/prokka_annotation
[00:03:37] Using filename prefix: podar_metaG_sub_10.XXX
[00:03:37] Setting HMMER_NCPU=1
[00:03:37] Writing log to: /data/prokka_annotation/podar_metaG_sub_10.log
[00:03:37] Command: /usr/local/bin/prokka /data/podar_metaG_sub_10/final.contigs.fa --outdir /data/prokka_annotation --prefix podar_metaG_sub_10 --force
[00:03:37] Appending to PATH: /usr/local/bin/../binaries/linux
[00:03:37] Appending to PATH: /usr/local/bin
[00:03:37] Looking for 'aragorn' - found /usr/local/bin/aragorn
[00:03:37] Determined aragorn version is 1.2
[00:03:37] Looking for 'barrnap' - found /usr/local/bin/barrnap
[00:03:37] Determined barrnap version is 0.7
[00:03:37] Looking for 'blastp' - found /usr/local/bin/blastp
blastp: error while loading shared libraries: libz.so.1: cannot open shared object file: No such file or directory
[00:03:37] Could not determine version of blastp - please install version 2.2 or higher

prvst commented 7 years ago

It looks to me like this could be an error in the tool itself rather than in the container. @bgruening, what do you think?

Maybe @tseemann can give us a little help here too.

prvst commented 7 years ago

@brooksph we have to check whether BLAST comes with the prokka installation or whether the tool expects to find it on the PATH.
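One way to see exactly which shared libraries a binary cannot resolve is to run ldd on it inside the container and keep only the unresolved entries. The following is a minimal sketch, not something posted in the thread: missing_libs is a hypothetical helper name, and the example assumes that Docker is installed and that ldd is available in the image (it is in the 1.12--3 session shown later in this thread).

```shell
# Hypothetical helper: read `ldd` output on stdin and print the names of
# the libraries that the loader could not resolve. Two message forms are
# handled: the glibc form ("libfoo.so.1 => not found") and the form that
# appears later in this thread ("Error loading shared library libfoo.so.1: ...").
missing_libs() {
    awk '
        /=> not found/                 { print $1 }
        /Error loading shared library/ { sub(/:$/, "", $5); print $5 }
    '
}

# Example (assumption: the image tag quoted above). 2>&1 is needed because
# some ldd implementations report unresolved libraries on stderr.
#   docker run --rm quay.io/biocontainers/prokka:1.12--1 \
#       ldd /usr/local/bin/blastp 2>&1 | missing_libs
```

An empty result would suggest that every library was resolved; any name printed is a candidate for a missing runtime dependency in the recipe.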

bgruening commented 7 years ago

@brooksph can you try this version?

% docker run -it quay.io/biocontainers/prokka:1.12--2 prokka --help    
Unable to find image 'quay.io/biocontainers/prokka:1.12--2' locally
1.12--2: Pulling from biocontainers/prokka
a3ed95caeb02: Already exists 
4c1fa756c345: Already exists 
a7f760de4b27: Already exists 
d836c29a56fb: Already exists 
6c2ebb6634fc: Already exists 
00f810677cff: Already exists 
531ebc5af9ff: Already exists 
aef3b3b2fa0d: Already exists 
ea46eaf5405b: Pull complete 
Digest: sha256:49c4ef349676aba65d1dcc1a14ddd29b865ced468db8f5d44abd54bd384421c7
Status: Downloaded newer image for quay.io/biocontainers/prokka:1.12--2
Name:
  Prokka 1.12 by Torsten Seemann <torsten.seemann@gmail.com>
Synopsis:
  rapid bacterial genome annotation
Usage:
  prokka [options] <contigs.fasta>
General:
  --help            This help
  --version         Print version and exit
  --docs            Show full manual/documentation
  --citation        Print citation for referencing Prokka
  --quiet           No screen output (default OFF)
  --debug           Debug mode: keep all temporary files (default OFF)
Setup:
  --listdb          List all configured databases
  --setupdb         Index all installed databases
  --cleandb         Remove all database indices
  --depends         List all software dependencies
Outputs:
  --outdir [X]      Output folder [auto] (default '')
  --force           Force overwriting existing output folder (default OFF)
  --prefix [X]      Filename output prefix [auto] (default '')
  --addgenes        Add 'gene' features for each 'CDS' feature (default OFF)
  --addmrna         Add 'mRNA' features for each 'CDS' feature (default OFF)
  --locustag [X]    Locus tag prefix [auto] (default '')
  --increment [N]   Locus tag counter increment (default '1')
  --gffver [N]      GFF version (default '3')
  --compliant       Force Genbank/ENA/DDJB compliance: --addgenes --mincontiglen 200 --centre XXX (default OFF)
  --centre [X]      Sequencing centre ID. (default '')
  --accver [N]      Version to put in Genbank file (default '1')
Organism details:
  --genus [X]       Genus name (default 'Genus')
  --species [X]     Species name (default 'species')
  --strain [X]      Strain name (default 'strain')
  --plasmid [X]     Plasmid name or identifier (default '')
Annotations:
  --kingdom [X]     Annotation mode: Archaea|Bacteria|Mitochondria|Viruses (default 'Bacteria')
  --gcode [N]       Genetic code / Translation table (set if --kingdom is set) (default '0')
  --gram [X]        Gram: -/neg +/pos (default '')
  --usegenus        Use genus-specific BLAST databases (needs --genus) (default OFF)
  --proteins [X]    FASTA or GBK file to use as 1st priority (default '')
  --hmms [X]        Trusted HMM to first annotate from (default '')
  --metagenome      Improve gene predictions for highly fragmented genomes (default OFF)
  --rawproduct      Do not clean up /product annotation (default OFF)
  --cdsrnaolap      Allow [tr]RNA to overlap CDS (default OFF)
Computation:
  --cpus [N]        Number of CPUs to use [0=all] (default '8')
  --fast            Fast mode - only use basic BLASTP databases (default OFF)
  --noanno          For CDS just set /product="unannotated protein" (default OFF)
  --mincontiglen [N] Minimum contig size [NCBI needs 200] (default '1')
  --evalue [n.n]    Similarity e-value cut-off (default '1e-06')
  --rfam            Enable searching for ncRNAs with Infernal+Rfam (SLOW!) (default '0')
  --norrna          Don't run rRNA search (default OFF)
  --notrna          Don't run tRNA search (default OFF)
  --rnammer         Prefer RNAmmer over Barrnap for rRNA prediction (default OFF)

brooksph commented 7 years ago

@bgruening Thanks for working on this! Now it's getting hung up on something else:

[01:57:10] Setting --locustag EEAFIFCJ from MD5 eeaf2fc3d185c42bd610c4308b8df299
[01:57:10] Re-using existing --outdir /data/prokka_annotation
[01:57:10] Using filename prefix: podar_metaG_sub_50.XXX
[01:57:10] Setting HMMER_NCPU=1
[01:57:10] Writing log to: /data/prokka_annotation/podar_metaG_sub_50.log
[01:57:10] Command: /usr/local/bin/prokka /data/megahit_output_podar_metaG_sub_50/final.contigs.fa --outdir /data/prokka_annotation --prefix podar_metaG_sub_50 --force
[01:57:10] Appending to PATH: /usr/local/bin/../binaries/linux
[01:57:10] Appending to PATH: /usr/local/bin
[01:57:10] Looking for 'aragorn' - found /usr/local/bin/aragorn
[01:57:10] Determined aragorn version is 1.2
[01:57:10] Looking for 'barrnap' - found /usr/local/bin/barrnap
[01:57:10] Determined barrnap version is 0.7
[01:57:10] Looking for 'blastp' - found /usr/local/bin/blastp
[01:57:10] Determined blastp version is 2.6
[01:57:10] Looking for 'cmpress' - found /usr/local/bin/cmpress
[01:57:10] Determined cmpress version is 1.1
[01:57:10] Looking for 'cmscan' - found /usr/local/bin/cmscan
[01:57:10] Determined cmscan version is 1.1
[01:57:10] Looking for 'egrep' - found /bin/egrep
[01:57:10] Looking for 'find' - found /usr/bin/find
[01:57:10] Looking for 'grep' - found /bin/grep
[01:57:10] Looking for 'hmmpress' - found /usr/local/bin/hmmpress
[01:57:10] Determined hmmpress version is 3.1
[01:57:10] Looking for 'hmmscan' - found /usr/local/bin/hmmscan
[01:57:10] Determined hmmscan version is 3.1
[01:57:10] Looking for 'java' - found /usr/local/bin/java
[01:57:10] Looking for 'less' - found /usr/bin/less
[01:57:10] Looking for 'makeblastdb' - found /usr/local/bin/makeblastdb
[01:57:10] Determined makeblastdb version is 2.6
[01:57:10] Looking for 'minced' - found /usr/local/bin/minced
[01:57:10] Determined minced version is 2.0
[01:57:10] Looking for 'parallel' - found /usr/local/bin/parallel
[01:57:10] Determined parallel version is 20170422
[01:57:10] Looking for 'prodigal' - found /usr/local/bin/prodigal
[01:57:10] Determined prodigal version is 2.6
[01:57:10] Looking for 'prokka-genbank_to_fasta_db' - found /usr/local/bin/prokka-genbank_to_fasta_db
[01:57:10] Looking for 'sed' - found /bin/sed
[01:57:10] Looking for 'tbl2asn' - found /usr/local/bin/tbl2asn
tbl2asn: error while loading shared libraries: libidn.so.11: cannot open shared object file: No such file or directory
[01:57:10] Could not determine version of tbl2asn - please install version 24.3 or higher

I see a similar issue in the prokka repo from about a year ago https://github.com/tseemann/prokka/issues/198.

tseemann commented 7 years ago

Is your blast recipe missing a dep on libbz2 ?

Or if you don't have blastp in your PATH, it will fall back to the bundled "mostly static" binary I provide, which also needs /lib64/libbz2.so and others:

ldd git/prokka/binaries/linux/blastp
        linux-vdso.so.1 =>  (0x00007fff3b0aa000)
        libz.so.1 => /lib64/libz.so.1 (0x00007fc067956000)
        libbz2.so.1 => /lib64/libbz2.so.1 (0x00007fc067746000)
        libnsl.so.1 => /lib64/libnsl.so.1 (0x00007fc06752d000)
        librt.so.1 => /lib64/librt.so.1 (0x00007fc067325000)
        libm.so.6 => /lib64/libm.so.6 (0x00007fc067023000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fc066e07000)
        libc.so.6 => /lib64/libc.so.6 (0x00007fc066a44000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fc06682e000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fc067b6c000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007fc06662a000)

My bundled binaries are a last resort. Most are static, but not all. They work on most distributions, but Docker containers don't have most of the "common" libs by default of course.

tbl2asn is a bigger problem. It is only available from NCBI as a "mostly" static binary, and yes, it needs libidn.so. On RHEL that library comes from libidn-1.28-4.el7.x86_64.

 ldd git/prokka/binaries/linux/tbl2asn
        linux-vdso.so.1 =>  (0x00007ffde610e000)
        libidn.so.11 => /lib64/libidn.so.11 (0x00007f2da2c0f000)
        libz.so.1 => /lib64/libz.so.1 (0x00007f2da29f9000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f2da26f7000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f2da2334000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f2da2e42000)

Also, tbl2asn expires every 6 months and you need to download a new version. NCBI do not announce new versions.
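To tell whether a tool resolved on PATH is one of the bundled fall-back binaries described above, the resolved path can be classified by its directory. This is only a sketch: tool_origin is a hypothetical helper, and it assumes that bundled tools live under a binaries/<platform> directory, as the "Appending to PATH" lines in the logs above suggest.

```shell
# Hypothetical helper: classify a resolved tool path.
#   "bundled" - the path lies under a ".../binaries/..." directory
#   "missing" - no path was given (the tool was not found at all)
#   "path"    - anything else, e.g. a package-managed binary
tool_origin() {
    case "$1" in
        */binaries/*) echo "bundled" ;;
        "")           echo "missing" ;;
        *)            echo "path" ;;
    esac
}

# Example: mimic the search order from the log, then classify the result.
#   PATH="$PATH:/usr/local/bin/../binaries/linux"
#   tool_origin "$(command -v blastp)"
```

In the logs above blastp was found at /usr/local/bin/blastp, which this helper would report as "path" rather than "bundled".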

bgruening commented 7 years ago

@tseemann how do you test this when prokka starts? Can we add the same tests to the Bioconda recipe to catch this?

bgruening commented 7 years ago

Added the missing runtime dep: https://github.com/bioconda/bioconda-recipes/pull/5505 It would be nice to test this before I merge, but a simple prokka help/depends is not enough :(

tseemann commented 7 years ago

@bgruening do you mean how do I test if tbl2asn is working or not?
I don't! You can't.... users just email saying it is locking up so we update tbl2asn.

I think Bioconda should immediately rm -fr the $PREFIX/binaries folder so it has no chance of relying on my bundled binaries.
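A test that goes one step beyond prokka --help could run each dependency once and fail when the dynamic loader complains. The sketch below is an assumption about what such a check might look like, not the test used in the Bioconda recipe; loads_ok is a hypothetical name. It only catches missing shared libraries and cannot detect an expired tbl2asn.

```shell
# Hypothetical smoke test: run a command and fail if its output contains a
# dynamic-loader error. Both message forms below appear in this thread.
# The command's own exit status is deliberately ignored.
loads_ok() {
    output=$("$@" 2>&1 </dev/null) || true
    case "$output" in
        *"error while loading shared libraries"*) return 1 ;;
        *"Error loading shared library"*)         return 1 ;;
    esac
    return 0
}

# Example: blastp accepts -version; other tools need their own harmless
# invocation, which is not shown here.
#   loads_ok blastp -version || echo "blastp: missing shared library"
```

Matching on the loader message rather than on the exit status keeps the check independent of how each tool reports its version.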

kastman commented 6 years ago

I'd like to hop on this issue and help out. From within docker run -it quay.io/biocontainers/prokka:1.12--3 bash (after the patch above adding libidn as a requirement), I see that libidn does appear to be partially installed:

$ find / -name 'libidn*'
/usr/local/share/* # a bunch of locale docs
/usr/local/lib/libidn.la
/usr/local/lib/pkgconfig/libidn.pc
/usr/local/lib/libidn.a

I'm not familiar with the whole bioconda / biocontainer pipeline, but I surmise that there's a mechanism to build the container directly from the recipe, right, as evidenced by #5505 from @bgruening above? For whatever reason, this isn't adding in the libidn.so in the form required by tbl2asn:

bash-4.2# ./tbl2asn                  
./tbl2asn: error while loading shared libraries: libidn.so.11: cannot open shared object file: No such file or directory

bash-4.2# ldd tbl2asn                
/lib64/ld-linux-x86-64.so.2 (0x7f4a1e5b5000)
Error loading shared library libidn.so.11: No such file or directory (needed by tbl2asn)
libz.so.1 => /usr/local/lib/libz.so.1 (0x7f4a1e398000)
libm.so.6 => /lib64/ld-linux-x86-64.so.2 (0x7f4a1e5b5000)
libc.so.6 => /lib64/ld-linux-x86-64.so.2 (0x7f4a1e5b5000)
Error relocating tbl2asn: idna_strerror: symbol not found
Error relocating tbl2asn: __register_atfork: symbol not found
Error relocating tbl2asn: __rawmemchr: symbol not found
Error relocating tbl2asn: idn_free: symbol not found
Error relocating tbl2asn: idna_to_ascii_8z: symbol not found
Error relocating tbl2asn: __strndup: symbol not found

I'm wondering if there's some version or platform restriction that the recipe might need, but I've gone as far as I know how to help. Happy to dig more if anyone has a direction for me to go? Would the libidn requirement belong in the tbl2asn recipe, instead of the prokka recipe?

Alternatively, I do see that the unrelated container ummidock/prokka:1.12 does work, so this isn't stopping my work, but I love the idea of biocontainers and want to help if I can. Thanks,
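The find output above lists only a static archive (.a), a libtool file (.la), and a pkg-config file, none of which the dynamic loader can use. A quick way to check for the exact file name the loader asks for is sketched below; has_soname is a hypothetical helper, not part of any tool mentioned in this thread.

```shell
# Hypothetical helper: read file paths on stdin and succeed only if one of
# them has exactly the given file name (for example libidn.so.11).
has_soname() {
    awk -v name="$1" -F/ '$NF == name { found = 1 } END { exit !found }'
}

# Example, run inside the container:
#   find / -name 'libidn*' 2>/dev/null | has_soname libidn.so.11 \
#       || echo "libidn.so.11 is not installed"
```

With the file list shown above the helper fails, which matches the loader error reported by tbl2asn.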

bgruening commented 6 years ago

@kastman thanks for your detective work. Indeed, tbl2asn is again to blame here. You see this error because we do not build it ourselves and are forced to use the precompiled packages, which are not truly static. I will fix this in https://github.com/bioconda/bioconda-recipes/pull/7363

kastman commented 6 years ago

Thanks for checking this again. Is there a difference between libidn (which was a requirement in build 3 of prokka) and libidn1 (which you just added as the requirement in the new build of tbl2asn)? I'm guessing one includes the .so?

Pretty frustrating that ncbi only includes precompiled dynamic binaries, right? Is libidn so universal that it's really architecture neutral like that? Sorry my knowledge of linking is a bit lacking.

I'm also guessing that the updated version is keyed to not expire in 6 months like previous ones, but does that mean that this image will stop working? Thanks for answering these pestering questions, but it will help if I jump in in the future.

Finally, I noticed in the tbl2asn bioconda/bioconda-recipes#7363 PR that it should be rebuilt tonight; is that also the case for the prokka build? Or do we have to wait for the updated tbl2asn to make its way through the nightly process before it's available in the prokka recipe? EDIT - Nevermind, I see you just added a prokka commit too. Thanks!!

bgruening commented 6 years ago

> Thanks for checking this again. Is there a difference between libidn (which was a requirement in build 3 of prokka) and libidn1 (which you just added as the requirement in the new build of tbl2asn)? I'm guessing one includes the .so?

Yes, the 11 one includes the .so file and is maintained in conda-forge, so we will stick with it.

> Pretty frustrating that ncbi only includes precompiled dynamic binaries, right?

Oh yes!

> Is libidn so universal that it's really architecture neutral like that? Sorry my knowledge of linking is a bit lacking.

It's not that universal, but so far it has not broken the ABI.

> I'm also guessing that the updated version is keyed to not expire in 6 months like previous ones, but does that mean that this image will stop working? Thanks for answering these pestering questions, but it will help if I jump in in the future.

I think it will :(. I guess you can easily change the time to make it work again?

> Finally, I noticed in the tbl2asn bioconda/bioconda-recipes#7363 PR that it should be rebuilt tonight; is that also the case for the prokka build? Or do we have to wait for the updated tbl2asn to make its way through the nightly process before it's available in the prokka recipe? EDIT - Nevermind, I see you just added a prokka commit too. Thanks!!

tbl2asn is already built, and prokka will hopefully follow soon.

kastman commented 6 years ago

Great - thanks for the quick responses. I'll keep my eyes open!

kastman commented 6 years ago

Verified that build 4 does in fact work in the wild. Thanks again!

bgruening commented 6 years ago

Nice. Thanks for testing!