epruesse / SINA

SINA - Reference based multiple sequence alignment
https://sina.readthedocs.io
GNU General Public License v3.0

sina not working within a docker container #107

Open nick-youngblut opened 1 year ago

nick-youngblut commented 1 year ago

If I run sina within a docker container (I've tried quay.io/biocontainers/sina:1.7.1--h9aa86b4_0 and creating my own container; see below), I get the error:

02:38:59 [SINA] This is SINA 1.7.2.
02:38:59 [SINA] Unable to open ARB database "SILVA_138.1_SSURef_NR99_12_06_20_opt.arb".
02:38:59 [SINA] The ARB database you were trying to use is likely corrupted.

My job:

sina --threads 4 --turn \
    --fasta-write-dna --lowercase none \
    --intype fasta --outtype fasta \
    --in $fasta \
    --db  SILVA_138.1_SSURef_NR99_12_06_20_opt.arb \
    -o align.fna

If I run the same job outside of the container image:

wget https://github.com/epruesse/SINA/releases/download/v1.7.2/sina-1.7.2-linux.tar.gz
tar -pzxvf sina-1.7.2-linux.tar.gz

./sina-1.7.2-linux/sina --threads ${task.cpus} --turn \
    --fasta-write-dna --lowercase none \
    --intype fasta --outtype fasta \
    --in $fasta \
    --db $sina_db_arb \
    -o align.fna

...the job completes successfully.

Something about running sina inside a docker container seems to make it fail with the error: [SINA] Unable to open ARB database

Maybe it has something to do with ownership/permissions?

The Dockerfile for my own sina image:

FROM ubuntu:22.04

USER root
RUN apt-get update && \
    apt-get install -y wget python3.9 && \
    ln -sf /usr/bin/python3 /usr/bin/python && \
    apt-get clean && \
    apt-get purge && \
    rm -rf /var/lib/apt/lists/* /tmp/*

RUN wget https://github.com/epruesse/SINA/releases/download/v1.7.2/sina-1.7.2-linux.tar.gz && \
  tar -pzxvf sina-1.7.2-linux.tar.gz && \
  mv sina-1.7.2-linux /bin/sina-1.7.2-linux/ && \
  ln -sf /bin/sina-1.7.2-linux/bin/sina /bin/sina && \
  rm -f sina-1.7.2-linux.tar.gz

WORKDIR /data

Switching to ubuntu:20.04 does not fix the issue.

nick-youngblut commented 1 year ago

The exception is thrown at:

    data->gbmain = GB_open(arbfile.c_str(), "rwc");
    if (data->gbmain == nullptr) {
        throw make_exception("Unable to open ARB database {}.", arbfile);
    }

...but I can't tell why. I can't find the code for GB_open() anywhere in the sina source code.

epruesse commented 1 week ago

GB_open is in libARBDB.so; it's the "connect" function for ARB databases. Usually, if that fails, it really just couldn't open the file: the file isn't there because the path is misspelled, it doesn't have the right permissions, or it's not actually an ARB database.
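
The first two causes (wrong path, wrong permissions) can be checked from a shell in the same environment where sina runs. The following is a minimal sketch, not part of SINA or ARB: check_arb_db is a hypothetical helper name, and the write check rests on the assumption that the "rwc" mode string passed to GB_open in the snippet quoted above requests read/write access. It cannot tell whether the file is a valid ARB database.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of SINA or ARB): report why a database
# file might not be openable from the current environment.
check_arb_db() {
    local db="$1"
    local dir
    dir="$(dirname "$db")"

    # Path does not resolve at all, e.g. a typo, or the file was never
    # mounted/copied into the container.
    if [ ! -e "$db" ]; then
        echo "missing: $db is not visible from $(pwd)"
        return 1
    fi
    # Path exists but is a directory or other non-regular file.
    if [ ! -f "$db" ]; then
        echo "not-a-file: $db"
        return 1
    fi
    # Zero-byte file, e.g. an interrupted download.
    if [ ! -s "$db" ]; then
        echo "empty: $db"
        return 1
    fi
    # Current user lacks read permission.
    if [ ! -r "$db" ]; then
        echo "unreadable: $db (running as uid $(id -u))"
        return 1
    fi
    # ASSUMPTION: "rwc" means the database is opened read/write, so a
    # read-only file or directory could also be a problem.
    if [ ! -w "$db" ] || [ ! -w "$dir" ]; then
        echo "read-only: $db or $dir is not writable (uid $(id -u))"
        return 2
    fi
    echo "ok: $db"
}

# Example call, using the database name from this issue:
# check_arb_db SILVA_138.1_SSURef_NR99_12_06_20_opt.arb
```

Run this inside the container rather than on the host. If it prints "missing", the database was probably never made available to the container, for example because the host directory holding it was not bind-mounted (docker run -v) into the working directory.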