ksahlin / NGSpeciesID

Reference-free clustering and consensus forming of long-read amplicon sequencing
GNU General Public License v3.0

feat: add Dockerfile #18

Closed Coppini closed 2 years ago

Coppini commented 2 years ago

I needed a Docker image to run this on a server, so I wrote a Dockerfile for it. I'm not sure whether you'd want it in your repository, but here it is in case you do. It might also make it easier for others to run the tool with just Docker.

A few notes:

  1. I set the medaka and openblas versions to the newest ones available when I created this PR, instead of the versions given in the installation instructions. I can change them back to the older versions, but as I stated in https://github.com/ksahlin/NGSpeciesID/issues/17, the new versions seem to work fine.
  2. Instead of installing each dependency, I'm simply copying required binaries and libraries from existing docker images, trying to get the minimal requirements of each.
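
A rough sketch of what note 2 means, using a multi-stage build with `COPY --from`. This is not the Dockerfile from this PR: the `example/...` image name, the `some-tool` binary, the base image, and the paths are hypothetical placeholders, and the `pip install` line is an assumption about how the tool itself would be installed.

```dockerfile
# Hypothetical sketch only: image names, tags, and paths are placeholders,
# not the ones used in this PR's Dockerfile.

# Stage that already ships a prebuilt dependency (placeholder image).
FROM example/some-tool:latest AS tool

# Final image (base image chosen for illustration only).
FROM python:3.9-slim

# Copy just the needed binary out of the earlier stage instead of
# installing the dependency from scratch in the final image.
COPY --from=tool /usr/local/bin/some-tool /usr/local/bin/some-tool

# Assumption: the tool itself is installed from PyPI.
RUN pip install NGSpeciesID

# Arguments given to `docker run <image> ...` are passed on to NGSpeciesID.
ENTRYPOINT ["NGSpeciesID"]
```

Only the files named in `COPY --from` end up in the final image, which is what keeps it close to the minimal requirements of each dependency.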

Example use for file /home/${USER}/NGSpecies_workdir/input.fastq:

docker build -t ngspeciesid:0.1.3 .

docker run \
    -v /home/${USER}/NGSpecies_workdir/:/workdir \
    ngspeciesid:0.1.3 \
        --abundance_ratio 0.05 \
        --q 10 \
        --ont \
        --fastq /workdir/input.fastq \
        --outfolder /workdir/output \
        --consensus \
        --medaka

This should create the proper outputs at /home/${USER}/NGSpecies_workdir/output

Minor code correction:

ksahlin commented 2 years ago

It can indeed be a bit tricky to install NGSpeciesID due to the dependencies. Having a docker image is great!

There was really no reason for those specific versions other than that we had verified the installation worked on our machines with them. From what I remember (I could be wrong), openblas was required to make medaka work on some machines, such as MacBooks. It is possible that they have removed or relaxed this requirement.

I'm assuming you need root access on the cluster to use the docker image? I'm referring to the /usr/local/bin/ commands. I haven't used docker images, but is it possible to install these binaries in some local directory specified by the user?

As the dockerfile is separate I will go ahead and merge.

Thank you!

Coppini commented 2 years ago

When you use Docker, it creates a containerized environment inside your machine, so you have sudo/root access inside it either way: the /usr/local/bin in question is the container's, not your machine's. You usually do need root access to install Docker on the machine, though, so that it has enough permissions to run properly and spawn these "virtual machines" inside your own.