cbg-ethz / V-pipe

V-pipe is a pipeline designed for analysing NGS data of short viral genomes
https://cbg-ethz.github.io/V-pipe/
Apache License 2.0

Provide Dockerfile & prebuilt/automatically built container image #21

Open Masterxilo opened 5 years ago

Masterxilo commented 5 years ago

This software should be available as an automatically built container image, with all dependencies already installed and ready to run through a well-defined interface.

The container image should be available via quay or docker hub.

This was done manually at some point but the automatic pipeline has not been established yet: https://quay.io/repository/dryak/v-pipe?tab=tags

Masterxilo commented 5 years ago

Or the software could be made available the way bwa is on anaconda cloud and bioshadock:

https://anaconda.org/bioconda/bwa https://docker-ui.genouest.org/app/#/container/bioconda/bwa

Masterxilo commented 5 years ago

more modern: https://biocontainers.pro/#/

sposadac commented 5 years ago

In general, it is a good idea. However, we support distribution with rule-specific conda environments, which we find suitable for HPC. A containerised solution could help users get started, but for most applications it might fall short (e.g. spawning containers within containers is not trivial). For cloud computing it could prove useful, but such a solution would depend on the particular cloud provider.

DrYak commented 5 years ago

Hi Paul, hope you're doing well.

If you plan to use V-pipe further in your work, we could flesh out the first draft solution you mention, which we made a long time ago. We should discuss your exact needs so we get a good idea of how best to serve them.

(One giant "katamari"-style Docker image with everything, i.e. with the environment of every last snakemake rule included? Or specific Docker images for different uses, each with only the selection of rule environments needed for a specific use case? Etc.)

Regarding a "docker-in-docker" possibility, I would also second Susana's opinion: while doable, it's not a very stable and solid construction.

Regarding bioconda/biocontainers: currently every single component we use has containers automatically produced from its conda recipe (e.g. ShoRAH). But as Susana mentioned, V-pipe is currently designed to use the "per-rule environment" mode of snakemake, so these aren't directly usable.

The "pre-install rules' environment inside a single Docker" approach we used in the draft seems to me to be the most sensible.
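To illustrate what "pre-installing the rules' environments" could look like at image build time, here is a minimal sketch that only assembles a Snakemake command line. It relies on Snakemake's `--use-conda`, `--conda-create-envs-only` and `--conda-prefix` options; the helper name, the Snakefile name and the `/opt/conda_envs` path are assumptions made for the example and are not taken from the V-pipe draft or its Dockerfile.

```python
import shlex


def conda_prebuild_command(snakefile="Snakefile", prefix="/opt/conda_envs", cores=1):
    """Build a Snakemake command that creates the per-rule conda
    environments without running any rule.

    All default values are hypothetical placeholders.
    """
    return [
        "snakemake",
        "--snakefile", snakefile,
        "--use-conda",                # enable per-rule conda environments
        "--conda-create-envs-only",   # create the environments, then exit
        "--conda-prefix", prefix,     # fixed location so the image can reuse them
        "--cores", str(cores),
    ]


if __name__ == "__main__":
    # Print the command as a single shell-quoted line, e.g. for a Dockerfile RUN step.
    print(shlex.join(conda_prebuild_command()))
```

Running such a command in a build step would store the environments inside the image, so that a later run with the same `--conda-prefix` should be able to reuse them instead of downloading packages again.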

Masterxilo commented 5 years ago

Hey Ivan, nice to hear from you.

I think the "all dependencies in one container, one-size-fits-all" solution is the best, as you say. There should just be an automatic pipeline somewhere that builds the latest version of this container from this repository.

Docker Hub does it for free, and CircleCI, Travis CI and other automatic build tools are also free for public GitHub projects like this one, so it should be doable at no extra cost.

I will use the existing old container until then, but this would be a nice and very useful improvement to this pipeline.

DrYak commented 4 years ago

Not quite a Dockerfile yet, but the new SARS-CoV-2 version comes with an installer that can:

Basically, all the pieces are there to make the "sars-cov2" branch into a ready-to-use Docker image. The next step I am tackling is CI/CD. That should help us automate the Docker generation, at least for branches that have a clear default configuration (e.g. for SARS-CoV-2, that is the bwa aligner and the shorah caller).

DrYak commented 3 years ago

Hi! Long time no see...

As of master commit 2497d0aa4b23e95e3a4b5b65cbc38a0917da6e6d, the support for docker is finally here.

Hope that this helps you!

Masterxilo commented 3 years ago

Nice, @DrYak, great job. We will look into it; we are still interested in setting up V-pipe as one of our NGS processing pipelines.