I would like to request a Docker image for cadd-scripts v1.6 (Combined Annotation Dependent Depletion).
Website: https://cadd.gs.washington.edu/ The GitHub repo has install instructions for the cadd tools: https://github.com/kircherlab/CADD-scripts
Description: cadd-scripts includes some shell scripts and a Snakemake pipeline. The pipeline automatically downloads a conda environment when it is executed. The existing bioconda package, cadd-scripts, works this way.
The conda package does not transfer well to container formats. When it is executed, and before the actual processing starts, it needs to download the Snakemake pipeline's conda env into \<cadd-scripts's conda-env>/share/cadd-scripts-1.6-1/envs, and container filesystems are normally read-only or temporary. I propose that we instead pre-load the conda environment when building the Docker image, following a documented recipe in the CADD GitHub repo.
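
As a rough sketch of the pre-loading step, the helper below prints a Snakemake command that only creates the pipeline's conda env, so it could be placed in a `RUN` step of the Dockerfile. `--use-conda`, `--conda-create-envs-only` and `--conda-prefix` are existing Snakemake options. The file names under the CADD directory are my recollection of the repo's install instructions and should be checked against it. The function name `precreate_env_cmd` and the path `/opt/CADD-scripts` are hypothetical.

```shell
#!/bin/sh
# Sketch only: print the env pre-creation command for a given CADD-scripts checkout.
# Assumption: the checkout contains Snakefile, config/config_GRCh38_v1.6.yml and
# test/input.tsv.gz, as recalled from the CADD-scripts repo; verify before use.

precreate_env_cmd() {
    cadd_dir=$1   # hypothetical install location inside the image
    # --conda-create-envs-only builds the env and exits without running any rule;
    # --conda-prefix fixes where the env is stored so it ends up in an image layer.
    printf 'snakemake %s --use-conda --conda-create-envs-only --conda-prefix %s --configfile %s --snakefile %s\n' \
        "$cadd_dir/test/input.tsv.gz" \
        "$cadd_dir/envs" \
        "$cadd_dir/config/config_GRCh38_v1.6.yml" \
        "$cadd_dir/Snakefile"
}

# Print the command for an assumed install path; paste it into a Dockerfile RUN step.
precreate_env_cmd /opt/CADD-scripts
```

Printing the command rather than running it keeps the sketch safe to try outside an image build. At run time the pipeline would need to be pointed at the same `--conda-prefix` so it finds the env instead of trying to download it again.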
One caveat is that the CADD annotation database is too large to bundle into the container image, so we would still need to bind-mount it at run time.
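
To illustrate the bind mount, here is a minimal sketch that assembles a `docker run` command with the annotation database mounted from the host. The image tag `cadd-scripts:1.6`, the host path `/data/cadd/annotations`, the in-container path `/opt/CADD-scripts/data/annotations` and the function name `cadd_docker_cmd` are all hypothetical; the real mount target depends on where the image expects the annotations.

```shell
#!/bin/sh
# Sketch only: print a docker run command that bind-mounts the annotation database.
# All image names and paths used here are placeholders, not a published image.

cadd_docker_cmd() {
    image=$1          # hypothetical image tag
    host_db=$2        # annotation database on the host
    container_db=$3   # assumed location the tools read annotations from
    shift 3
    # Mount the database read-only (drop ":ro" if the tools must write there)
    # and mount the current directory as the working directory for input/output.
    printf 'docker run --rm -v %s:%s:ro -v %s:/work -w /work %s %s\n' \
        "$host_db" "$container_db" "$PWD" "$image" "$*"
}

# Example: score a VCF in the current directory against GRCh38.
cadd_docker_cmd cadd-scripts:1.6 /data/cadd/annotations \
    /opt/CADD-scripts/data/annotations \
    CADD.sh -g GRCh38 -o out.tsv.gz in.vcf
```

The same idea should carry over to other container runtimes that take a bind option, with the database kept outside the image in every case.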
(edited to remove unnecessary text)