Open poquirion opened 3 years ago
I do this for the version we run at the Genome Center by adding this line at the end of the Dockerfile:
RUN bash -ic '/app/scripts/build_db.py'
and removing the rules build_snpeff_db and download_db_files from the workflow/rules/annotation.smk file.
The Snakemake file does depend on CONDA_PREFIX for the database, as mentioned. The goal was to simplify the process, and it was set up with conda in mind.
(1) Yes, if there is no internet access and the snpeff db has not been downloaded beforehand, it will fail.
(2) This would be correct, assuming the container is not a copy that already has the snpeff db downloaded.
(3) Does the other user have access to the conda environment, or are they completely isolated? If they are isolated, they will need to perform the download independently.
It seems like you installed snpeff outside conda. Is this correct?
Question: if the db is already downloaded, will the steps be skipped automatically? If yes, then I will just install the db in the container in the deployment script. This will make our life on the CC system easier.
Then for your question: it is installed in the conda environment, since RUN bash -ic '/app/scripts/build_db.py' will be the last line in the Dockerfile, and the bash -ic '' forces the conda env to be loaded.
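Since the build step only works when the conda env is actually loaded, a script run this way could check for it up front. A minimal sketch, assuming only that conda sets CONDA_PREFIX on activation; the function name require_conda_prefix is hypothetical and this is not taken from /app/scripts/build_db.py:

```python
import os
from pathlib import Path


def require_conda_prefix(env=None):
    """Return the active conda prefix, or raise if no environment is loaded.

    Hypothetical helper: `env` defaults to os.environ and can be replaced
    by a plain dict for testing.
    """
    env = os.environ if env is None else env
    prefix = env.get("CONDA_PREFIX", "")
    if not prefix:
        raise RuntimeError(
            "CONDA_PREFIX is not set; is the conda environment activated?"
        )
    return Path(prefix)
```

Failing early like this would at least make a missing bash -ic (or a missing activation) show up as a clear error instead of a database installed in the wrong place.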
Right now snpeff downloads its database at run time. On top of that, it installs it in the CONDA_PREFIX folder. There are at least three cases where this will crash the pipeline:
1- When the pipeline is run on a system with no internet access.
2- When the pipeline is run in a (read-only) container.
3- When conda is installed as one user and the pipeline is run as another user.
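To make the three cases concrete, here is a rough pre-flight check that could run before the pipeline starts. It is only a sketch: the function name db_install_problems, the db_dir layout and the online flag are all assumptions for illustration, not part of the pipeline. Cases 2 and 3 both show up as a prefix the current user cannot write to.

```python
import os
from pathlib import Path


def db_install_problems(prefix, db_dir, online):
    """List reasons why a run-time database download into `prefix` would fail.

    prefix : directory the database would be installed under (e.g. CONDA_PREFIX)
    db_dir : hypothetical location of the snpeff database files
    online : whether the caller believes internet access is available
    """
    db_dir = Path(db_dir)
    # If the database is already there, nothing needs to be downloaded.
    if db_dir.is_dir() and any(db_dir.iterdir()):
        return []

    problems = []
    if not online:
        # Case 1: no internet access.
        problems.append("no internet access")
    if not os.access(prefix, os.W_OK):
        # Cases 2 and 3: read-only container, or prefix owned by another user.
        problems.append("prefix is missing or not writable")
    return problems
```

With the database pre-installed in the image, the function returns an empty list regardless of network access or permissions, which is the situation the Dockerfile approach aims for.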