KarchinLab / open-cravat

A modular annotation tool for genomic variants
MIT License

Install modules as part of a conda env #281

Open esebesty opened 1 month ago

esebesty commented 1 month ago

I was wondering if it is somehow possible to do the following (or implement it in the future).

I'm using open-cravat as part of a snakemake pipeline, where each snakemake rule has its own environment definition in a yaml file. Snakemake uses this yaml file to install needed tools and detects if an environment was already set up and reuses it as needed. I created a yaml file for oc, but annotation modules can't be defined in the yaml, so the actual snakemake rule looks like this:

rule oc:
    conda: "envs/opencravat.yaml"
    shell:
        """
        oc module install -y -f -v 1.1.0 aloft
        oc module install -y -f -v 1.1.1 clinvar
        oc run {input.tsv} -l hg19 -a aloft clinvar -t text excel --cleanrun
        """

Is there a better way to do this, so I don't need to run the module install every time I run the rule? I don't want to manually activate an environment, install modules there, and use that. Everything should be reproducible based only on the yaml definition.

Would it be possible to publish the annotation modules as conda packages, so that I can pin specific module versions in the environment definition and let snakemake take care of the rest?

So something like:

mamba create -n oc
mamba activate oc
mamba install open-cravat 
mamba install oc-module-aloft=1.1.0
mamba env export > oc.yaml

where the final oc.yaml, which I can reuse, looks like this:

name: oc
channels:
  - bioconda
  - conda-forge
  - defaults
dependencies:
  - open-cravat=2.7.3
  - oc-module-aloft=1.1.0

jasminebro commented 1 month ago

Hi @esebesty, this is a very interesting feature request. I will pass this along to our IT team to investigate whether it is possible. Thank you for the request. Please let us know if you have any other questions about OpenCRAVAT.
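
Until something like conda-packaged modules exists, one possible stopgap for the rule above is to guard each install with a marker file, so the install command only runs the first time. This is a rough sketch, not an OpenCRAVAT feature: the function name install_module_once, the OC_MARKER_DIR variable, and the marker file layout are all made up for illustration. It also assumes the installed modules persist between runs of the rule, which depends on how the OpenCRAVAT modules directory is configured on your system.

```shell
# Hypothetical helper (not part of OpenCRAVAT): install a module only if no
# marker file for that module/version exists yet.
# OC_MARKER_DIR is an assumed variable; by default markers go under the
# active conda environment, falling back to $HOME.
install_module_once() {
    module="$1"
    version="$2"
    marker_dir="${OC_MARKER_DIR:-${CONDA_PREFIX:-$HOME}/.oc_module_markers}"
    marker="$marker_dir/$module-$version.installed"

    # Skip the install when this module/version was already recorded.
    if [ -f "$marker" ]; then
        return 0
    fi

    mkdir -p "$marker_dir" || return 1

    # Same install command as in the rule above; the marker is written
    # only if the install succeeds.
    oc module install -y -f -v "$version" "$module" && touch "$marker"
}
```

In the shell section of the rule, the two install lines could then become `install_module_once aloft 1.1.0` and `install_module_once clinvar 1.1.1`, with the function defined above them or sourced from a script. Note that this only avoids repeated installs; the environment is still not fully described by the yaml file, which is what the original request is about.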