AndriiPovsten / Snakemake-backend-for-RECAST

Snakemake implementation for RECAST

Add conda environment definition file #5

Closed matthewfeickert closed 1 year ago

matthewfeickert commented 1 year ago

Now that there are some workflow files from PR #4, it would be good to have an environment definition file that gives the software requirements to be able to run the workflow. To do this with conda / mamba / micromamba an environment.yml file is used.

The docs for this are at https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html and it would be good to read through these in general. However, sometimes a minimal example is useful. The following environment.yml file creates a very simple conda virtual environment named env-example that will include Python 3.11, pip, scipy, jax, notebook, and jupyterlab and their dependencies.

name: env-example
channels:
  - conda-forge
dependencies:
  - python=3.11
  - pip
  - scipy=1.11  # A single = is a fuzzy match: get a version that starts with 1.11, i.e. `scipy` `v1.11.*`
  - jax==0.4.14 # The use of == means get exactly this version
  - notebook
  - jupyterlab

Note: For anything science related, assume that the default channel should NOT be used. Unless a project tells you otherwise, always use the conda-forge channel.
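
To make the difference between `=` and `==` in the example file concrete, here is a minimal Python sketch of the matching rule. This is a simplified model for illustration only, not conda's actual implementation: the helper names `parse_spec` and `matches` are hypothetical, and real conda match specs support much more (build strings, comparison operators, and so on).

```python
def parse_spec(spec: str) -> tuple[str, str, str]:
    """Split a spec like 'scipy=1.11' into (name, operator, version).

    Hypothetical helper: only handles the '=' and '==' forms used in
    the example environment.yml, plus bare names like 'pip'.
    """
    for op in ("==", "="):  # check '==' first so it is not read as '='
        if op in spec:
            name, version = spec.split(op, 1)
            return name.strip(), op, version.strip()
    return spec.strip(), "", ""


def matches(spec: str, candidate: str) -> bool:
    """Return True if the candidate version satisfies the spec (simplified)."""
    _, op, version = parse_spec(spec)
    if op == "":
        # No version given: any version is acceptable.
        return True
    if op == "==":
        # Exact match: only this version string is accepted.
        return candidate == version
    # Fuzzy match: 'scipy=1.11' accepts 1.11 and any 1.11.* release.
    return candidate == version or candidate.startswith(version + ".")
```

With this sketch, `matches("scipy=1.11", "1.11.3")` is true while `matches("scipy=1.11", "1.12.0")` is false, and `matches("jax==0.4.14", "0.4.15")` is false because `==` accepts only the exact version.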

We can use conda to make a new environment from this file with

conda env create --file environment.yml

and then you can activate the environment with

conda activate env-example

If you update the environment.yml file with new dependencies, then in the activated environment of the same name you can just do

conda env update --file environment.yml
# micromamba install --file environment.yml  # The command is different for mamba/micromamba

to install those new dependencies into your environment.
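
If you want to script the "create the first time, update afterwards" pattern, a small Python sketch could pick the right command for you. The function name `conda_command` is hypothetical, and it assumes you already have a list of existing environment paths (for example, from the `envs` entry printed by `conda env list --json`). It only builds the command; running it is left to you.

```python
from pathlib import PurePath


def conda_command(env_name, existing_env_paths, env_file="environment.yml"):
    """Return the conda command (as a list of arguments) to create or update an environment.

    Hypothetical helper: an environment is assumed to exist if the last
    component of any known environment path equals env_name.
    """
    exists = any(PurePath(path).name == env_name for path in existing_env_paths)
    if exists:
        # The environment is already there, so install any new dependencies into it.
        return ["conda", "env", "update", "--name", env_name, "--file", env_file]
    # No such environment yet: create it from the file (the name comes from the file).
    return ["conda", "env", "create", "--file", env_file]
```

The returned list can be passed to `subprocess.run(..., check=True)` if you want the script to actually run conda.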

If we have this environment file locally, we can demo how to use it in a Docker container

$ docker run --rm -ti -v $PWD:/work:ro mambaorg/micromamba:1.4.9-bullseye-slim 
(base) mambauser@b34a3dfcc306:/tmp$ micromamba create --yes --file /work/environment.yml
(base) mambauser@b34a3dfcc306:/tmp$ micromamba activate env-example
(env-example) mambauser@b34a3dfcc306:/tmp$ micromamba list  # See everything in the environment
...
(env-example) mambauser@b34a3dfcc306:/tmp$ python -m pip show jax  # Can still use pip to get information about the environment too
Name: jax
Version: 0.4.14
Summary: Differentiate, compile, and transform Numpy code.
Home-page: https://github.com/google/jax
Author: JAX team
Author-email: jax-dev@google.com
License: Apache-2.0
Location: /opt/conda/envs/env-example/lib/python3.11/site-packages
Requires: ml-dtypes, numpy, opt-einsum, scipy
Required-by: 
(env-example) mambauser@b34a3dfcc306:/tmp$
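
The same kind of check can be done from inside Python, which is handy for a quick sanity test in CI. The sketch below uses the standard library's `importlib.metadata`; the helper names `installed_version` and `check_pins` are hypothetical, and the comparison is a plain string equality, so it only makes sense for exact (`==`) pins.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_pins(pins):
    """Compare exact pins against the active environment.

    pins maps package name -> expected version string.
    Returns {name: (expected, found)} for every package that does not match;
    found is None when the package is missing.
    """
    mismatches = {}
    for name, expected in pins.items():
        found = installed_version(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches
```

For example, calling `check_pins({"jax": "0.4.14"})` inside the activated env-example environment should report no mismatches if the pin from environment.yml was honored.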

Note that other IRIS-HEP Fellows that I'm working with are also learning these sorts of things (e.g. https://github.com/Samcoodess/reana-dms/issues/5), so feel free to talk with people like Sam as well, as learning together can be an accelerator. (I'm of course happy to discuss too.)

AndriiPovsten commented 1 year ago

@matthewfeickert could you please check my environment.yml file? I thought it is all packages used for "example hello_world". Or is it better to have a discussion about it tomorrow?

matthewfeickert commented 1 year ago

I thought it is all packages used for "example hello_world"

Yeah, it should contain the high-level requirements for everything that is needed to create the environment for a project to run. I left some comments on PR #7 that might be relevant, as getting the workflow to run in the CI environment is a good check that things are set up correctly.

matthewfeickert commented 1 year ago

Closed by PR #6