Open lminer opened 3 years ago
It is feasible to do so, but native support is unlikely to be added to this action.
My recommendation would be to:

* try to restore an `actions/cache` of a `conda-pack` archive, hashed off your `environment.yml`
* if that hits an empty cache
  * use `setup-miniconda` to make an environment from the `environment.yml`
    * make sure it has `conda` and `conda-pack` in it
  * use `conda-pack` to make a relocatable archive of the environment _before_ you do anything to it
    * like install your system-under-test
* if it succeeds
  * unpack the conda-pack
* use `setup-miniconda` with `$CONDA` set to the unpacked env
* be fast

This approach avoids a number of gotchas with caching conda tarballs, etc.
Do you have any suggestions of an example I might look at for how to do this? My github-actions-fu is quite weak.
Here is our example to cache `/usr/share/miniconda/envs`.

Updated:

```yaml
- uses: conda-incubator/setup-miniconda@v2
  with:
    activate-environment: "xxx"
    auto-activate-base: false
    use-only-tar-bz2: true # IMPORTANT: This needs to be set for caching to work properly!

- name: Cache conda envs and other stuff
  id: conda
  uses: actions/cache@v2
  env:
    # Increase this value to manually reset cache if setup/environment-linux.yml has not changed
    CONDA_CACHE_NUMBER: 1
  with:
    path: |
      /usr/share/miniconda/envs/xxx
    key: ${{ runner.os }}-conda-${{ env.CONDA_CACHE_NUMBER }}-${{ hashFiles('setup/environment-template.yml', 'setup/*.sh') }}

- name: Run install script
  # Only need to run install when deps has been changed
  if: steps.conda.outputs.cache-hit != 'true'
  run: |
    ./install # <----- conda packages are installed here via `conda env update -f ...`
```
@bitphage it's failing for me right now at the last step. I don't have a complicated install process, so I just substituted `./install` with `conda env create -f environment.yml` and I got the error:
Could not find conda environment: myenv
You can list all discoverable environments with `conda info --envs`.
This also happens if I do `conda env update -f environment.yml`. Any idea what I might be doing incorrectly?
@lminer hmm, make sure that you have `activate-environment: myenv` in the `conda-incubator/setup-miniconda@v2` step.
@bitphage I have. This is what it looks like:
```yaml
- uses: conda-incubator/setup-miniconda@v2
  with:
    activate-environment: "myenv"
    auto-activate-base: false
    use-only-tar-bz2: true # IMPORTANT: This needs to be set for caching to work properly!

# Remove envs directory if exists to prevent cache restore errors. Github runner already has bundled conda.
- name: Remove envs directory
  run: rm -rf /usr/share/miniconda/envs

- name: Cache conda envs and other stuff
  id: conda
  uses: actions/cache@v2
  env:
    # Increase this value to manually reset cache if setup/environment-linux.yml has not changed
    CONDA_CACHE_NUMBER: 1
  with:
    path: |
      ~/conda_pkgs_dir
      /usr/share/miniconda/envs
    key: ${{ runner.os }}-conda-${{ env.CONDA_CACHE_NUMBER }}-${{ hashFiles('environment.yml') }}

- name: Run install script
  # Only need to run install when deps has been changed
  if: steps.conda.outputs.cache-hit != 'true'
  run: |
    conda env create -f environment.yml
```
@lminer ok, I was trying to fix some issues after the recent 2.1.0 release of the `setup-miniconda` action. I've updated the example above. Note that there is no `rm -rf` step anymore and the caching path should be `/usr/share/miniconda/envs/myenv` to avoid cache restore errors.
@bitphage thanks! it's working now. Just shaved 4 minutes off the runtime.
This example is helpful! I noticed it has one more step than the one in the README - is it recommended that everyone add the "Run install script" step? If so, could the example in the README be updated?
My recommendation would be to:
* try to restore an `actions/cache` of a [`conda-pack`](https://conda.github.io/conda-pack/) archive, hashed off your `environment.yml`
* if that hits an empty cache
  * use `setup-miniconda` to make an environment from the `environment.yml`
    * make sure it has `conda` and `conda-pack` in it
  * use `conda-pack` to make a relocatable archive of the environment _before_ you do anything to it
    * like install your system-under-test
* if it succeeds
  * unpack the conda-pack
* use `setup-miniconda` with `$CONDA` set to the unpacked env
* be fast

This approach avoids a number of gotchas with caching conda tarballs, etc.
I tried this approach and this is the gist of what I ended up with so far, working:
```yaml
name: Conda Environment Caching Example

on: workflow_dispatch

env:
  # Increase this value to reset cache if environment.yml has not changed.
  PY_CACHE_NUMBER: 0
  PY_ENV: my_env

jobs:
  setup-python:
    name: Setup Python Environment
    runs-on: ubuntu-latest
    defaults:
      run:
        shell: bash -l {0}
    steps:
      - name: Git checkout
        uses: actions/checkout@v2

      - name: Cache Python environment
        id: cache-python
        uses: actions/cache@v2
        with:
          path: "${{ env.PY_ENV }}.tar.gz"
          key:
            ${{ runner.os }}-${{ env.PY_CACHE_NUMBER }}-${{ hashFiles('**/environment.yml') }}

      - name: Install Python dependencies
        if: steps.cache-python.outputs.cache-hit != 'true'
        uses: conda-incubator/setup-miniconda@v2
        with:
          miniforge-variant: Mambaforge
          use-mamba: true
          auto-update-conda: false
          activate-environment: ${{ env.PY_ENV }}
          environment-file: environment.yml
          auto-activate-base: false

      - name: Pack Python environment
        if: steps.cache-python.outputs.cache-hit != 'true'
        run: |
          conda pack --force -n ${{ env.PY_ENV }}

  use-cached-python:
    name: Use cached Python
    needs: [setup-python]
    runs-on: ubuntu-latest
    defaults:
      run:
        shell: bash -l {0}
    steps:
      - name: Git checkout
        uses: actions/checkout@v2

      - name: Get Python cache
        id: python-cache
        uses: actions/cache@v2
        with:
          path: "${{ env.PY_ENV }}.tar.gz"
          key:
            ${{ runner.os }}-${{ env.PY_CACHE_NUMBER }}-${{ hashFiles('**/environment.yml') }}

      - name: Unpack Python environment
        run: |
          mkdir -p "${{ env.PY_ENV }}"
          tar -xzf "${{ env.PY_ENV }}.tar.gz" -C "${{ env.PY_ENV }}"
          source "${{ env.PY_ENV }}/bin/activate"
          conda-unpack

      - name: Run Python
        run: |
          source "${{ env.PY_ENV }}/bin/activate"
          python -c 'import sys; print(sys.version_info[:])'
```
In my setup I use different jobs using the same Python environment, that's why I separated the setup from the execution. Using conda-pack you'll have to use the same OS in each job that uses the cache. In my environment.yml I have added conda and conda-pack, and in channels I only have conda-forge.
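For illustration, an `environment.yml` matching that description might look roughly like the sketch below. Only the `conda-forge` channel and the `conda` and `conda-pack` dependencies come from the description above; the environment name (chosen to match `PY_ENV` in the workflow) and the Python pin are assumptions.

```yaml
# Hypothetical environment.yml; name and python pin are assumptions.
name: my_env
channels:
  - conda-forge
dependencies:
  - python=3.9   # assumed version, pick whatever your project needs
  - conda        # included in the env, as described above
  - conda-pack   # provides the `conda pack` command used in the workflow
```

Because the cache key is built from `hashFiles('**/environment.yml')`, any change to this file produces a new key, so the environment is rebuilt and repacked on the next run.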
In the second job I first tried using setup-miniconda after unpacking with:
```yaml
- name: Activate Python environment
  uses: conda-incubator/setup-miniconda@v2
  env:
    CONDA: my_env
  with:
    activate-environment: ${{ env.PY_ENV }}
    auto-activate-base: false
```
but that didn't give the result I expected. Instead, it created a new environment in `my_env/envs/my_env`. I was hoping to avoid having to `source "${{ env.PY_ENV }}/bin/activate"` in each step after unpacking.
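One possible way to avoid sourcing the activate script in every step, without going through `setup-miniconda`, is to append the unpacked environment's `bin` directory to `$GITHUB_PATH`, which GitHub Actions prepends to `PATH` for all later steps in the job. This is only a sketch under assumptions: it targets Linux/macOS runners, the step names are made up, and it is not a full activation (`CONDA_PREFIX` is not set and `activate.d` scripts do not run), so packages that rely on those may misbehave.

```yaml
- name: Unpack Python environment and add it to PATH
  shell: bash
  run: |
    mkdir -p "${{ env.PY_ENV }}"
    tar -xzf "${{ env.PY_ENV }}.tar.gz" -C "${{ env.PY_ENV }}"
    # Activate once so conda-unpack can fix the hard-coded prefixes.
    source "${{ env.PY_ENV }}/bin/activate"
    conda-unpack
    # Assumption: having the env's bin directory on PATH is enough for later steps.
    echo "$PWD/${{ env.PY_ENV }}/bin" >> "$GITHUB_PATH"

- name: Run Python
  shell: bash
  run: |
    # No `source .../bin/activate` here; python is resolved through PATH.
    python -c 'import sys; print(sys.prefix)'
```

If the printed prefix does not point at the unpacked directory, something earlier on `PATH` is shadowing the environment. Note that a login shell (`bash -l {0}`) may reorder `PATH`.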
- use `setup-miniconda` with `$CONDA` set to the unpacked env

I am also trying to set up a GA with `setup-miniconda` and `conda-pack`, but I don't get that part. Does someone have a quick snippet example?
Hi all. I was interested in this as well, and I ended up with this Github Actions workflow based in part on @OlafHaag's, which is run in ubuntu, macOS and Windows. I share it here in case it's useful to others.
```yaml
name: tests

on:
  push:
  pull_request:
    types: [opened, reopened]

env:
  # Increase this value to reset cache if environment.yml has not changed.
  PY_CACHE_NUMBER: 2
  PY_ENV: cm_gene_expr

jobs:
  pytest:
    name: Python tests
    runs-on: ${{ matrix.os }}
    strategy:
      max-parallel: 4
      fail-fast: false
      matrix:
        python-version: [3.9]
        os: [ubuntu-latest, macOS-latest, windows-latest]
    steps:
      - name: Checkout git repo
        uses: actions/checkout@v2
        with:
          lfs: false

      - name: Cache conda
        id: cache
        uses: actions/cache@v2
        with:
          path: "${{ env.PY_ENV }}.tar.gz"
          key: ${{ runner.os }}-${{ env.PY_CACHE_NUMBER }}-${{ hashFiles('environment/environment.yml') }}

      - name: Setup Miniconda
        if: steps.cache.outputs.cache-hit != 'true'
        uses: conda-incubator/setup-miniconda@v2
        with:
          miniconda-version: "latest"
          auto-update-conda: true
          activate-environment: ${{ env.PY_ENV }}
          channel-priority: strict
          environment-file: environment/environment.yml
          auto-activate-base: false

      - name: Conda-Pack
        if: steps.cache.outputs.cache-hit != 'true'
        shell: bash -l {0}
        run: |
          conda install --yes -c conda-forge conda-pack coverage
          conda pack -f -n ${{ env.PY_ENV }} -o "${{ env.PY_ENV }}.tar.gz"

      - name: Unpack environment
        shell: bash -l {0}
        run: |
          mkdir -p "${{ env.PY_ENV }}"
          tar -xzf "${{ env.PY_ENV }}.tar.gz" -C "${{ env.PY_ENV }}"

      - name: Setup data and run pytest (Windows systems)
        if: runner.os == 'Windows'
        env:
          PYTHONPATH: libs/
        run: |
          ${{ env.PY_ENV }}/python environment/scripts/setup_data.py --mode testing
          ${{ env.PY_ENV }}/python -m pytest -v -rs tests

      - name: Setup data and run pytest (non-Windows systems)
        if: runner.os != 'Windows'
        shell: bash
        env:
          PYTHONPATH: libs/
        run: |
          source ${{ env.PY_ENV }}/bin/activate
          conda-unpack

          python environment/scripts/setup_data.py --mode testing

          if [ "$RUNNER_OS" == "Linux" ]; then
            coverage run --source=libs/ -m pytest -v -rs tests
            coverage xml -o coverage.xml
          else
            pytest -v -rs tests
          fi

      - name: Codecov upload
        if: runner.os == 'Linux'
        uses: codecov/codecov-action@v2
        with:
          files: ./coverage.xml
          name: codecov-${{ matrix.os }}-python${{ matrix.python-version }}
          fail_ci_if_error: true
          verbose: true
```
It would be great if it were possible to cache a conda environment. I see from here that it is possible for a vanilla python environment.