Closed: liamhuber closed this 4 months ago
Ok, so `conda-incubator/setup-miniconda@v3` has no problem giving me the requested python version, and when I `mamba update` with an env file containing `python_abi` I get the right version of that too, and ultimately `python -c "import sys; print(sys.version)"` comes out exactly as expected.
My guess is that one of the dependencies is not available at 3.9 (or at 3.12), and python gets updated alongside it.
Manually adding the unittest and coveralls-codacy dependencies still gives the correct python and python_abi versions. Time to kick it up a notch and try invoking `pyiron/actions/cached-miniforge` instead of the incubator directly.
Ok, the problem occurs when we `mamba env update` while the env file explicitly has `- python >=3.9,<3.13`; if the python version is not mentioned in the env file, it seems that we avoid upgrading the requested python version. Since we want to specify the python bounds in the env file, the remaining solution is to pass the env file directly to the conda incubator, which works fine even with the python versions present.
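For reference, a minimal sketch of what passing the env file directly looks like. The step name and file path are illustrative; `environment-file` and `activate-environment` are the relevant `setup-miniconda` inputs as far as I know:

```yaml
- name: Setup conda env directly from the env file
  uses: conda-incubator/setup-miniconda@v3
  with:
    # illustrative path; the file contains "- python >=3.9,<3.13"
    environment-file: environment.yml
    activate-environment: test
```

This lets the incubator solve the whole env (python bounds included) in one shot, instead of updating an existing env afterwards.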
What remains is to update `cached-mamba` to get the caching functionality playing nicely with the env being created sans update.
The cache needs to know the path for the conda env it's looking for. When I run `conda-incubator/setup-miniconda`, this is available under `env.CONDA`. Otherwise I can find a path under `$CONDA` (`/usr/share/miniconda` on ubuntu-latest), but I need to verify whether it's the same as I get post-incubator.
Ok, so here we hit the first and only real problem: `$CONDA` changes when we run the incubator, so it's not available a priori. After running the incubator, both `$CONDA` and `env.CONDA` point to `/home/runner/miniconda3`. This is probably because I insist on using Mambaforge. Let's switch off all the flags and see if it finds `/usr/share/miniconda` instead.
It does use `/usr/share/miniconda` when no flags are specified (although the actual conda executable reported by `which conda` changes a bit).
The minimal change would be if there were some way to know the conda env path a priori: then simply `cache/restore` and (if the restore missed, run the incubator, then) `cache/save`. Of course, I could get brute-force access by running the incubator (at most) twice: once to get the path and check the cache, and (possibly) a second time to actually build an environment. The trouble there is that the incubator takes ~25s even with no env file, and I don't really want to burn that much time.
The incubator's own docs for caching envs actually suggest doing exactly what we currently do, so that's no help.
`pyiron_base` uses Jan's suggestion of passing the env file directly, but doesn't exploit any caching, so that's not super helpful either.
The cache key doesn't depend on this, so my gut reaction is to break the cache into two pieces: first storing the env path at some fixed, known location, and second storing the actual env. We can try restoring the location cache, and only if that cache gets hit do we proceed to restore the actual cached env, using the result of the location cache for the `path` cache arg. This feels ugly though, so let's have a think about whether there's a better solution before pushing on.
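A sketch of the double-cache idea, using the split `restore`/`save` sub-actions of `actions/cache` (step ids, paths, and keys are all illustrative):

```yaml
# First cache: the env *location*, stored at a fixed, known path
- uses: actions/cache/restore@v4
  id: location-cache
  with:
    path: ~/env-location.txt
    key: env-location-${{ runner.os }}-${{ hashFiles('environment.yml') }}

# Only if the location cache hit do we read the stored path...
- name: Read env location
  if: steps.location-cache.outputs.cache-hit == 'true'
  id: location
  run: echo "path=$(cat ~/env-location.txt)" >> "$GITHUB_OUTPUT"

# ...and use it as the path for restoring the actual env
- uses: actions/cache/restore@v4
  if: steps.location-cache.outputs.cache-hit == 'true'
  with:
    path: ${{ steps.location.outputs.path }}
    key: env-${{ runner.os }}-${{ hashFiles('environment.yml') }}
```

On a miss, the incubator runs as usual and the corresponding `cache/save` steps write both pieces back.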
I couldn't think of anything more elegant, so let's go with the double-cache attack.
Aha, it's not finding the cached conda executable because `/home/runner/miniconda3/condabin` doesn't exist. And of course it doesn't, because I only cache `/home/runner/miniconda3/envs`. Let's try additionally caching the executable. Even if this works, I should be sensitive to the cache size, which is 55MB with just the env. Otherwise I guess I can make a raw environment, then reload the env, then activate it instead.
I should be able to activate a conda env from anywhere, so let's try that with the built-in miniforge installation and see if it works. It would be simpler than caching/reinstalling mambaforge.
Well, that's infuriating. I got the built-in conda executable to activate the `setup-miniconda`-cached environment, and the config exposing that environment is even persistent to the outer job, but I don't currently have a way of making that env the default loaded env outside of the initial step where we activate it.
GitHub doesn't do anything to manage conda env persistence between steps, much less between jobs. The modifications to the conda config seem persistent enough for my purposes, but one still needs to jump through hoops to activate the correct env in each step. Since the shell profile is not sourced between steps, there is no easy way to automate this (one could pass the buck to the shell profile, but then the user needs to remember to source it in each step!).
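To illustrate the hoops: without the incubator's machinery, every `run:` step would need to re-source conda's shell setup itself, something like the following (the miniconda path is the stock ubuntu-latest one observed above; the env name and test command are illustrative):

```yaml
- name: Some test step
  run: |
    # must be repeated in *every* step, since the profile isn't re-sourced
    source /usr/share/miniconda/etc/profile.d/conda.sh
    conda activate my-env
    python -m unittest discover tests
```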
As far as I understand, `conda-incubator/setup-miniconda` gets around this by really modifying `PATH` and creating new conda executables, etc., which get referenced and do maintain env persistence.
Sadly, I think the most practical thing to do is give up on the speed of leveraging the built-in conda binary and just always run `conda-incubator/setup-miniconda`: either with an environment file specified (if not caching), or a clean install followed by unpacking the cached env and activating it (when available and requested).
Uncaching, installing a blank miniconda, and then activating the uncached env works fine, but the activated env is not persistent across steps. Uncaching and installing a miniconda that explicitly activates the existing env as an arg is persistent, but the env is somehow completely empty: not a single package!
Unrelated, but tested in parallel: stopping a clean miniconda from activating the base env brings the time down from 27s to 20s (N=1 test), so not mind-blowing but worth doing if the opportunity presents itself.
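For the record, I believe the relevant knob on the incubator side is the `auto-activate-base` input (worth double-checking against the `setup-miniconda` docs):

```yaml
- uses: conda-incubator/setup-miniconda@v3
  with:
    # skip activating the base env on every shell startup
    auto-activate-base: false
```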
Super, creating an empty env and then unpacking the cache into the same location worked perfectly! I get the right packages, it's all persistent outside the called action, and it's faster than setting up the env from scratch (for the `pyiron_snippets` test env it's 14s with a blank env plus uncaching vs ~35s solving and installing from the env file).
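The winning recipe, roughly (input values and cache key are illustrative; the essential point is that the empty env and the cached env share the same location):

```yaml
# Creates an *empty* env at /home/runner/miniconda3/envs/test
# and wires up activation so it persists across steps
- uses: conda-incubator/setup-miniconda@v3
  with:
    activate-environment: test

# Unpack the cached packages into that same location
- uses: actions/cache/restore@v4
  with:
    path: /home/runner/miniconda3/envs/test
    key: conda-env-${{ runner.os }}-${{ hashFiles('environment.yml') }}
```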
With a better grasp of how this works, I think I can even streamline the process and eliminate the two-stage caching.
Rough timings, again all N=1 tests:

| Env | Set up + cache | Exploit cache | No caching |
| --- | --- | --- | --- |
| Super simple (`pyiron_snippets`) | 45s | 35s | 30s |
| A bit more complex (`pyiron_base` + `dask`, no version pins) | 1m10s | 27s | 1m8s |
This is hardly an exhaustive or scientific test, but we reassuringly see that creating complete+caching takes longer than creating, and that creating empty+uncaching is faster than creating complete+caching. Whether creating complete is faster or slower than creating empty+uncaching depends on the complexity and size of the environment. Since the losses with caching are small for simple environments but the gains are significant for more complex environments, my qualitative conclusion is that caching should be the default behaviour.
I want to put this branch through its paces exploiting it in a real test environment (especially across all the host OSs), then if that goes well I'll come back and clean this up and release it.
Taking the new `cached-miniforge` for a spin over on `pyiron_snippets`: (a) the env name was not appearing in the cache label (an easy fix), and (b) `which conda` fails on windows... I looked into that here, and the conda commands (`conda list`, ...) still work as expected; it's just `which` that borks. So let's simply ditch the `which conda` step of the readout!
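A `which`-free readout that should behave the same on all three OSs, sticking to conda's own commands (the login-shell incantation is the pattern `setup-miniconda` recommends for activated envs):

```yaml
- name: Env readout
  shell: bash -l {0}
  run: |
    conda info
    conda list
```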
Looks OK on snippets, let's clean it up.
Per #123, non-default (3.12) python version unit tests are not being run. Let's fix this.