ucsd-ets / cmm262-notebook

Use `conda-lock` to speed up builds #12

Open aryarm opened 1 year ago

aryarm commented 1 year ago

Our conda environments are re-solved upon each build but, in theory, a re-solve is only necessary when a yml environment file changes.

See also this description of how to use conda-lock with Docker.

aryarm commented 1 year ago

Winter Break Update

The second example in the link above does pretty much exactly what we want. It uses multi-stage builds to create the conda environment in a slimmed-down Linux container, so that the built conda environment can be copied into a separate container later on.

Unfortunately, it isn't clear whether each stage in a multi-stage build will be cached separately. Apparently, there is a way to do this, but I tried it and it didn't work. I got an error in the build stage:

buildx failed with: ERROR: failed to solve: specifying multiple cache exports is not supported currently

Luckily, I found this comment that seems to explain how to get it to work in GitHub Actions. The key is to create a different cache folder for each stage. I don't have time to try this right now, but it should be the next thing to look into.
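A minimal, untested sketch of the per-stage cache folder idea follows. It assumes the workflow uses `docker/build-push-action` (the error above looks like it comes from that action). The stage name `conda`, the file name `environment.yml`, the `/tmp/.buildx-cache-*` paths, and the pinned action versions are all placeholders, not taken from our repo. Each build step exports to exactly one cache folder, which should sidestep the "multiple cache exports" error.

```yaml
# Sketch only: stage names, file names, cache paths, and versions are assumptions.
name: build
on: push
jobs:
  docker:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3

      # One cache folder per stage, each restored and saved independently.
      # Keying the conda cache on the environment file means it is only
      # invalidated when that file changes.
      - name: Cache conda stage layers
        uses: actions/cache@v4
        with:
          path: /tmp/.buildx-cache-conda
          key: buildx-conda-${{ hashFiles('environment.yml') }}

      - name: Cache runtime stage layers
        uses: actions/cache@v4
        with:
          path: /tmp/.buildx-cache-runtime
          key: buildx-runtime-${{ github.sha }}
          restore-keys: |
            buildx-runtime-

      # Build only the (hypothetical) "conda" stage and export its cache
      # to its own folder.
      - name: Build conda stage
        uses: docker/build-push-action@v5
        with:
          context: .
          target: conda
          cache-from: type=local,src=/tmp/.buildx-cache-conda
          cache-to: type=local,dest=/tmp/.buildx-cache-conda,mode=max

      # Build the final image, importing both caches but exporting only one.
      - name: Build runtime stage
        uses: docker/build-push-action@v5
        with:
          context: .
          cache-from: |
            type=local,src=/tmp/.buildx-cache-conda
            type=local,src=/tmp/.buildx-cache-runtime
          cache-to: type=local,dest=/tmp/.buildx-cache-runtime,mode=max
```

The second step can import from several folders because `cache-from` accepts multiple entries; only `cache-to` is limited to a single export per build.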

According to the conda-lock documentation, multi-stage builds also make it easier to keep our conda setups lean: any cleanup can happen at the end of the build stage, so the extra layers it creates never get added to the runtime image. There's a great article about how to do that here.

aryarm commented 1 year ago

Another thing I just realized is that deleting the cache will also "unlock" our locked envs, since our current design only stores the lock files in the cache and not in our GitHub repo.

To ensure that this doesn't happen, we could consider exporting the lock file out of the build container using the outputs: parameter, like this:

          outputs: type=local,dest=locked

This way, the files in the root directory of our container should be copied into a locked/ directory within our GitHub repo after the build.
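For context, here is a rough sketch of the full step, again assuming `docker/build-push-action`, which spells this input `outputs` (plural). The `lockfile` target is a hypothetical stage that would contain only the lock files. Without a small stage like that, the local exporter writes out the entire filesystem of the final stage.

```yaml
# Sketch only: the "lockfile" stage name is hypothetical.
      - name: Export lock files from the build
        uses: docker/build-push-action@v5
        with:
          context: .
          target: lockfile
          outputs: type=local,dest=locked
```

Note that `locked/` is created in the runner's checkout, so the files would presumably still need to be committed and pushed (or uploaded as an artifact) by a later step before they persist beyond the job.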
