ACCESS-NRI / payu-condaenv

A conda (mamba) python environment for running payu
Apache License 2.0

Automatically updated dev build #22

Closed by aidanheerdegen 4 months ago

aidanheerdegen commented 8 months ago

There might be a lag between feature updates of payu and tagging and releasing a new version of this conda environment. For good reasons we might not want to create a large number of versions, which might confuse users as to which one to use. Each environment also consumes a not insignificant amount of disk space and a large number of files:

$ du -shc /g/data/vk83/apps/payu/1.1
625M    /g/data/vk83/apps/payu/1.1
625M    total
$ find /g/data/vk83/apps/payu/1.1 | wc -l
23745

Once deployed, it is difficult to remove a payu version because it may be in active use.

This creates unnecessary work for developers, or users who want to test new features, as they have to install their own local copy of payu to access bleeding edge features.

One solution is to have a payu/dev module which is updated when there are new versions of payu published on the accessnri conda channel.

The most recent version of payu in the channel can be extracted with conda and jq:

$ conda search --override-channels -c accessnri --json payu | jq '.payu[-1].version'
"1.1.2"
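For a polling job that is easier to drive from Python than from jq, the same lookup can be sketched as below. This is only an illustration: the function name `latest_version` and the trimmed-down sample JSON are hypothetical, and, like the jq filter, it trusts the order of the records rather than comparing versions itself.

```python
import json


def latest_version(search_json: str, package: str = "payu") -> str:
    """Return the version of the last record listed for `package`.

    Mirrors the jq filter `.payu[-1].version`: the records are taken in
    the order they appear in the JSON, with no sorting of versions here.
    """
    records = json.loads(search_json)[package]
    return records[-1]["version"]


# Hypothetical, trimmed-down sample of `conda search --json` output.
sample = '{"payu": [{"version": "1.1.1"}, {"version": "1.1.2"}]}'
print(latest_version(sample))  # prints 1.1.2
```

In a real job the `search_json` string would come from the output of the `conda search` command shown above.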

This would require polling the anaconda channel.

CodeGat commented 8 months ago

See https://github.com/ACCESS-NRI/payu-condaenv/issues/7 regarding polling

aidanheerdegen commented 8 months ago

On reflection, maybe we want to make this as bleeding edge as possible: not waiting for conda publishing (i.e. tagging of versions), just trying to pip install it into a payu/dev environment every night. We could short-circuit by checking for any updates in the last 24 hours and only publishing then, or have it as a push workflow that runs on push to master?
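The "any updates in the last 24 hours" check could look something like the sketch below. It is not tied to any existing workflow: the function names are made up for illustration, and it assumes the time of the latest commit is available as an ISO 8601 UTC string of the form `2024-03-01T02:00:00Z` (how that timestamp is fetched is left out).

```python
from datetime import datetime, timedelta, timezone


def parse_utc_timestamp(stamp: str) -> datetime:
    """Parse a UTC timestamp such as '2024-03-01T02:00:00Z' (assumed format)."""
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def updated_recently(last_commit: datetime, now: datetime,
                     window: timedelta = timedelta(hours=24)) -> bool:
    """True if `last_commit` is no later than `now` and within `window` of it."""
    return timedelta(0) <= now - last_commit <= window


# Hypothetical timestamps, for illustration only.
now = datetime(2024, 3, 2, 1, 0, tzinfo=timezone.utc)
print(updated_recently(parse_utc_timestamp("2024-03-01T02:00:00Z"), now))
print(updated_recently(parse_utc_timestamp("2024-02-28T02:00:00Z"), now))
```

A nightly job would only go on to rebuild payu/dev when the check returns true.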

aidanheerdegen commented 8 months ago

Third thought: we should put this in /g/data/vk83/prerelease/apps/ so users don't mistakenly use it.

aidanheerdegen commented 8 months ago

> or have it as a push workflow that runs on push to master?

Well that won't work @aidanheerdegen because this isn't the payu repo. Duh.

jo-basevi commented 8 months ago

Yeah, I think a push workflow on push to master would require storing some secrets on payu-org, as discussed in #7. So a cron job that checks for updates in the last 24 hours, or that can be triggered manually, could be the way to go.

So are you thinking of having an existing conda environment and just having an action that logs in to Gadi, updates a local clone of payu, activates an existing payu/dev environment, then runs pip install on the payu directory?

I think that might need a micromamba/conda install to create and/or activate an environment, unless we used a conda-pack environment similar to the current payu environments, just without the payu package. The module files would be the same in that case.

If we go for only updating on new tags, we can include the payu package and the action could just log in to Gadi and run conda update payu inside the payu/dev environment (if there's an install of micromamba/conda somewhere). Deploying a whole new payu/dev environment similar to the current payu deployments is also an option.

aidanheerdegen commented 8 months ago

> So are you thinking of having an existing conda environment and just having an action that logs in to Gadi, updates a local clone of payu, activates an existing payu/dev environment, then runs pip install on the payu directory?

Well, that is one possibility. It would mean we could have a dev environment that doesn't require pushing tags to payu-org/payu to build the conda package.

> I think that might need a micromamba/conda install to create and/or activate an environment, unless we used a conda-pack environment similar to the current payu environments, just without the payu package. The module files would be the same in that case.

We have the option of a static environment (one not generated by a deployment) that we activate and pip install git+https://github.com/payu-org/payu@master (GitHub no longer serves the unauthenticated git:// protocol, so https is needed), or we could deploy every time and have the pip install in the env.yaml.

https://stackoverflow.com/a/32799944

This is potentially a nice solution, as we could share the rest of the conda env specification with the main deployment to keep them in sync. How we share it I'm not quite sure, seeing as YAML doesn't support this natively. Possibilities include using jinja templating and generating the environment files with CI. Sure! Why not!? More CI is always good, right?
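As an alternative to templating, the dev specification could be derived from the release one in a small CI step. The sketch below works on the environment as a plain dict (as it would look after loading the YAML file); the function names, the sample environment contents and the https pip requirement string are all assumptions for illustration, not the contents of the real env file.

```python
import copy
import re

# Assumed pip requirement for the development build of payu.
PAYU_FROM_GIT = "git+https://github.com/payu-org/payu@master"


def package_name(spec: str) -> str:
    """Strip an optional 'channel::' prefix and any version constraint."""
    spec = spec.split("::")[-1]
    return re.split(r"[=<>!~ ]", spec, maxsplit=1)[0]


def make_dev_env(release_env: dict, package: str = "payu",
                 pip_requirement: str = PAYU_FROM_GIT) -> dict:
    """Copy `release_env`, drop the conda `package`, and pip install it instead."""
    dev = copy.deepcopy(release_env)
    deps = dev.get("dependencies", [])
    # Keep every conda dependency except the packaged payu release.
    conda_deps = [d for d in deps
                  if isinstance(d, str) and package_name(d) != package]
    # Preserve any pip requirements the release environment already lists.
    pip_deps = [req for d in deps if isinstance(d, dict)
                for req in d.get("pip", [])]
    # A pip section needs pip itself among the conda dependencies.
    if not any(package_name(d) == "pip" for d in conda_deps):
        conda_deps.append("pip")
    dev["dependencies"] = conda_deps + [{"pip": pip_deps + [pip_requirement]}]
    return dev


# Hypothetical release environment, for illustration only.
release = {"channels": ["accessnri", "conda-forge"],
           "dependencies": ["python", "payu=1.1.2"]}
print(make_dev_env(release)["dependencies"])
```

Generating the dev file this way keeps a single source of truth for the shared dependencies, at the cost of one more CI step.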

> If we go for only updating on new tags, we can include the payu package and the action could just log in to Gadi and run conda update payu inside the payu/dev environment (if there's an install of micromamba/conda somewhere). Deploying a whole new payu/dev environment similar to the current payu deployments is also an option.

I like the idea of deploying directly from the master branch on GitHub. It means we can test without tagging a new version. Otherwise we tag just to get it to build a conda package. So we can merge, test, and maybe end up reverting, or doing another PR if we borked something.

It would mean we can merge a fix like https://github.com/payu-org/payu/pull/423 and then ask the user to module load that payu environment and confirm it fixes their issue.

jo-basevi commented 8 months ago

Thanks, I didn't know about pip installing directly from git so that's really good to learn!

Ok, so options are:

  1. Have a static environment, run activate and pip install git+https://github.com/payu-org/payu@master
  2. Use env.yaml for the payu/dev deployment, but modify it to remove payu and add in the pip install of payu from git.
  3. Use jinja templating to generate the environment files for option 2.
  4. Have a separate env-dev.yml and env-release.yml

Options 2 & 3 would require a rethink of the deploy trigger for the released environments of payu, which runs when any change to the env.yml file is merged to main. For example, say there are some new features in payu's git main branch, so payu/dev requires some new dependencies, but that doesn't warrant a re-deployment of payu/1.1. The deploy trigger for release environments of payu could be changed to a manual call that is given the payu version as an input.

Having separate files for env-dev.yml and env.yml could create an issue with the files getting out of sync, but it would allow for testing of changes to env-dev.yml.
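If the two files are kept separate, a small CI check could at least report when they drift apart. The following is only a sketch under assumptions: the function names are invented, the dependency lists are illustrative rather than taken from the real files, and specs are compared as exact strings after ignoring payu and pip themselves.

```python
import re


def package_name(spec: str) -> str:
    """Strip an optional 'channel::' prefix and any version constraint."""
    spec = spec.split("::")[-1]
    return re.split(r"[=<>!~ ]", spec, maxsplit=1)[0]


def env_drift(release_deps: list, dev_deps: list,
              ignore: tuple = ("payu", "pip")) -> dict:
    """Report conda specs that appear in only one of the two dependency lists.

    Nested pip sections (dicts) and packages named in `ignore` are skipped,
    since those are expected to differ between release and dev.
    """
    def comparable(deps):
        return {d for d in deps
                if isinstance(d, str) and package_name(d) not in ignore}

    release, dev = comparable(release_deps), comparable(dev_deps)
    return {"only_in_release": sorted(release - dev),
            "only_in_dev": sorted(dev - release)}


# Illustrative dependency lists, not the real environment files.
release_deps = ["python=3.10", "payu=1.1.2", "libA"]
dev_deps = ["python=3.10", "pip", "libA", "libB",
            {"pip": ["git+https://github.com/payu-org/payu@master"]}]
print(env_drift(release_deps, dev_deps))
```

An empty result in both lists would mean the shared dependencies are still in sync; anything else is a prompt to decide whether the difference is intentional.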