Open asewnath opened 3 weeks ago
I like the test idea and having a maintained static folder on advda. To respond to your question on the draft PR, I think if use_pinned_existing fails it should warn the user and prompt them to use pinned_create, and not automatically start building, as that would be confusing for new users.
I have a way of testing new CI workflows in this repo; I will share it with you.
I would like to keep the 2-3 latest builds as a fail-safe; they could be named pinned_jedi_bundle_{date} after they are done being the main builds. Tier 2 would still run the JEDI develop nightly (separate concern).
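A minimal sketch of that fail-safe retention idea, not an existing Swell or CI utility. The function name `split_retired_builds` is hypothetical, and the `YYYYMMDD` form of the `{date}` suffix is an assumption, since the naming above does not fix a date format.

```python
import re
from datetime import datetime

# Assumed naming scheme: pinned_jedi_bundle_YYYYMMDD (the date format is a guess).
_RETIRED = re.compile(r"^pinned_jedi_bundle_(\d{8})$")


def split_retired_builds(names, keep=3):
    """Return (kept, to_remove) lists of retired build names, newest first."""
    dated = []
    for name in names:
        match = _RETIRED.match(name)
        if match is None:
            continue  # skip the live bundle and unrelated entries
        try:
            stamp = datetime.strptime(match.group(1), "%Y%m%d")
        except ValueError:
            continue  # eight digits that do not form a real date
        dated.append((stamp, name))
    dated.sort(reverse=True)  # newest build first
    ordered = [name for _, name in dated]
    return ordered[:keep], ordered[keep:]
```

The function only decides which folders to keep; actually deleting the rest is left to the caller so the policy can be reviewed before anything is removed.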
The upside of building automatically if use_pinned_existing fails (for instance, if the maintained JEDI bundle in advda isn't updated for whatever reason) is that the user then has a local pinned JEDI bundle that they can build once and then link to for future experiments.
If we make use_pinned_existing the default jedi_build_method, then users won't have to specify a directory where their own pinned JEDI bundle build is should the one in advda fail. Additionally, Swell would automatically update it to the correct hash if that local build already exists but needs to be updated. Using pinned_create, like create, forces the experiment to build JEDI in the experiment directory every time it runs.
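To make the two proposals easier to compare, here is a rough sketch of the decision being discussed. It is not Swell's implementation: the `Action` enum, `resolve_pinned_existing`, and the `auto_build` flag are all hypothetical names, with `auto_build=False` standing in for the warn-and-prompt behavior suggested earlier in the thread.

```python
from enum import Enum


class Action(Enum):
    LINK_SHARED = "link to the maintained bundle"
    LINK_LOCAL = "link to the existing local build"
    UPDATE_LOCAL = "update the local build to the pinned hashes"
    BUILD_LOCAL = "build a new local pinned bundle"
    WARN_ONLY = "warn and suggest pinned_create"


def resolve_pinned_existing(shared_ok, local_exists, local_ok, auto_build=True):
    """Pick what use_pinned_existing would do under the two proposals.

    shared_ok:    the maintained bundle matches the pinned hashes
    local_exists: the user already has a local pinned bundle
    local_ok:     that local bundle matches the pinned hashes
    auto_build:   True for the automatic fallback, False for warn-only
    """
    if shared_ok:
        return Action.LINK_SHARED
    if not auto_build:
        return Action.WARN_ONLY
    if not local_exists:
        return Action.BUILD_LOCAL
    return Action.LINK_LOCAL if local_ok else Action.UPDATE_LOCAL
```

Under the automatic fallback, a stale maintained bundle with no local build gives `Action.BUILD_LOCAL`; under warn-only, the same situation gives `Action.WARN_ONLY`.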
> The upside of building automatically if use_pinned_existing fails (for instance, if the maintained JEDI bundle in advda isn't updated for whatever reason) is that the user then has a local pinned JEDI bundle that they can build once and then link to for future experiments.
I see your point. I'm trying to imagine different use cases (and maybe even confusing myself) to figure out when a user would want to build JEDI on their own:
User X, a JEDI contributor (only a handful of users), will use jedi_bundle to build and work on their own repo branches.
User A needs a particular JEDI build right after a UFO PR is merged, say September 14th. They will use jedi_bundle with the pinned option to build their own JEDI version, and then set source and build in their Swell experiment.yaml to this folder to test first with the local build. Once they are satisfied, they will test with the advda build before they can make a Swell PR (check_hashes.py controls this now).
Do we expect User A to first use jedi_bundle to get hashes and then copy them to Swell's pinned_versions.yaml?
User B wants to test/edit some observation filters. They clone Swell right before a pinned_versions.yaml change, so when they open a PR, check_hashes catches this and throws an error. They can then test with the new pinned JEDI build. If issues arise, they become User A.
User C (long term, needs further pondering) wants to run a particular suite with particular JEDI/GEOS version(s). For instance, let's imagine we have a GEOS-ADAS suite. We would have pinned builds specified for those suites that get updated infrequently (once every couple of months).
Perhaps you have a use case in mind where use_pinned_existing should start a new JEDI build?
> Additionally, Swell would automatically update it to the correct hash if that local build already exists but needs to be updated. Using pinned_create, like create, forces the experiment to build JEDI in the experiment directory every time it runs.
Even for this case, I always think a clean build in a new folder is better. Wei and Jianjun had issues while trying to rebuild in the same build folder.
I'm going to pause creating a GitHub Action to check hashes for now. @Dooruk, using your instructions, I tried adding a test to test_swell.yml and tried running Swell's Test_CI_Application action. This fails immediately because of the error "Tier 2 is already running". I'm not able to test using Tier1 test yamls unless I have a PR to the main branch in CI-workflows.
> I'm going to pause creating a GitHub Action to check hashes for now. @Dooruk, using your instructions, I tried adding a test to test_swell.yml and tried running Swell's Test_CI_Application action. This fails immediately because of the error "Tier 2 is already running". I'm not able to test using Tier1 test yamls unless I have a PR to the main branch in CI-workflows.
Yeah, Tier2 is hitting the __running__ switch file issue, but I'm not sure about the Tier1 yaml change issue you are encountering. Do you mean your open PR should be merged first?
I mean that it seems the Tier1 runner doesn't work for any branch besides main (https://github.com/GEOS-ESM/swell/blob/597bbbe9c87867178a130a10c6d07418d1a212d8/.github/workflows/tier1_application_discover.yml#L15). When I tried to point to a different branch, nothing would run. Maybe this is an issue on my end.
If I can only run Tier1 tests on the main branch, then I'd have to continually PR to the main branch to get the check hashes test working, which doesn't sound like a good idea.
@jardizzo @ashiklom @Dooruk
I have a pinned-version JEDI bundle in advda here: /discover/nobackup/projects/gmao/advda/pinned_jedi_bundle. This is the JEDI bundle that Swell points to if a user wants to use an existing pinned JEDI build (https://github.com/GEOS-ESM/swell/pull/433). The JEDI repositories in this bundle are currently pinned for August 31, as @Dooruk recommends.
I would like to make a test that uses Swell's check_hashes tool to check whether the hashes in /discover/nobackup/projects/gmao/advda/pinned_jedi_bundle correspond to the hashes tracked in Swell (https://github.com/GEOS-ESM/swell/blob/feature/pinned_versions_support/src/swell/utilities/pinned_versions/pinned_versions.yaml). This way, we can be sure that these hashes match during Swell PRs.
Let me know your thoughts on this.
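For discussion, the comparison such a test would perform could look roughly like the sketch below. This is not Swell's check_hashes code: `read_head_hash` and `find_hash_mismatches` are hypothetical helpers, and the plain "repo name to commit hash" mapping is an assumed shape, not necessarily the layout of pinned_versions.yaml.

```python
import subprocess


def read_head_hash(repo_dir):
    """Return the commit hash currently checked out in repo_dir (needs git)."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "rev-parse", "HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def find_hash_mismatches(pinned, actual):
    """Map repo name -> (pinned_hash, actual_hash) for each repo that disagrees.

    Both arguments are assumed to be dicts of repo name -> commit hash.
    A repo missing from `actual` is reported with an actual hash of None.
    """
    mismatches = {}
    for repo, expected in pinned.items():
        found = actual.get(repo)
        if found != expected:
            mismatches[repo] = (expected, found)
    return mismatches
```

A CI test could build `actual` by calling `read_head_hash` on each repository directory inside the bundle and fail when `find_hash_mismatches` returns a non-empty dict, printing the offending repos so it is clear which ones drifted.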