payu-org / payu

A workflow management tool for numerical models on the NCI computing systems
Apache License 2.0
21 stars 27 forks source link

Don't require `cice_in.nml` namelist in CICE5 restarts #507

Closed blimlim closed 2 months ago

blimlim commented 2 months ago

Attempts to close #499

This pull request modifies the cice.py driver, by pulling the timing calculations out from setup() into a separate calc_runtime() method. This method is then overridden in the cice5.py driver, allowing the cice_in.nml namelist file to be absent from the restart directory, since OM2 does not use the information from these calculations. This in turn will allow OM2 to be run after using payu clone -r with a specified OM2 restart directory.

Most of the code changes are in the tests, which require a bit of intricate set up. Given that we're skipping payu setup and directly running model.setup() (in order to avoid more complicated pre-test setup), it's difficult to know how well we've matched the conditions for setting up a real experiment. E.g. When setting up the tests, inadvertent changes to the directories that existed at the start of the test led to payu using different work restart directories during different parts of the setup, and different restart files were moved to different places in a way that was tricky to understand. And so while the tests are working, they do feel a bit precarious. If you have any suggestions that would be very welcome!

The main relevabt tests are the test_restart_clone tests for both CICE 4 and CICE 5. These do the following:

  1. Create a restart directory with fake restart files required by CICE4/5.
  2. Clone an experiment and point to the created restart directory.
  3. Set up CICE within the cloned control directory to see if it works.

The CICE 5 version of test_restart_clone passes with the new cice driver changes, and would have failed previously.

One thing I'm unsure about: Do you think it's necessary to include the cloning steps in the tests. I think the main issue was payu requiring restart files that cice5 didn't use. This came up when trying to clone, but I don't think the issue came from the cloning step itself. Perhaps it's still useful to be testing the whole clone/setup process though.

Any feedback/suggestions would be very welcome! I'd be keen to get some other ideas before trying to finalise this, and so have marked this PR as a draft.

coveralls commented 2 months ago

Coverage Status

coverage: 57.164% (+1.4%) from 55.782% when pulling c39f45ebcf6fc0d4273abe87262ded1d1a657924 on iss499 into bdf0c66f143d64b2ecf06289d5cb93e994822f68 on master.

jo-basevi commented 2 months ago

Thanks for putting in these changes!

One thing I'm unsure about: Do you think it's necessary to include the cloning steps in the tests. I think the main issue was payu requiring restart files that cice5 didn't use. This came up when trying to clone, but I don't think the issue came from the cloning step itself. Perhaps it's still useful to be testing the whole clone/setup process though.

I think cloning wouldn't be necessary in these tests, as when a restart is added through cloning, payu just sets the restart option to config.yaml, e.g.

restart: <fullpath/to/restart>

This can be replicated in tests by configuring restart in the test config.yaml files as part of test setup. I think the restart path for cloning is currently tested for in test_branch.py.

pep8speaks commented 2 months ago

Hello @blimlim! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 77:80: E501 line too long (113 > 79 characters)

Comment last updated at 2024-09-17 05:32:14 UTC
blimlim commented 2 months ago

I think cloning wouldn't be necessary in these tests, as when a restart is added through cloning, payu just sets the restart option to config.yaml

Awesome thank you! I've removed the cloning steps from the tests, so that they now just test setting up the cice models with and without restart paths specified in the config.yaml.

Is it possible to set-up the cice5 tests to import more code from test_cice.py instead of copying it ? There may be enough differences its not worth doing ?

Good question, I think the tests are very similar between the two cice versions. Now that the tests have been updated, I'll test this out in a separate branch to see how sharing tests between the two might look.

blimlim commented 2 months ago

Ok hmm I'm finding it a bit challenging to share the test code between the cice 4 and 5 tests. I've temporarily uploaded a (not quite working) attempt here: 377cd5f

My idea was to build a base test class, which separate TestCice4Setup and TestCice5Setup classes added specific details to. However the two cases were doing a few different things with their assertions, and my attempts at handling this got a bit messy. The bigger problem I'm having though is correctly handling all the fixtures and constants for the two cases. at the moment the tests are failing as they're mixing correct and incorrect ones. I think it should be possible to come up with something better, but it might require a bit of a rework of the pre-test setup.

Let me know if you think the (attempted) changes do look any cleaner though, and if it might be worth trying to get them working.

anton-seaice commented 2 months ago

Happy to take your advice on whether moving to more generalised tests add benefit or confusion :)

blimlim commented 2 months ago

I found it a bit difficult (and in the end wasn't successful) in getting the imported tests to use the correct versions of constants and fixtures, and so I'm thinking for the moment it might be easiest to keep the tests separate.

I've just added a couple of small fixes: Re-adding in the assert (prior_runtime_seconds % setup_nml['dt'] == 0), testing that run time calculations are correct for a few different values (previously I was just testing one), and some smaller cleaning up, and so I think it should be ready for final review :)

blimlim commented 2 months ago

Thanks for the reviews and help everyone!