payu-org / payu

A workflow management tool for numerical models on the NCI computing systems
Apache License 2.0

Support date based restart_freq #358

Closed aidanheerdegen closed 8 months ago

aidanheerdegen commented 10 months ago

Currently payu only supports pruning restarts at an integer run interval, set with the restart_freq configuration option.

From the docs:

restart_freq (Default: 5) Specifies the rate of saved restart files. For the default rate of 5, we keep the restart files for every fifth run (restart004, restart009, restart014, etc.).

Intermediate restarts are not deleted until a permanently archived restart has been produced. For example, if we have just completed run 11, then we keep restart004, restart009, restart010, and restart011. Restarts 10 through 13 are not deleted until restart014 has been saved.

restart_freq: 1 saves all restart files.
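To make the documented rule concrete, here is a minimal sketch of which restarts survive after a given run. It is only an illustration of the behaviour described in the docs, not payu's implementation; the function name `restarts_to_keep` and its arguments are hypothetical.

```python
def restarts_to_keep(last_run: int, restart_freq: int = 5) -> list[int]:
    """Return the restart indices retained once run `last_run` (0-based) is done.

    Hypothetical helper that mirrors the documented behaviour only.
    """
    # Permanently archived restarts: every restart_freq-th run (4, 9, 14, ...
    # for the default of 5).
    permanent = [i for i in range(last_run + 1) if (i + 1) % restart_freq == 0]

    # Intermediate restarts newer than the latest permanent one are kept
    # until the next permanent restart has been produced.
    last_permanent = permanent[-1] if permanent else -1
    intermediate = list(range(last_permanent + 1, last_run + 1))

    return permanent + intermediate


print(restarts_to_keep(11))  # [4, 9, 10, 11]
print(restarts_to_keep(14))  # [4, 9, 14]
```

With restart_freq set to 1 every index is a permanent restart, so nothing is pruned.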

The proposal is to support date-based values for restart_freq. One option would be to support a subset of pandas offset frequency aliases, e.g.

restart_freq: 1YS

to prune restarts but retain restarts at the start of each year.

restart_freq: 50Y

to prune restarts but retain restarts for every 50 years from the date of the beginning of the experiment.

restart_freq: 6M

to prune restarts but retain restarts every 6 months.
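One possible reading of these examples can be sketched with the standard library alone. This is an assumption-laden illustration, not a proposed implementation: it treats each alias purely as an interval length measured from the first restart, ignores the start-of-period versus end-of-period anchoring that pandas distinguishes, and compares dates at month resolution. The names `parse_freq`, `month_index` and `select_restarts` are hypothetical, and the restart dates are assumed to be known already.

```python
import re
from datetime import date

# Months per unit for the small, hypothetical subset of aliases handled here.
_MONTHS = {"Y": 12, "YS": 12, "M": 1, "MS": 1}


def parse_freq(alias: str) -> int:
    """Convert an alias such as '6M' or '1YS' into a number of months."""
    match = re.fullmatch(r"(\d*)(YS|Y|MS|M)", alias)
    if match is None:
        raise ValueError(f"unsupported restart_freq: {alias!r}")
    count = int(match.group(1) or 1)
    if count < 1:
        raise ValueError("restart_freq count must be at least 1")
    return count * _MONTHS[match.group(2)]


def month_index(d: date) -> int:
    """Number of months since year 0, so dates compare at month resolution."""
    return d.year * 12 + d.month - 1


def select_restarts(dates: list[date], alias: str) -> list[date]:
    """Return the restart dates to retain; all others could be pruned."""
    step = parse_freq(alias)
    ordered = sorted(dates)
    if not ordered:
        return []
    # The first restart is always kept and anchors the interval.
    next_due = month_index(ordered[0])
    kept = []
    for d in ordered:
        idx = month_index(d)
        if idx >= next_due:
            kept.append(d)
            # Advance to the first due point after this restart.
            while next_due <= idx:
                next_due += step
    return kept


# Quarterly restarts over 18 months, pruned to one every 6 months.
quarterly = [date(1900, 1, 1), date(1900, 4, 1), date(1900, 7, 1),
             date(1900, 10, 1), date(1901, 1, 1), date(1901, 4, 1)]
print(select_restarts(quarterly, "6M"))
```

Anchoring to the first restart matches the "from the date of the beginning of the experiment" wording of the 50Y example; a calendar-anchored alias such as 1YS would need extra handling if restarts do not fall on the period boundary.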

This is potentially very helpful for syncing (#200), as it would let payu know in advance which restarts will be retained, and so should be synced, and which will be deleted, and so need not be.

Currently, date-based pruning of restarts for COSIMA experiments is done with this script (or a similar version):

https://github.com/COSIMA/1deg_jra55_ryf/blob/master/tidy_restarts.py