payu-org / payu

A workflow management tool for numerical models on the NCI computing systems
Apache License 2.0

Sweep before changing branches #418

Open aidanheerdegen opened 4 months ago

aidanheerdegen commented 4 months ago

The new branching capabilities are brilliant!

But I forget to run payu sweep before I payu checkout another branch. Then the next time there is a sweep, either intentional or via payu run -f, all the logs get copied into the wrong archive directory.

I propose payu does a sweep before changing branches. I think from a design point of view it makes sense too: the user is presented with a clean new experiment control directory.
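
A minimal sketch of the proposed behaviour, not payu's actual implementation. The helper names (sweep_logs, checkout_with_sweep), the archive layout, and the log-file pattern <jobname>.o<id> / <jobname>.e<id> are assumptions for illustration, and the real branch change is passed in as a callable.

```python
import re
import shutil
from pathlib import Path


def sweep_logs(control_dir, archive_dir, jobname):
    """Move PBS-style logs (<jobname>.o<id>, <jobname>.e<id>) into archive_dir.

    Hypothetical helper: the file pattern and archive layout are assumptions.
    """
    control_dir, archive_dir = Path(control_dir), Path(archive_dir)
    pattern = re.compile(re.escape(jobname) + r"\.[oe]\d+")
    archive_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(control_dir.iterdir()):
        if path.is_file() and pattern.fullmatch(path.name):
            shutil.move(str(path), str(archive_dir / path.name))
            moved.append(path.name)
    return moved


def checkout_with_sweep(control_dir, archive_dir, jobname, branch, checkout):
    """Sweep logs belonging to the current branch, then switch branches.

    `checkout` is any callable that performs the actual branch change.
    """
    moved = sweep_logs(control_dir, archive_dir, jobname)
    checkout(branch)
    return moved
```

Sweeping first means the logs are archived while the control directory still points at the experiment that produced them.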

aidanheerdegen commented 4 months ago

The PBS log files are named solely after the jobname config value or the name of the directory that contains them

https://github.com/payu-org/payu/blob/master/payu/schedulers/pbs.py#L60

and the same logic is repeated in sweep

https://github.com/payu-org/payu/blob/master/payu/experiment.py#L907-L908
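
To illustrate why logs from different branches cannot be told apart, here is a rough sketch of the naming logic described above. It is an approximation for illustration, not a copy of the linked payu code; the function name default_jobname, the dict-style config, and the example path are assumptions.

```python
import os


def default_jobname(config, control_dir):
    """Use the `jobname` config value, else the control directory's name."""
    return config.get("jobname") or os.path.basename(os.path.normpath(control_dir))


# Two branches checked out in the same control directory yield the same
# job name, so their log files cannot be distinguished by name alone.
name_a = default_jobname({}, "/path/to/1deg_jra55_ryf")
name_b = default_jobname({}, "/path/to/1deg_jra55_ryf")
```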

It would be good if the PBS jobname, and therefore the STDOUT and STDERR filenames, included the UUID in some way.

One possibility is to make the PBS job name the same as the experiment name. A wrinkle is that the PBS job name has a maximum length of 15 characters, which would often truncate the experiment name before the UUID.

For example, here is an experiment name:

1deg_jra55_ryf-no_ncimpi-16fefa3b

One possibility is to retain the whole truncated UUID and shorten the rest to fit within the 15-character limit. For the name above this would be

1deg_j-16fefa3b

That shows so little information about the experiment that it probably doesn't make much sense.

Another option is to truncate the ID further, say to 4 characters, so that it would look like

1deg_jra55-16fe
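
Both truncation options can be sketched with one small function. This is a hypothetical helper, not part of payu: the name pbs_jobname, its parameters, and the assumption that the experiment name ends in "-<truncated uuid>" are all illustrative.

```python
PBS_JOBNAME_MAX = 15  # limit quoted in this thread


def pbs_jobname(experiment_name, uuid_chars=4, max_len=PBS_JOBNAME_MAX):
    """Shorten '<name>-<uuid>' to max_len, keeping uuid_chars of the UUID.

    Hypothetical helper; assumes the name ends in '-<truncated uuid>'.
    """
    base, sep, uuid = experiment_name.rpartition("-")
    if not sep:
        # No UUID suffix found: fall back to plain truncation.
        return experiment_name[:max_len]
    suffix = "-" + uuid[:uuid_chars]
    return base[: max_len - len(suffix)] + suffix


name = "1deg_jra55_ryf-no_ncimpi-16fefa3b"
full_uuid = pbs_jobname(name, uuid_chars=8)   # '1deg_j-16fefa3b'
short_uuid = pbs_jobname(name, uuid_chars=4)  # '1deg_jra55-16fe'
```

A shorter UUID fragment leaves more room for the readable part of the name, at the cost of a higher chance of two experiments sharing a job name.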

aidanheerdegen commented 4 months ago

Or we could just leave it as is, because this is difficult.

aidanheerdegen commented 4 months ago

Should we be auto-sweeping all but the last PBS log files from the control dir?