payu-org / payu

A workflow management tool for numerical models on the NCI computing systems
Apache License 2.0

Reinvigorate remote_archive to support automatic backup of data from ephemeral scratch storage #200

Closed aidanheerdegen closed 12 months ago

aidanheerdegen commented 5 years ago

NCI is upgrading its HPC and at the same time replacing the short filesystem with scratch, which is time limited.

payu is so well written (props @marshallward) that there is only one mention of the actual path /short in the entire codebase

https://github.com/payu-org/payu/blob/f567db9dad9fdd219ea632b76d7d715e8e65e457/payu/laboratory.py#L57

(we'll ignore hard coded paths in profiler modules)

So, at the very minimum, default_short_path should be changed to scratch to support the new machine.

However, as scratch is time limited it is no longer a good fit for the current payu pattern, where bin, input and codebase are stored in the same laboratory as work and archive.

https://github.com/payu-org/payu/blob/f567db9dad9fdd219ea632b76d7d715e8e65e457/payu/laboratory.py#L45-L49

With strict time-limited deletion of files on scratch, the only directory that is a clear fit for this pattern is work. The archive directory could live on scratch, with some syncing to a permanent data store, but I think payu should also support archive not being physically co-located with work.

Thoughts?

rmholmes commented 5 years ago

This sounds good to me. work would be on scratch. archive could be specified either on scratch (while setting up auto-syncing to a permanent data store for archive) or with the rest presumably at /g/data/--project--/--user--/? Sounds like you just need a new default_gdata_path to set this second directory.

If someone uses scratch for their archive and doesn't set up an auto-sync, can we get a warning set up about the time limit (or will NCI be providing some kind of warning for when files will be deleted)?

marshallward commented 5 years ago

There is a function in Experiment named remote_archive which was once used to transfer model output over to MDSS, back when storage limits were so severe that model runs would often go over quota.

As storage became less of a problem, this function became deprecated and is now basically a zombie function which cannot be called. But it would not be much work to reinstate something like this.

Which is to say that we've been here before, albeit under different constraints (space vs time), but it should not be a problem to either append this function to the end of archive() or integrate into the payu archive command.

aidanheerdegen commented 1 year ago

Ok. I think this is well overdue to be implemented.

aidanheerdegen commented 1 year ago

The COSIMA issue linked above refers to adding syncing of restarts to their existing sync script

https://github.com/COSIMA/01deg_jra55_iaf/blob/master/sync_data.sh

This script is called with the postscript hook in the payu config:

https://github.com/COSIMA/01deg_jra55_iaf/blob/master/config.yaml#L77

Not everything in that script is appropriate for including in payu, but it gives a good overview of what syncing capability is required.

Also important to note are the rsync options that are used. It isn't feasible to use -a with rsync, as it preserves the group (project in NCI speak) of the origin, whereas /g/data directories generally have the setgid bit set so that folders and files copied there take the same project code as the enclosing folder. It is important to keep this behaviour, as file and folder accounting is (mostly) done by the group (project) of the file/folder.

Using -a with rsync undoes this behaviour.
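To make the flag choice concrete: -a is shorthand for -rlptgoD, so the group-preserving -g can be avoided by spelling the flags out. The following is only a sketch under assumptions: `build_rsync_command` is a hypothetical helper (not part of payu), and it also drops -p and -o so that new directories at the destination can pick up the setgid bit and group from their parent.

```python
import shlex

# -a expands to -rlptgoD. Leaving out -g (group), -p (permissions) and
# -o (owner) lets setgid directories at the destination assign the group.
BASE_FLAGS = "-rltD"


def build_rsync_command(source, destination, excludes=(), extra_flags=""):
    """Return an rsync argument list that does not preserve the group.

    Hypothetical helper for illustration; not part of payu.
    """
    cmd = ["rsync", BASE_FLAGS]
    # Split user-supplied flags the way a shell would
    cmd += shlex.split(extra_flags)
    for pattern in excludes:
        cmd.append(f"--exclude={pattern}")
    # Trailing slash: copy the contents of source, not the directory itself
    cmd += [source.rstrip("/") + "/", destination]
    return cmd
```

Passing the resulting list to `subprocess.run` without a shell means the exclude patterns reach rsync unexpanded.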

aidanheerdegen commented 1 year ago

I've created issue #358 and cross-referenced it here. Syncing restarts that will later be deleted is an issue.

It could be that the logic for restarts is different than for outputs: could use rsync machinery to delete restarts at the destination once they're deleted/pruned at the source. This is problematic, though, if restarts are deleted by time-based purging of scratch, as they would then later be deleted at the destination too.

Another possibility is to have an option to not sync restarts, then tidy them in a separate step and then turn on restart syncing and then call something like

payu sync

If payu did support date-based restart frequencies then it potentially has the information it needs to make sure it doesn't sync restarts that will later be pruned.

It would probably require inspection of restart files (so retaining a dependency on the netCDF4 python module, or adding a dependency on xarray), and might not work for all models, so might need to be driver dependent.

jo-basevi commented 1 year ago

Just documenting some notes here:

In the COSIMA issue above, it was also referenced that payu doesn't automatically collate the most recent restart. If rsync was set to exclude uncollated files, then the most recent restart wouldn't be synced. So payu collate -d archive/restart<num> may need to be another step before syncing restarts.

One payu sync command could potentially collate the latest restart if required, then run a user-script before any syncing (to tidy up any restarts) and then finally rsync the restarts.

Otherwise, auto syncing restarts could be set up to only sync restarts using the integer restart frequency (or using date-based restart frequency).
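As a rough illustration of the integer-frequency option, the selection could look like the sketch below. It assumes restart directories are named restart<num> (as in archive/restart<num> above) and that the restarts to keep are those whose number is a multiple of the restart frequency; `restarts_to_sync` is a hypothetical name, not an existing payu function.

```python
import re

# Assumed naming scheme: restart000, restart001, ...
RESTART_DIR = re.compile(r"^restart(\d+)$")


def restarts_to_sync(dir_names, restart_freq):
    """Return restart directories whose number is a multiple of restart_freq.

    Hypothetical helper for illustration; names that do not look like
    restart<num> (e.g. output directories) are ignored.
    """
    selected = []
    for name in dir_names:
        match = RESTART_DIR.match(name)
        if match and int(match.group(1)) % restart_freq == 0:
            selected.append(name)
    # Sort numerically so restart010 follows restart005
    return sorted(selected, key=lambda n: int(RESTART_DIR.match(n).group(1)))
```

Restarts that pruning would later remove are never selected, so they are never copied to the destination in the first place.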

To automatically sync outputs, the sync command could be run where the postscript hook is run, which is at the end of archive if not collating, otherwise after collation.

The sync config could look something like:


```yaml
sync:
    enable: false                       # default: false

    # PBS specific:
    queue: copyq                        # default: copyq
    walltime: "10:00:00"                # e.g. 10:00:00
    mem: 2GB                            # default: 2GB
    ncpus: 1                            # default: 1

    # rsync specific:
    directory: /path/to/destination     # destination dir to copy data to
    rsync_flags: ""                     # string of any additional flags for rsync

    # For exclusions, could add string to rsync_flags (e.g. "--exclude *.nc.* --exclude iceh.????-??-??.nc --exclude *-DEPRECATED --exclude *-DELETE --exclude *-IN-PROGRESS")
    # or have a parameter with a list of strings, e.g.:
    exclude:
        - "*.nc.*"
        - "iceh.????-??-??.nc"
        # etc

    # For restarts
    restarts:
        enable: false                   # default: false
        collate_latest: false           # default: false; collate latest restart prior to rsync

userscripts:
    # User-defined scripts/commands to be called before remote syncing model archive
    # e.g. in access-om2: a script to concatenate cice daily files and delete cice log files with only a 105-char header
    sync: tidy_archive.sh
```
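One way the defaults in the proposed config above might be applied is sketched here. This is an assumption about how such a section could be resolved, not how payu does it: `SYNC_DEFAULTS` and `resolve_sync_config` are hypothetical names, and the key names simply mirror the proposal.

```python
import copy

# Defaults taken from the proposed config; key names are illustrative.
SYNC_DEFAULTS = {
    "enable": False,
    "queue": "copyq",
    "mem": "2GB",
    "ncpus": 1,
    "rsync_flags": "",
    "exclude": [],
    "restarts": {"enable": False, "collate_latest": False},
}


def resolve_sync_config(user_config):
    """Overlay user settings on the defaults, merging nested mappings.

    Hypothetical helper for illustration; not part of payu.
    """
    # Deep copy so the module-level defaults are never modified
    resolved = copy.deepcopy(SYNC_DEFAULTS)
    for key, value in (user_config or {}).items():
        if isinstance(value, dict) and isinstance(resolved.get(key), dict):
            # Nested section (e.g. restarts): keep defaults for unset keys
            resolved[key].update(value)
        else:
            resolved[key] = value
    return resolved
```

With this approach a user who only sets `enable` and `directory` still gets the copyq queue and leaves restart syncing switched off.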