Open cwhitlock-NOAA opened 5 hours ago
Thanks for organizing this!
The forward-looking approach, in which the workflow does not need the "-t YYYY" clue to know that year YYYY's output is ready for postprocessing, is even easier than how you had it. The Cylc workflow itself can know whether the history files are available, so strictly speaking it should not need the wrapper to tell it anything about when new history files arrive.
The relevant Cylc feature is custom trigger functions (xtriggers).
We don't yet use it in fre-workflows, but the pp-shield prototype cylc template does use it. e.g. (https://gitlab.gfdl.noaa.gov/fre2/workflows/pp-shield/-/blob/main/include/shield/shield.cylc?ref_type=heads#L32)
    [[xtriggers]]
        history_complete = history_complete(model={{ MODEL }}, \
            point=%(point)s, \
            env_script={{ NGGPS_DIR }}/parm/set_vars.sh, \
            state_file_template={{ NGGPS_DIR }}/do_pp/LOGS/NOTE_STATUS.log)
    [[graph]]
        T00,T06,T12,T18 = """
            @history_complete => make-pp-script => run-pp1 => run-pp2<history> => runtrak?
        """
The "history_complete" trigger there is called repeatedly by the Cylc scheduler, and when it passes for a given cycle point (a year, in the PP context), the make-pp-script task is triggered.
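To make the mechanism concrete, here is a minimal sketch of what a custom trigger function can look like. It is not the pp-shield implementation: the function name `history_available`, its arguments, and the `<point>.nc.tar` file naming are assumptions made for illustration. The only part taken from Cylc itself is the contract that an xtrigger function returns a `(satisfied, results)` tuple.

```python
from pathlib import Path


def history_available(point, history_dir, suffix=".nc.tar"):
    """Hypothetical xtrigger: is the history file for this cycle point present?

    The scheduler calls an xtrigger function repeatedly until it reports
    success. It must return a tuple (satisfied, results), where results is a
    dict that is made available to the tasks depending on the trigger.
    """
    # Assumed naming convention (illustrative only): <history_dir>/<point><suffix>
    history_file = Path(history_dir) / f"{point}{suffix}"
    if history_file.is_file():
        return True, {"history_file": str(history_file)}
    # Not ready yet: the scheduler will simply call the function again later.
    return False, {}
```

In a workflow, a function like this would typically live in a module of the same name under the workflow's `lib/python/` directory and be referenced from `[[xtriggers]]` in the same way as `history_complete` above.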
So in an ideal world, we can outsource the history-file-presence logic entirely to Cylc. But until we master the Cylc triggers, we'll want the wrapper to be able to tell the workflow to start a particular year (cycle point) of postprocessing.
I'd like to make clear that we don't need the history_complete trigger either. The current just-enough-functionality plan for fre pp run triggers a couple of possible cylc commands:
The minimal functionality to get postprocessing updated with new data is more like the following:
    # max_iterations is a placeholder: set it far higher than we think we need
    for n in $(seq 1 "$max_iterations"); do
        # fre pp wrapper is smart enough to check for new data and tell canopy it is present if data is there (cylc run, reload or trigger)
        fre pp wrapper -e "$experiment" -p "$platform" -t "$target"
        sleep 14400  # 4 hours
    done
see: https://github.com/NOAA-GFDL/fre-cli/tree/main/fre/pp#readme
After talking with Chris, this functionality seems to come in 3 stages:
Yes, agreed. Cylc external triggers are the ultimate solution for letting workflows know when input data is available. For now, regular interactive Cylc task control (cylc trigger WORKFLOW//CYCLE-POINT/TASK), whether typed by humans or issued by fre pp wrapper, is perfectly fine, and we know it well.
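As a rough sketch of how a wrapper could issue that same command programmatically, the helper below only builds the argument list for `cylc trigger`. The helper name and the example workflow, cycle point, and task values are placeholders, not taken from fre-cli.

```python
import shlex


def cylc_trigger_command(workflow, cycle_point, task):
    """Build the argument list for `cylc trigger WORKFLOW//CYCLE-POINT/TASK`."""
    if not (workflow and cycle_point and task):
        raise ValueError("workflow, cycle_point and task must all be non-empty")
    return ["cylc", "trigger", f"{workflow}//{cycle_point}/{task}"]


# Placeholder values for illustration only.
cmd = cylc_trigger_command("my-workflow", "1980", "make-pp-script")
print(shlex.join(cmd))
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would run the command without shell quoting problems; the sketch stops short of that so it can be read and tested without a Cylc installation.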
Prior versions of fre relied on specifying the last (and sometimes first) years of post-processed data to control how many years of data were processed at once. This is how fre dealt with chunks of history files being copied over from wherever the model was running; the postprocessing syntax looked a bit like this:
where the large pause between successive calls to the wrapper equivalent in Bronx or earlier gave time for new files to be transferred over to the pp nodes. fre was smart enough to know that prior years were already post-processed, so only the data in the range ($year-1) to $year needed to be processed by each call.
This functionality is not present in the fre-cli codebase, and if we want to maintain backwards compatibility on this particular command we'd need to change our command-line options. However, we may NOT need it: canopy is capable of pausing jobs and running again when new data is present. The logic flow for that would look more like this:
Whether or not we implement this is going to depend a lot on whether the users miss this functionality, but for now it's important to remember that this functionality is NOT present in fre-cli.