cylc / cylc-flow

Cylc: a workflow engine for cycling systems.
https://cylc.github.io
GNU General Public License v3.0

Handling future final-incomplete tasks #6383

Open hjoliver opened 2 months ago

hjoliver commented 2 months ago

This creates three final-incomplete tasks (i.e., tasks with a final status and incomplete outputs) ahead of the flow:

[scheduler]
    allow implicit tasks = True
[scheduling]
    [[graph]]
        R1 = "foo => bar => f1 & f2 & f3"
[runtime]
    [[foo]]
        script = """
            cylc trigger //1/f1  # with --wait if you like
            cylc set -o failed //1/f2
            cylc set -o expired //1/f3
        """
    [[f1]]
        script = false

On running this,

Final-incomplete tasks are (supposed to be) retained in n=0 as a safety net: for visibility, and to eventually stall the workflow unless they are manually completed or removed first. However, the stall itself just means that the user has not dealt with the problem by the time the scheduler has run out of other things to do. Once the problem is apparent, the user should be able to respond appropriately in the moment, to prevent the future stall.

To "fix" a final-incomplete task we can (+):

problems

  1. final-incomplete tasks created by manual output setting are "hidden" in the DB until the flow arrives at some future time
    • a serious problem already exists, but it is not very visible, which is not conducive to fixing the problem in the moment
  2. removal does not have the desired result on these tasks, because they have not yet blocked the flow
    • when the flow arrives it will be as if the manual set had never been done, and the task will run again
    • so "fix" option (c) does not work as intended for final-status tasks ahead of the flow

solutions

  1. final-incomplete tasks created by manual output setting should be spawned into n=0, for visibility
    • (I think it was our intention that all final-incomplete tasks would be held in n=0)
    • (I haven't thought of any downside to doing this)
  2. cylc remove needs an option to "remember" the removal rather than erasing the history
    • so that removing an n=0 task ahead of time has the same effect as doing it after the flow arrives
    • (aside: we also need the same thing internally, e.g. for removal by suicide trigger, to avoid respawning a removed task)
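
The difference between erasing and "remembering" a removal can be illustrated with a toy Python sketch. This is only a simplified model for discussion, not cylc-flow's task pool or database code: `ToyScheduler`, the `remember` argument, and `flow_arrives` are hypothetical names invented here, and the real behaviour involves flows, prerequisites, and outputs that are ignored below.

```python
from dataclasses import dataclass, field

# Final statuses that leave a task incomplete in this toy model
# (assumption: success is the only required output).
FINAL_INCOMPLETE = {"failed", "expired"}


@dataclass
class ToyScheduler:
    """Toy model only: not cylc-flow's real task pool or database."""

    history: dict = field(default_factory=dict)  # task id -> recorded state
    pool: set = field(default_factory=set)       # stand-in for the n=0 window

    def set_output(self, task: str, state: str) -> None:
        """Manually set a final output on a task ahead of the flow."""
        self.history[task] = state
        if state in FINAL_INCOMPLETE:
            # Proposed solution 1: spawn into n=0 for visibility.
            self.pool.add(task)

    def remove(self, task: str, remember: bool = False) -> None:
        """Remove a task; `remember` is a hypothetical option."""
        self.pool.discard(task)
        if remember:
            # Proposed solution 2: keep a record of the removal.
            self.history[task] = "removed"
        else:
            # Erasing the history, as described in problem 2.
            self.history.pop(task, None)

    def flow_arrives(self, task: str) -> bool:
        """Return True if the task would run when the flow reaches it."""
        return task not in self.history


sched = ToyScheduler()

sched.set_output("1/f2", "failed")
sched.remove("1/f2")                  # history erased
print(sched.flow_arrives("1/f2"))     # the task would run again

sched.set_output("1/f3", "expired")
sched.remove("1/f3", remember=True)   # removal recorded
print(sched.flow_arrives("1/f3"))     # the task would not be respawned
```

In this model, a plain removal makes it look as if the manual set had never happened, whereas a remembered removal blocks the respawn whether it happens before or after the flow arrives.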

hjoliver commented 2 months ago

I'm reasonably sure that this is just a small code change, and it is a NIWA priority, in particular to alleviate the pain of our manual expire use cases, which currently entail an unnecessary wait for the future stall to happen. However, as shown above, the problem is more general than that use case. Summary:

oliver-sanders commented 2 months ago

  1. final-incomplete tasks created by manual output setting should be spawned into n=0, for visibility
    • (I think it was our intention that all final-incomplete tasks would be held in n=0)
    • (I haven't thought of any downside to doing this)

This is correct.