cylc / cylc-flow

Cylc: a workflow engine for cycling systems.
https://cylc.github.io
GNU General Public License v3.0

Triggering parentless task in new flow caused it to spawn in later cycles #6258

Open MetRonnie opened 2 months ago

MetRonnie commented 2 months ago

Reproducible Example

[scheduler]
    allow implicit tasks = True
[scheduling]
    cycling mode = integer
    [[graph]]
        P1 = """
            end[-P1] => start
            start & compile => end
        """

[runtime]
    [[end]]
        script = false

cylc trigger example//1/compile --flow=new

causes 2/compile, 3/compile, etc. to spawn again.

Expected Behaviour

Only 1/compile should spawn.

Additional Context

Tested and reproduced all the way back to 8.2.6.

hjoliver commented 2 months ago

Not a bug!

In any flow, including the original flow and "new" ones, parentless tasks continue on to future cycles by magically spawning next instances, since they have no parents to do it "on demand". If not for this you couldn't trigger a new flow that traverses the whole graph just like the original flow.

We probably do need an option to not do that, which would make this more of a feature request.

We have definitely discussed this before somewhere but I can't find the associated issue.

~At the moment you can prevent the flow on with cylc set --out=expired --flow=2 on the next-cycle parentless task before triggering flow 2 (which means "don't run the task in flow 2 even if it becomes ready"). Then later you'll have to remove the manually-expired task in flow 2~ - see #6221

Correction: for parentless tasks, this only prevents the one future instance (the manually expired one) from running. That's by design. If a parentless task expires or fails (say) that doesn't mean other instances beyond that cycle should not run.

So for the moment, you have to trigger with --flow=none to avoid flow-on to future cycles, which is obviously no good if you do want flow-on within the same cycle.
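To make the spawning behaviour described above concrete, here is a toy model of the example graph. It is only a sketch and not Cylc's scheduler code: the names `PARENTLESS`, `CHILDREN`, `SUCCEEDS` and `spawned_in_flow` are made up for illustration, and it assumes a child is spawned as soon as any one parent succeeds, while a parentless task also spawns its own next-cycle instance.

```python
from collections import deque

# Hypothetical toy model of the example graph (not Cylc internals):
#   end[-P1] => start
#   start & compile => end
PARENTLESS = {"compile"}  # no upstream dependency in the graph
CHILDREN = {              # task -> [(child, cycle offset)]
    "start": [("end", 0)],
    "compile": [("end", 0)],
    "end": [("start", 1)],
}
SUCCEEDS = {"start": True, "compile": True, "end": False}  # end runs `false`


def spawned_in_flow(trigger, last_cycle, flow_on=True):
    """Return the sorted (cycle, task) instances spawned by one trigger.

    flow_on=False stands in for --flow=none: the triggered task runs
    but nothing flows on from it.
    """
    spawned = set()
    queue = deque([trigger])
    while queue:
        cycle, name = queue.popleft()
        if (cycle, name) in spawned or cycle > last_cycle:
            continue
        spawned.add((cycle, name))
        if not flow_on:
            continue
        if name in PARENTLESS:
            # No parent can spawn the next instance on demand,
            # so the task spawns its own successor.
            queue.append((cycle + 1, name))
        if SUCCEEDS[name]:
            for child, offset in CHILDREN[name]:
                queue.append((cycle + offset, child))
    return sorted(spawned)


if __name__ == "__main__":
    print(spawned_in_flow((1, "compile"), last_cycle=3))
    print(spawned_in_flow((1, "compile"), last_cycle=3, flow_on=False))
```

In this model, triggering 1/compile with flow-on spawns compile in cycles 1, 2 and 3 (bounded only by `last_cycle`, which stands in for the runahead limit), whereas the `flow_on=False` case spawns only 1/compile and therefore never reaches 1/end either.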

oliver-sanders commented 2 weeks ago

I'm not sure what the question is here, what were you trying to achieve with this trigger?

IMO, the observed behaviour is correct for --flow=new. If we didn't spawn the tasks in future cycles, the workflow would stall as a result of the trigger, which would be highly undesirable.

Suggest ensuring that the intended use case is adequately covered by proposed interventions and closing this if it is?

MetRonnie commented 2 weeks ago

The use case I can think of is "I want to re-run a chain in a particular cycle point, it will catch up with the blocked flow 1 and flow-merge, therefore it is unexpected that the task I triggered starts running in other cycle points"

oliver-sanders commented 2 weeks ago

The best match for that would be group trigger.

As I understand it (I haven't had the time to go through in detail yet):

cylc trigger workflow//1  # remove all tasks (and their outputs) and re-run from the start