cylc / cylc-flow

Cylc: a workflow engine for cycling systems.
https://cylc.github.io
GNU General Public License v3.0
335 stars 94 forks source link

Errant flow merge when triggering #6361

Closed MetRonnie closed 1 month ago

MetRonnie commented 2 months ago

Description

R1 = """
    foo
    bar
"""

If foo fails in flow 1 while bar is in flow 2, triggering foo without providing a flow in the options causes it to run in flows 1 & 2.

This is because the default --flow all is causing it to pick all active flow numbers in the entire task pool instead of all active flow numbers for the task in question.

https://github.com/cylc/cylc-flow/blob/293c3ccba1877adc1061064b410ea54b62beb4ad/cylc/flow/task_pool.py#L2077-L2078

https://github.com/cylc/cylc-flow/blob/293c3ccba1877adc1061064b410ea54b62beb4ad/cylc/flow/task_pool.py#L2025-L2037

Looking at the code, this bug probably affects cylc set too

oliver-sanders commented 2 months ago

If foo fails in flow 1 while bar is in flow 2, triggering foo without providing a flow in the options causes it to run in flows 1 & 2.

Hmm, there's nothing untoward here, this is the intended behaviour, all triggers default to --flow=all.

--flow=all means "all flows in the workflow" not "all flows on the task".

MetRonnie commented 2 months ago

But maybe the default should not be "all flows in the workflow", but "all flows on the task"?

oliver-sanders commented 2 months ago

If the task has already spawned, then the trigger should only apply to the spawned flows (see this line in the set proposal):

--flow=INT: Flow(s) to attribute spawned tasks. Default: all active flows. If a task already spawned, use current flows.

-- https://github.com/cylc/cylc-admin/blob/2fa7e970926935e76758010be1798d6d709d1b47/docs/proposal-cylc-set.md?plain=1#L65-L66

I would expect this logic to extend to trigger.

If it hasn't already spawned then it should get all flows. This default was well debated at the time.

I.E. It depends on the pool state.

MetRonnie commented 2 months ago

Right so this is a bug then. The default should be "all flows on the task", which is meaningless if it hasn't spawned yet, in which case it should be "all flows in the workflow"?

oliver-sanders commented 2 months ago

"all flows on the task", which is meaningless if it hasn't spawned yet

:+1:

So long as we're strictly talking n=0 here. The proposals already define how --flow=all should interact with an n=0 task.

MetRonnie commented 2 months ago

What about setting/triggering a task that already ran and is no longer in n=0? Should it not use the flow number(s) from last time it ran?

I tested this and this is the case for n>1 in the past. However for n=1 in the past it is flow-merging like described for n=0.

oliver-sanders commented 2 months ago

What about setting/triggering a task that already ran and is no longer in n=0? Should it not use the flow number(s) from last time it ran?

I don't think there's any special logic for n<0. A historical task in one flow might be a future task in another flow so the previous run might not have any relevance to the trigger.

hjoliver commented 2 months ago

Just to confirm, the plan was:

@oliver-sanders - it seems like you agree but initially did not realize that @MetRonnie was specifically talking about triggering an n=0 task?

oliver-sanders commented 1 month ago

Closed by #6367