cylc / cylc-flow

Cylc: a workflow engine for cycling systems.
https://cylc.github.io
GNU General Public License v3.0
335 stars 94 forks source link

Decide on the future of naked tasks. #3866

Closed hjoliver closed 3 years ago

hjoliver commented 4 years ago

Should they stay or should they go?

https://github.com/cylc/cylc-admin/pull/111#pullrequestreview-506579884

[edit]

"Naked" tasks are tasks without an explicit runtime definition i.e in this example bar is a "naked" task:

[scheduling]
  [[graph]]
    R1 = """
      foo
      bar
    """

[runtime]
  [[foo]]

Naked tasks should cause configuration errors by default with this behaviour overridden by a new config:

[scheduling]
  allow implicit = True

We have referred to these as dummy or naked-dummy tasks in the past, I think we will go with "implicit" from here on?

[/edit]

dpmatthews commented 4 years ago

I think we need to change terminology. These tasks will run whatever is defined in the root runtime section so they are not necessarily dummy tasks. Maybe "implicit" rather than "naked"?

If we really want to allow naked tasks to only generate a warning by default (not my preference) then I'd suggest we add a cylc.flow setting which turns these warning into errors. We could also do the same for deprecation warnings.

hjoliver commented 4 years ago

I may be coming around to your point of view. To be fair it should be more important to avoid accidental "implicit tasks" caused by a simple typo, than to make quick "experiment with scheduling" flows super-easy to write.

How about disallow them, but provide a command line one-off opt-in capability? cylc play --allow-implicit-tasks

oliver-sanders commented 4 years ago

The case for naked tasks

The majority of the complexity of a workflow lies in the [scheduling] section, this is where the hardest-to-debug errors are made and where support queries regularly lead. This section configures many of the most conceptual parts of a workflow including cycling, recurrences, xtriggers, etc.

As a result I would encourage users to start writing a workflow with the [scheduling] section, then move on to the [runtime] in due course. This is a more incremental approach which avoids debugging cycling issues in a real workflow with long run times and added complexity. The Cylc tutorial embodies this, introducing the [scheduling] section, graph logic and visualisation tools before even mentioning the [runtime] section. This is an approach I've seen users take from time to time when writing new suites. It's also an approach I often use myself for mocking up simple workflow examples to send in answer to user queries.

Naked tasks reduce the barrier to experimentation and rapid-prototyping making learning and workflow development easier, especially for new users as they feel their way around Cylc's scheduling concepts.

The documentation is full of examples which use naked tasks to simplify examples cutting out needless complexity. Here are three illustrative examples chosen at random:

Were we to forbid naked tasks or require use of an option or configuration all of these previously valid examples would fail despite being fundamentally fine.

There are also functional uses for naked tasks, consider the common pattern:

[runtime]
    [[root]]
        script = rose task-run

If a task does not require explicit configuration what is the merit to enforcing it? Requiring users to manually list their tasks could prove a headache in some cases. What if tasks are programatically generated?

The case for protecting against naked tasks

Unintentional naked tasks are typos which change the name of a task in a graph section to something which isn't defined in the runtime section. This does not cover all potential typos but does cover all typos that we can realistically capture. This is an especially important integrity check in production environments where we would expect each task to be individually defined and configured.

hjoliver commented 4 years ago

@oliver-sanders has nicely described exactly how I've always viewed and used naked tasks. So I tend to agree that we shouldn't prevent that if possible.

The trouble with opt-in disallow (via CLI option or config setting) is that production users may not know about it, or may forget to use it.

The trouble with opt-in allow via config setting is that production users may forget to remove it after moving to production mode.

So how about (as I suggested above) we support opt-in use of naked tasks via CLI option (and something equivalent for the GUI):

cylc run --allow-naked-tasks my.flow

This retains safety automatically for production users - who are unlikely to use such an option accidentally! - and it's a very minor imposition on those experimenting with the scheduling section. We can even highlight the existence of the option in the error message when naked tasks are detected, so that casual users don't have to remember it...

oliver-sanders commented 4 years ago

Kinda prefer the idea of just warning for naked tasks (after all they are not errors) and providing an opt-in "turn warnings into errors" option for production environments.

Opt-out might be ok-ish if we document it properly, note that commands like cylc get-config and static visualisation via the UIS would need to run using --allow-naked-tasks to facilitate prototyping.

hjoliver commented 4 years ago

Kinda prefer the idea of just warning for naked tasks (after all they are not errors) and providing an opt-in "turn warnings into errors" option for production environments.

That's OK with me, but I can see @dpmatthews point given that opt-in safety checks are easy to miss or forget about. Let's see what he thinks...

dpmatthews commented 4 years ago

I think opt-in safety checks are easy to miss or forget about.

dpmatthews commented 3 years ago

We need to agree a way forward on this. Personally I still think implicit tasks should be rejected by default for safety (I prefer "implicit" but I don't mind if we stick with referring to them as "naked"). I think requiring a command line option (cylc play --allow-implicit-tasks) is messy (affects multiple commands), easy to forget and will discourage their use which isn't what we want. Therefore, I think the best solution would be a configuration option in flow.cylc, e.g.

[scheduling]
  allow implicit = True

We'd encourage users to set this whilst doing experimentation/ prototyping in the [scheduling] section and then remove it when they move on to setting up [runtime] (and this is the approach we would use in the tutorials).

oliver-sanders commented 3 years ago

Agreed this may be the best option 😠, though s/scheduling/scheduler/.

hjoliver commented 3 years ago

Agreed, good solution :tada:

MetRonnie commented 3 years ago

Do we want to keep the --strict option for cylc validate to disallow implicit tasks for validation (overriding [scheduler]allow implicit tasks = True), or do we want to get rid of --strict?

oliver-sanders commented 3 years ago

We would really like to get rid of --strict.

MetRonnie commented 3 years ago

Looks like --strict is also used to warn if sequences are out of bounds for the initial cycle point, so what would happen to that? Just leave it to cylc.flow.flags.verbose? https://github.com/cylc/cylc-flow/blob/66c11198ebf9ce044d290b2810484c84d08c3ef0/cylc/flow/scripts/validate.py#L104-L120

hjoliver commented 3 years ago

Yeah, verbose will do. I'm not sure why we had that check only for --strict - it seems pretty important

MetRonnie commented 3 years ago

Looking at git blame, #2921 changed it from an error to a warning, so that's probably why it's only for --strict? Should we include the warning all the time? Or there could be an option like --check-circular called --check-sequence-bounds?

oliver-sanders commented 3 years ago

I think always log the warning and get rid of the verbose junk and maybe get rid of the special verbose logging, not sure why that prints out the same message in a different format.

At the time of #2921 --strict was the default for Rose installation so "log this warning all the time" was the expectation for most users.

We are hoping that --check-circular will be the only special validation option going forward, we've only done that because it can take a while to process larger flows.