Closed hjoliver closed 3 years ago
I think we need to change terminology.
These tasks will run whatever is defined in the root
runtime section so they are not necessarily dummy tasks.
Maybe "implicit" rather than "naked"?
If we really want to allow naked tasks to only generate a warning by default (not my preference) then I'd suggest we add a cylc.flow
setting which turns these warning into errors. We could also do the same for deprecation warnings.
I may be coming around to your point of view. To be fair it should be more important to avoid accidental "implicit tasks" caused by a simple typo, than to make quick "experiment with scheduling" flows super-easy to write.
How about disallow them, but provide a command line one-off opt-in capability? cylc play --allow-implicit-tasks
The majority of the complexity of a workflow lies in the [scheduling]
section, this is where the hardest-to-debug errors are made and where support queries regularly lead. This section configures many of the most conceptual parts of a workflow including cycling, recurrences, xtriggers, etc.
As a result I would encourage users to start writing a workflow with the [scheduling]
section, then move on to the [runtime]
in due course. This is a more incremental approach which avoids debugging cycling issues in a real workflow with long run times and added complexity. The Cylc tutorial embodies this, introducing the [scheduling]
section, graph logic and visualisation tools before even mentioning the [runtime]
section. This is an approach I've seen users take from time to time when writing new suites. It's also an approach I often use myself for mocking up simple workflow examples to send in answer to user queries.
Naked tasks reduce the barrier to experimentation and rapid-prototyping making learning and workflow development easier, especially for new users as they feel their way around Cylc's scheduling concepts.
The documentation is full of examples which use naked tasks to simplify examples cutting out needless complexity. Here are three illustrative examples chosen at random:
Were we to forbid naked tasks or require use of an option or configuration all of these previously valid examples would fail despite being fundamentally fine.
There are also functional uses for naked tasks, consider the common pattern:
[runtime]
[[root]]
script = rose task-run
If a task does not require explicit configuration what is the merit to enforcing it? Requiring users to manually list their tasks could prove a headache in some cases. What if tasks are programatically generated?
Unintentional naked tasks are typos which change the name of a task in a graph section to something which isn't defined in the runtime section. This does not cover all potential typos but does cover all typos that we can realistically capture. This is an especially important integrity check in production environments where we would expect each task to be individually defined and configured.
@oliver-sanders has nicely described exactly how I've always viewed and used naked tasks. So I tend to agree that we shouldn't prevent that if possible.
The trouble with opt-in disallow (via CLI option or config setting) is that production users may not know about it, or may forget to use it.
The trouble with opt-in allow via config setting is that production users may forget to remove it after moving to production mode.
So how about (as I suggested above) we support opt-in use of naked tasks via CLI option (and something equivalent for the GUI):
cylc run --allow-naked-tasks my.flow
This retains safety automatically for production users - who are unlikely to use such an option accidentally! - and it's a very minor imposition on those experimenting with the scheduling section. We can even highlight the existence of the option in the error message when naked tasks are detected, so that casual users don't have to remember it...
Kinda prefer the idea of just warning for naked tasks (after all they are not errors) and providing an opt-in "turn warnings into errors" option for production environments.
Opt-out might be ok-ish if we document it properly, note that commands like cylc get-config
and static visualisation via the UIS would need to run using --allow-naked-tasks
to facilitate prototyping.
Kinda prefer the idea of just warning for naked tasks (after all they are not errors) and providing an opt-in "turn warnings into errors" option for production environments.
That's OK with me, but I can see @dpmatthews point given that opt-in safety checks are easy to miss or forget about. Let's see what he thinks...
I think opt-in safety checks are easy to miss or forget about.
We need to agree a way forward on this.
Personally I still think implicit tasks should be rejected by default for safety
(I prefer "implicit" but I don't mind if we stick with referring to them as "naked").
I think requiring a command line option (cylc play --allow-implicit-tasks
) is messy (affects multiple commands), easy to forget and will discourage their use which isn't what we want.
Therefore, I think the best solution would be a configuration option in flow.cylc, e.g.
[scheduling]
allow implicit = True
We'd encourage users to set this whilst doing experimentation/ prototyping in the [scheduling]
section and then remove it when they move on to setting up [runtime]
(and this is the approach we would use in the tutorials).
Agreed this may be the best option 😠, though s/scheduling/scheduler/
.
Agreed, good solution :tada:
Do we want to keep the --strict
option for cylc validate
to disallow implicit tasks for validation (overriding [scheduler]allow implicit tasks = True
), or do we want to get rid of --strict
?
We would really like to get rid of --strict
.
Looks like --strict
is also used to warn if sequences are out of bounds for the initial cycle point, so what would happen to that? Just leave it to cylc.flow.flags.verbose
?
https://github.com/cylc/cylc-flow/blob/66c11198ebf9ce044d290b2810484c84d08c3ef0/cylc/flow/scripts/validate.py#L104-L120
Yeah, verbose will do. I'm not sure why we had that check only for --strict
- it seems pretty important
Looking at git blame, #2921 changed it from an error to a warning, so that's probably why it's only for --strict
? Should we include the warning all the time? Or there could be an option like --check-circular
called --check-sequence-bounds
?
I think always log the warning and get rid of the verbose junk and maybe get rid of the special verbose logging, not sure why that prints out the same message in a different format.
At the time of #2921 --strict
was the default for Rose installation so "log this warning all the time" was the expectation for most users.
We are hoping that --check-circular
will be the only special validation option going forward, we've only done that because it can take a while to process larger flows.
Should they stay or should they go?
https://github.com/cylc/cylc-admin/pull/111#pullrequestreview-506579884
[edit]
"Naked" tasks are tasks without an explicit runtime definition i.e in this example
bar
is a "naked" task:Naked tasks should cause configuration errors by default with this behaviour overridden by a new config:
We have referred to these as dummy or naked-dummy tasks in the past, I think we will go with "implicit" from here on?
[/edit]