dbillor opened this issue 5 years ago
The following is an option that we have been talking about.
tasks:
  task1:
    ...
    next:
      - when: <% succeeded() %>
        and:
          - when: <% ctx().x = true %>
            do: task2
          - when: <% ctx().y = true %>
            and:
              - when: <% ctx().z = true %>
                do: task3
  ...
What about:
task1:
  next:
    - when: succeeded
      do: atsk1
      and:
        - when: succeeded
          do: task2
Will this be supported? Also, is `and` needed? Is the fear that nesting `when`s isn't explicit enough?
@dbillor We could explore a `succeeded` keyword, but the way it works now is that it activates a YAQL function. We'd need to change how those primitives get activated, which isn't the highest priority at the moment, but something to consider. As for an implicit `and`: the nested conditions would still have to be nested under some dictionary key, since the data type you're in is a dict. So having some implicit parameter would still require a key for the list to be nested under, which would defeat the point.
To be more clear on that last point: `next` is a list/array of dictionaries/objects. So you can't have a bare list/array (of further conditions) inside your dictionary/object data type; we have to nest that array/list under a new key. So there's no clear way of having a shorthand there.
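A minimal sketch of what that means in practice (illustrative only, not a final syntax): each entry under `next` is already a dictionary, so any nested list of further conditions has to hang off some key, e.g. `and` here:
next:
  - when: <% succeeded() %>   # this list item is a dict with keys when/and
    and:                      # the nested list of conditions needs its own key
      - when: <% ctx().x = true %>
        do: task2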
Here is a snippet of our Orquesta code that is already in production:
- when: <% succeeded() and (ctx().merge_upstream and ctx().mis_repo_base = openstack) %>
  publish:
    - shas: <% result().output.shas %>
  do:
    - merge_upstreams
- when: <% succeeded() and (not ctx().merge_upstream or ctx().mis_repo_base = stackstorm) %>
  publish:
    - shas: <% result().output.shas %>
  do:
    - teardown
That could be simplified down to:
- when: <% succeeded() %>
  publish:
    - shas: <% result().output.shas %>
  and:
    - when: <% ctx().merge_upstream and ctx().mis_repo_base = openstack %>
      do:
        - merge_upstreams
    - when: <% not ctx().merge_upstream or ctx().mis_repo_base = stackstorm %>
      do:
        - teardown
Instead of nesting conditions linearly within parentheses, we're nesting conditionals in YAML using indentation, and YAML is a syntax that was built around using indentation to nest elements. We would be using YAML for exactly what it was originally intended for.
And then when our customers ask us "why use Orquesta instead of Mistral?" we can have a specific answer for them "nested conditionals in task transitions"!
And just to drive the point home further, it would be difficult for Mistral to implement a similar syntax. So not only would Orquesta be getting a feature that Mistral doesn't currently have, it would be getting a feature that Mistral could pretty much never have.
However, I don't really know why `succeeded` and `failed` keywords would be more useful than `succeeded()` and `failed()` functions. If we keep them as functions, it is much more obvious that they are the results of some sort of execution. If we switch to (or additionally allow the use of) keywords, then it's not as readily apparent that the keyword means a result that is calculated on the fly. It would also be inconsistent with the other functions that really do make more sense as functions: `ctx()`, `item(...)`, `result()`, `task(...)`.
@dzimine @bigmstone Any feedback for us on the proposed design above?
I'm not sure how I feel about the `and` syntax. `and` reads to me like it wants all the conditions to match. Could we nest `next` again? Confusing?
@bigmstone I haven't forgotten your feedback above, I'm just not sure yet.
Also, as an alternative, a user can already do the following today: task1 publishes and transitions on succeeded to an empty task2, and task2 (no action defined) branches out with the other publishes and transitions.
task1:
  action: ...
  next:
    - when: <% succeeded() %>
      publish:
        - shas: <% result().output.shas %>
      do: task2
task2:
  next:
    - when: <% ctx().merge_upstream and ctx().mis_repo_base = openstack %>
      publish: ...
      do:
        - merge_upstreams
    - when: <% not ctx().merge_upstream or ctx().mis_repo_base = stackstorm %>
      publish: ...
      do:
        - teardown
As per @m4dcoder's suggestion, I am adding my 2 cents on publishing vars. Instead of having them as part of the transition model, I'd have them as part of the task model: once the action is complete, regardless of whether it succeeded or failed, evaluate the publish expressions, which then become available to the transition conditions. It would make writing workflows simpler for everyone.
@emptywee The issue here is that the result of an action execution can be different for success and failure. What if I want to publish x on success (present only in the result of a successful execution) and y on failure (present only in the result of a failed execution)? In that case, the workflow engine would try to process the publish for both x and y and hit an expression evaluation failure either way.
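For instance (a hypothetical action and result shape, purely to illustrate the failure mode): if the success result only contains output.x and the failure result only contains error.y, a single task-level publish would reference a missing key whichever way the action ends:
task1:
  action: examples.hypothetical_action
  # hypothetical task-level publish: one of these expressions always references
  # a key that is absent from the result, so evaluation fails on every run
  publish:
    x: <% result().output.x %>  # present only on success
    y: <% result().error.y %>   # present only on failure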
@m4dcoder I don't know about those hypothetical scenarios where you'd want to publish different variables; in my experience I've rarely needed it (if at all, to my recollection). I've been able to write pretty complex workflows just using the "publish" attribute of the Mistral engine, dealing with all possible success/failure transitions and variable publishing. However, I see many ways to deal with such scenarios with a single publish attribute in the task model. E.g.:
task1:
  action: example.randomly_fail
  publish:
    var_on_success: <% succeeded() and result().output.var1 %>
    var_on_failure: <% not succeeded() and result().output.var2 %>
Then it's up to you which variable to use later on.
@emptywee Just because you are not doing this doesn't mean it is not done by other users. Among other use cases, it's common to publish certain variables on success but publish stdout/stderr on failure to be logged or recorded. As for your example, it doesn't work because you are mixing the condition with the expression that returns the value for the variable. In your example, var_on_success/var_on_failure will have a value of True or False.
@m4dcoder Well, I expected this kind of response, which is why I was a bit hesitant to ignite this sort of debate :) Of course, if I needed to get rid of True/False, I'd make it:
task1:
  action: example.randomly_fail
  publish:
    var_on_success: <% succeeded() and result().output.var1 or null %>
    var_on_failure: <% not succeeded() and result().output.var2 or null %>
I know the output I'm going to get from my expressions, and I'd account for that if I needed them to be very precise. For the sake of example I provided a very simple one.
It's just that I don't know which is cheaper: (a) splitting things across different whens, not being able to use published vars in when conditions, and adding nested dummy tasks for additional transitions; or (b) writing appropriate expressions and defining variables that can be used in when conditions, basically reusing the code (which can sometimes be a very complex query over the resulting data) and saving tens of duplicate lines.
Perhaps, as a compromise, allow publishing in the task model and additional publishing in the transition model? I don't know if that's possible, though, from the Orquesta engine implementation standpoint.
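Roughly, such a compromise could look like this (a sketch only; the action name and result fields are placeholders):
task1:
  action: example.randomly_fail
  # stage 1 (proposed): evaluated once the action completes, success or failure
  publish:
    stdout_snapshot: <% result().stdout %>
  next:
    - when: <% succeeded() %>
      # stage 2 (existing behavior): additional publish inside the transition
      publish:
        - status_msg: action completed
      do: task2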
This type of design discussion is needed to understand requirements and evaluate solution proposals. The person who is going to implement this feature will be able to look at the dialogs here to understand how the decision was made. If we have this conversation in Slack, it will be lost.
I tested the YAQL expressions and, to my surprise, they work. This also needs to be tested for Jinja since we support both YAQL and Jinja.
IMO, the YAQL expressions are very ambiguous and I have not formed an opinion on this proposal yet.
@m4dcoder I like YAQL; it lets you be pretty flexible when needed. Wish it could also let you handle exceptions :)
Regardless, could you consider the compromise suggested? Let users publish at two stages: 1. after the action is complete; 2. in the transition part (inside `when` clauses). I think it'd satisfy all parties.
I don't have an opinion on your proposed solution at the moment. There is a workaround for the problem you stated. I am in no rush to make a decision on this, especially introducing a new language element which I cannot rescind once introduced. This requires more thought and testing to make sure Jinja also works, because we support both types of expressions.
Well, the compromise where users would be able to publish in two places introduces no changes to YAQL/Jinja processing. It only means that if `publish:` is defined within the task itself, the engine runs the expressions and updates the context, then moves on as it did before. Sounds simple, but I, of course, don't know the details of the engine implementation.
P.S. As much as you don't want to rush making changes, I also don't want to use lengthy and bulky workarounds just to re-do them the right way later (if additional publish is eventually added). If you'd like users to migrate to Orquesta more smoothly, this little feature is a must-have.
P.P.S. Also, the retry-delay-count feature is long-awaited here :)
Expressions in the `next` block can be very repetitive. The same condition ends up getting repeated in several `when` and `publish` expressions.
So I often end up splitting one task into two (or more). The first publishes the results into more convenient (intermediate) variables, and the second serves as a fork/decision node that evaluates those newly published (intermediate) vars and then publishes the vars that the rest of the workflow needs before continuing to the next set of tasks.
One issue with this workaround is that it pollutes the context with variables that I only need for the next (`noop`) task. I would like to avoid increasing the size of the context wherever I can.
The nested-when block has a similar issue. So, using the `next` block format from blag's example (https://github.com/StackStorm/orquesta/issues/116#issuecomment-465429738) mixed with vars from some of our workflows:
next:
  - when: <% succeeded() %>
    publish:
      - disk_space_usage: <% int(result().get(ctx().server_name).stdout.replace('%','')) %>
    and:
      - when: <% ctx().disk_space_usage >= ctx().disk_space_threshold and ctx().use_chatops %>
        publish:
          - full_msg: "{{ ctx().disk_space_usage }}% used ({{ 100 - ctx().disk_space_usage }}% free)"
        do:
          - send_chatops_confirm_disk_usage_too_high
      - when: <% ctx().disk_space_usage >= ctx().disk_space_threshold %>
        do:
          - clean_up_logs
          - restart_flaky_service
If I never need the `disk_space_usage` var outside of this `next[].and` block, then we have to deal with an ever-increasing number of short-lived vars that get added to the context. For `disk_space_usage`, an int, that size is not very concerning. But looping over a subworkflow for a list of hosts, and then extracting data from the returned dictionaries about each host, can be significant, and can require repeatedly accessing nested (`result().result.select(...).a.b.c.d.e.f`) information.
The Mistral-like publish-first approach described by Igor (https://github.com/StackStorm/orquesta/issues/116#issuecomment-499291613) also suffers the same shortcoming: it publishes a var that is meant to be used in the current task's `when` condition even though that var might not be used later.
task1:
  action: example.randomly_fail
  publish:
    var_on_success: <% succeeded() and result().output.var1 or null %>
    var_on_failure: <% not succeeded() and result().output.var2 or null %>
So, I would like to see a way to not only create variables for use in `next[].when`, but also scope those variables to the current task and NOT publish them to the workflow context.
What if we added a way for workflow authors to "override" what the `result()` function returns? Or make a new function that returns these task-scoped vars?
This could work with either approach: (a) the nested-when approach, and (b) the publish-before-next approach.
Here are the above examples rewritten using a new `output()` function with a new `output:` block:
(a) nested `when`:
next:
  - when: <% succeeded() %>
    output:
      - disk_space_usage: <% int(result().get(ctx().server_name).stdout.replace('%','')) %>
    and:
      - when: <% output().disk_space_usage >= ctx().disk_space_threshold and ctx().use_chatops %>
        publish:
          - full_msg: "{{ output().disk_space_usage }}% used ({{ 100 - output().disk_space_usage }}% free)"
        do:
          - send_chatops_confirm_disk_usage_too_high
      - when: <% output().disk_space_usage >= ctx().disk_space_threshold %>
        publish:
          - disk_space_usage: <% output().disk_space_usage %>
        do:
          - clean_up_logs
          - restart_flaky_service
(b) define the variables before the `next` block:
task1:
  action: example.foobar
  input:
    param: abcdef
  output:
    var_on_success: <% succeeded() and result().output.var1 or null %>
    var_on_failure: <% not succeeded() and result().output.var2 or null %>
  next:
    - when: <% succeeded() and output().var_on_success = "qwerty" %>
      do: ...
    - when: <% succeeded() and output().var_on_success = "dvorak" %>
      do: ...
Instead of the key `task.output` plus the function `output()`, it could also be the key `task.result_override`, reusing `result()`. Whatever the name, I'd like a way to create task-scoped vars that can be used in `task.next[].when` and `task.next[].publish`.
I think there are two separate problems communicated here.
The current semantics for `when` allow the user to define what counts as success or failure. For example, in some use cases an action execution may complete (return success in StackStorm) but the output of the execution contains a status code and/or message that indicates whether the action actually accomplished what it intended to do. The terms `on-success` and `on-failure` do not allow for these use cases. Plus, the expression in the example above is a mix of boolean evaluation and value assignment, which will add confusion. Therefore I prefer we keep iterating on the approach with the nested when.
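As a side-by-side sketch of that point (placeholder names), the nested-when form keeps the condition separate from the value assignment instead of folding the condition into the published expression:
# mixed form: the condition is folded into the value expression
publish:
  var_on_success: <% succeeded() and result().output.var1 or null %>

# nested-when form: condition and value stay separate
next:
  - when: <% succeeded() %>
    publish:
      - var_on_success: <% result().output.var1 %>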
This issue has been brought up again in the community Slack channel. Please note that there is a workaround at https://github.com/StackStorm/orquesta/issues/116#issuecomment-478713066 which uses intermediate tasks.
So far, the Orquesta workflow definition language has avoided hardcoding on-success and on-error, and I'd like to keep it that way. There are use cases where tasks complete successfully but the returned status code or the output data determines the state.
So let's revisit a similar proposal from above and expand it here with scoped publish. The proposal is to introduce a new keyword `then` (mutually exclusive with `do`) which allows for further branching using a list of task transition blocks (when -> publish -> do|then). The variables that are published in a task transition block will be available in the branches that follow.
Let's take the following workflow as an example, with branches S, F, SX, SY, and SYZ where SYZ means it falls under the umbrella of S and SY.
tasks:
  task1:
    ...
    next:
      # branch S
      - when: <% succeeded() %>
        publish:
          - var1: blah
        then:
          # branch SX
          - when: <% ctx().x = true %>
            publish:
              - var2: blahblah
            do: task2
          # branch SY
          - when: <% ctx().y = true %>
            publish:
              - var3: blahblahblah
            then:
              # branch SYZ
              - when: <% ctx().z = true %>
                do: task3
          ...
      # branch F
      - when: <% failed() %>
        publish:
          - var99: mehhhh
        do: notify
In the example above, var1 is published in S and available in SX, SY, and SYZ. The variable var2 is published in SX, but since SY and SYZ are not under SX, var2 is not available in those branches. Branch F publishes var99, but it is not available in the scope of the branches under branch S.
S publishes var1
- SX (context contains var1) publishes var2
- SY (context contains var1) publishes var3
  - SYZ (context contains var1, var3)
F publishes var99
As for local variables that should not be published into the workflow context, how about prefixing the variable with an underscore and then referencing it with the `ctx()` function like any other variable? Any variable prefixed with an underscore would be removed from the context on exiting the task evaluation.
tasks:
  task1:
    ...
    next:
      - when: <% succeeded() %>
        publish:
          - _var10: fee fi
          - _var20: fo fum
        then:
          - when: <% ctx().x = true %>
            publish:
              - poem: <% ctx()._var10 + " " + ctx()._var20 %>
            do: task2
          - when: <% ctx().y = true %>
            do: task3
  ...
I like `then:` better than `and:` for nesting conditions.
Using an underscore prefix seems surprising at first, but I see how it prevents adding a new keyword to the workflow syntax. I like that there's only one place to define those vars, and the underscore is a nice scope indicator.
I like the `then` transition syntax, as it looks fairly clean and it would reuse chunks of the existing schema definition.
I assume that most Python developers will be very familiar with the leading-underscore-means-something-different convention, because that's a convention in Python. AIUI it is not a formal restriction enforced by Python interpreters; it is merely a convention respected by most (but not all) Python developers.
However, I do not like the leading-underscore part of the `then` proposal.
Reasons:
Pardon my ignorance here, but I don't understand the reservation against adding keywords to the Orquesta syntax. The syntax is, as far as I can tell, very good at separating user-defined names (like published variable names and values) from syntactical keywords (like `tasks`, `next`, `when`, `publish`, `do`, and now `then`), and at disallowing additional arbitrary keys (read: it uses `additionalProperties: false` everywhere in the schema definition). So the usual reason to avoid adding a keyword - that new keywords might conflict with existing variable names in existing workflow definitions and break those workflows - probably isn't an issue for us.
On the contrary - AFAICT the underscore prefix is an additional restriction on context variable names that did not exist before. So existing workflows that already define variables prefixed with underscores would break under the current `then` proposal when those variables are removed from downstream task contexts. A workaround would be to only remove those variables from downstream contexts in `then/when/do` transitions, but keep them in downstream contexts in `when/do` transitions. That inconsistency might confuse new users who either (A) expect underscore-prefixed context variables to be kept in `then/when/do` transitions, (B) expect underscore-prefixed context variables to be removed in `when/do` transitions, or (C) both. And I could see (B) leading to a user security issue if, for example, they use a secret in a transition and expect it to be removed from downstream contexts. So that workaround is not without its own set of issues.
A different way to handle variable scopes would be to extend the `publish` item syntax from a list of single-item dictionaries to a list of single-item or multi-item dictionaries:
tasks:
  task1:
    ...
    next:
      - when: <% succeeded() %>
        publish:
          # current, "short-form" syntax; still available in downstream contexts as the
          # implicit "scope" defaults to "task" (see below)
          - _var10: fee fi
          # long-form syntax, equivalent to the entry above
          # allows users to explicitly set variable scope
          # (publishing a variable with the same name more than once would not be allowed
          # in practice; this is just to show an equivalent)
          - name: _var10
            value: fee fi
            scope: task # set to "task" to include in downstream task variable contexts
          # long-form syntax
          # allows users to scope a variable down to the `then` transition
          - name: _var20
            value: fo fum
            scope: transition # set to "transition" to remove from downstream task variable contexts
        then:
          ...
There would be no other acceptable values for the `scope` key, and the long-form syntax would require all three keys: `name`, `value`, and `scope`.
This sub-proposal has several benefits: it does not introduce any additional keywords to indicate transition variable scope, it avoids the underscore-prefix restriction from the previous proposal, it should be backwards compatible with existing workflows since it only extends the existing syntax, it makes it absolutely explicit when a variable will not be published to downstream task contexts, and it gives us a potential mechanism to extend the syntax if we would like to allow additional variable scopes or other variable-handling tricks in the future.
The downsides to this option are that it introduces an additional mechanism to publish variables and it forces users to write a lot of boilerplate when they need to specify `scope: transition`.
Technically, the long-form syntax for `scope: task` isn't necessary, so that part could be omitted if need be.
I apologize if any of this is inaccurate (please feel free to correct me or simply ignore it if so). It's been a while since I was involved in StackStorm development and I'm not as up to speed on it as I used to be.
> publishing a variable with the same name more than once would not be allowed
Publishing the same var multiple times is a feature of the list-of-single-item-dicts format. I purposely publish the same var multiple times in the same publish block. This allows me to break a long YAQL or Jinja expression into a series of smaller, more understandable expressions.
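For example (placeholder expressions; the list entries are evaluated in order, so each one can reuse the value published just above it via ctx()):
publish:
  # first pass: pull the raw nested data out of the result
  - hosts_info: <% result().result.values().select($.hosts) %>
  # second pass: flatten and filter, reusing the var published above
  - hosts_info: <% ctx().hosts_info.flatten().where($.status = "active") %>
  # final pass: keep only the names the rest of the workflow needs
  - hosts_info: <% ctx().hosts_info.select($.name) %>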
The added verbosity of `name`, `value`, `scope` would make those expressions less readable, I think.
An `_` prefix keeps the published value expressions easy to read, but is problematic because `_` is a valid character in var names. Valid characters include letters, numbers, and `_`:
https://github.com/StackStorm/orquesta/blob/c548e49bc649dbfb8f9f92b391d2359efaa9e943/docs/source/schemas/orquesta.json#L280
What if we used something that is NOT a valid var name character (and not a special yaml char)? Here are several possibilities (we should only use one).
publish:
  - _var10: fee fi # _ is a valid var name char so no special meaning
  - ~var20: fo fum # ~ is not a valid var name char; publish ~var_name, access ctx().var_name
  - %var30: I smell the # publish %var_name, access ctx().var_name
  - <var40>: blood of # or surround the name instead of prefixing it; publish <var_name>, access ctx().var_name
  - local:var50: an englishman # publish scope:var_name, access ctx().var_name
In light of @blag's comment, I believe something like `local().var_name` would work for everyone. It'd be in line with the global `ctx().var_name` style/approach.
@copart-jafloyd I definitely agree - the long-form is verbose and less readable. The unique variable name restriction is not important to my overall sub-proposal, so keeping that functionality working for you is totally fine; I'm just curious how/why you actually use it. Can you give an example of how you publish the same variable multiple times?
I'm glad we're talking about new options, especially the point about using a non-allowed character as a sigil, but I think anything that uses a symbol to indicate scope is going to confuse users and will likely be difficult to search for.
Out of those options, I like the `local:var_name` pattern the best. It gives a clear term (`local`) and a clear context ("Orquesta workflow transition") when searching for help, and it shouldn't break existing workflows because the colon character is not a valid character in variable names.
Along that train of thought, trying to think outside the box a little bit here: the term "local" might not make it exactly clear what is considered "local". Local to the task being run? Local to the `publish` block? Local to the workflow? Perhaps a better prefix might be something like `scope:transition:var_name` and (implicitly) `scope:task:var_name`? Or maybe just `transition:var_name`?
@emptywee I'm not sure I like the alternative `local().var_name` proposal. I think if we're going to have a separate way to reference variables, then we should have a corresponding and completely separate way to publish variables.
Example:
next:
  - when: ...
    publish:
      - var1: fee fi
    local: # or maybe this should be "transition:"?
      - var2: fo fum
    then:
      - when: <% local().var2 and ctx().var1 %>
@blag, yeah, right, we're talking about declaration here, my bad.
As to using `:` in variable declarations, I believe YAML wouldn't be too happy about it, unless you wrap the name in quotes, which may look ugly.
Re YAML: you only have to wrap the `:` in quotes if there is a space after it. So `local: var_name` would be a non-starter, but `local:var_name` should be fine.
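A quick YAML sanity check of that point (the local: prefix is the sigil under discussion; the values are placeholders):
publish:
  - local:var_name: fee fi      # parses as a single plain key named "local:var_name"
  - "local: var_name": fo fum   # with a space after the colon, the key must be quoted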
I don't think `transition` makes sense for the scope. A "transition" scope sounds like a variable that becomes available to any subsequent task. Maybe we could call it `scope:then` or `scope:next`. I guess the docs do use the term "Task Transition Model", so there is precedent for "transition", but I've always called it a "next block".
Whatever the scope's name is, it will need to be documented. Borrowing a term or a symbol from a programming language is fine, as long as the term's definition, within the context of Orquesta, is documented.
PS: @copart-jafloyd is me. I just posted the earlier comment from the wrong account. oops.
Right now, when writing check logic in Orquesta, I have to have a when block for each conditional. It's not graceful and it's hard to manage. For example, instead of:
We have to do:
It's not clean and it's harder to read/maintain. Especially since in some of our workflows the conditionals are large, you end up with many statements with repeated keywords in them.
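A hypothetical illustration of that kind of repetition (task names and conditions invented for the example): the same leading condition has to be restated in every branch of the next block:
next:
  - when: <% succeeded() and ctx().env = "prod" and ctx().region = "us" %>
    do: deploy_us_prod
  - when: <% succeeded() and ctx().env = "prod" and ctx().region = "eu" %>
    do: deploy_eu_prod
  - when: <% succeeded() and ctx().env != "prod" %>
    do: deploy_to_test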