StackStorm / orquesta

Orquesta is a graph based workflow engine for StackStorm. Questions? https://github.com/StackStorm/st2/discussions
https://docs.stackstorm.com/orquesta/
Apache License 2.0

User feedback #17

Closed LindsayHill closed 5 years ago

LindsayHill commented 6 years ago

Below is user feedback received via Slack. Posting it here so it doesn't get lost to Slack message rollover.

StackStorm's New Workflows

Unit Testing

Currently, in my opinion, the biggest downfall of Mistral is the lack of unit testing around workflows. Without unit testing, we're relying on smoke testing to ensure these workflows behave properly, which is neither effective nor maintainable.

Things we would like to test in a workflow include:

I think this would require some sort of ability to mock action output so that these things can be tested.
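
Purely as an illustration of what we have in mind (none of this syntax exists today; the fixture format, field names, and workflow/action names below are invented), a workflow test could pair mocked action results with the transitions and output we expect:

# hypothetical test fixture -- not an existing StackStorm feature
workflow: examples.provision_server
input:
  hostname: "web01"
mock_results:
  create_vm:
    status: succeeded
    result:
      stdout: "vm-1234"
expected:
  # the route the workflow should take given the mocked results
  task_sequence:
    - create_vm
    - configure_vm
  output:
    vm_id: "vm-1234"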

Publish and reuse in same task

Currently, in Mistral, all of the publish and publish-on-error statements are applied in bulk, meaning I can't reuse a variable published by the current task until the next task executes.

Example:

task1:
  action: core.local
  input:
    cmd: "echo hello"
  publish:
    output: "{{ task('task1').result.stdout }}"
    # this errors out because output isn't available in the context yet
    hello_world: "{{ _.output + ' world' }}"

Maybe converting the publish statement to an array would make it easier to execute in sequence?

task1:
  action: core.local
  input:
    cmd: "echo hello"
  publish:
    - output: "{{ task('task1').result.stdout }}"
    # output should now be available in the current context, so the following 
    # should work
    - hello_world: "{{ _.output + ' world' }}"
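
The same idea might also land on a transition rather than on the task itself. A rough sketch in an Orquesta-style next block (the next/when/publish/do layout and the result()/ctx() functions here are our guess at where the new engine is heading, not confirmed syntax):

task1:
  action: core.local
  input:
    cmd: "echo hello"
  next:
    - when: "{{ succeeded() }}"
      publish:
        # if the list is evaluated in order, output is already
        # in context when hello_world is rendered
        - output: "{{ result().stdout }}"
        - hello_world: "{{ ctx().output + ' world' }}"
      do: task2

task2:
  action: std.noop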

When condition on a task level

Sometimes it's necessary to skip a task given a condition. Currently we have to work around this by adding the "skip" condition to every on-success statement in the workflow that may call our task. This is a maintenance burden.

Ansible example:

- name: install stackstorm pack
  shell:
    cmd: "st2 pack install {{ st2_pack }}"
  when: st2_pack is defined

Mistral example (current implementation):

input:
  - servicenow_provision_id: null

task_1:
  action: std.noop
  publish:
    last_task: "task_1"
  on-success:
    - task_servicenow_update: "{{ _.servicenow_provision_id }}"
    - task_2

task_2:
  action: std.noop
  publish:
    last_task: "task_2"
  on-success:
    - task_servicenow_update: "{{ _.servicenow_provision_id }}"

task_servicenow_update:
  action: encore_servicenow.provision_state_update
  input:
    provision_id: "{{ _.servicenow_provision_id }}"
    current_state: "{{ _.last_task }}"

Mistral example using a when condition on task_servicenow_update task:

input:
  - servicenow_provision_id: null

task_1:
  action: std.noop
  publish:
    last_task: "task_1"
  on-success:
    - task_servicenow_update
    - task_2

task_2:
  action: std.noop
  publish:
    last_task: "task_2"
  on-success:
    - task_servicenow_update

task_servicenow_update:
  action: encore_servicenow.provision_state_update
  when: "{{ _.servicenow_provision_id }}"
  input:
    provision_id: "{{ _.servicenow_provision_id }}"
    current_state: "{{ _.last_task }}"

"Main" task / entry-point

There are many occasions on Slack where users of Mistral are confused by its default execution behavior. People usually forget to tie tasks together with explicit on-success, on-error, or on-complete statements, causing tasks to run in parallel and resulting in odd, unpredictable behavior.

It might be better to explicitly define a "main" or "entry-point" task in the workflow.

description: My entry-point workflow
input:
  - x
  - y 
  - z

# the tasks that should be executed as "roots"
# this can either be a string, or a list so that >1 can be executed in parallel?
entry-point: my_first_task

tasks:
  my_first_task:
    action: std.noop
    on-success:
      - some_other_task

  some_other_task:
    action: std.noop
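
If more than one root were allowed, the list form asked about in the comment above might simply be:

# hypothetical: multiple entry-points executed in parallel
entry-point:
  - my_first_task
  - another_root_task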

Linear execution by default

Along the same lines as the "main task" point above, people new to Mistral are often confused by non-linear execution being the engine's default behavior.

Maybe, as an alternative to the "main task" idea above, workflows could execute linearly by default and switch to parallel/DAG-optimized execution via an option. I'm going to suggest a workflow parameter called execution_model. A value of linear means tasks execute in order, just like an action chain or Ansible. A value of parallel would switch to Mistral-like behavior, building a DAG and executing as much in parallel as possible.

description: > 
  My linear workflow, all of these are executed in order
  without the need for on-success.
execution_model: linear

tasks:
  task_0:
    action: std.noop

  task_1:
    action: std.noop

  task_2:
    action: std.noop

description: >
  Parallel execution model tries to do as much in parallel as possible, requires
  on-success and on-error to construct our DAG.
execution_model: parallel

tasks:
  task_0:
    action: std.noop
    on-success:
      - task_1

  task_1:
    action: std.noop
    on-success:
      - task_2
      - task_3

  task_2:
    action: std.noop

  task_3:
    action: std.noop

Keep: defined inputs and output

We really like having the input and output sections explicitly defined. Please keep these!

Keep: allowed temporaries in workflow

We currently use (and maybe abuse) the ability to define input values in the workflow that are not present in the parameters of the action itself. These are mostly used as temporary variables that are local to the workflow.
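
A minimal sketch of the pattern (names made up): the st2 action metadata only exposes hostname, while the workflow declares an extra attempt input that callers never pass in and that serves purely as a workflow-local temporary:

# the action's parameters only define "hostname";
# "attempt" exists only inside the workflow as a temporary
input:
  - hostname
  - attempt: 0

tasks:
  check_host:
    action: core.local
    input:
      cmd: "ping -c1 {{ _.hostname }}"
    publish:
      attempt: "{{ _.attempt + 1 }}"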

m4dcoder commented 6 years ago
cognifloyd commented 6 years ago

Perhaps the execution models are defined not by an execution_model key, but by the data type of tasks. If tasks is a list, treat it like a linear, action-chain-style workflow. If it's a hash/dict, use parallel execution and require tasks to be explicitly linked together with next sections, as sketched below.
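
A minimal sketch of that idea (hypothetical, nothing here is implemented), where the same tasks key drives either mode depending on its data type:

# list form: run top to bottom, like an action chain
tasks:
  - task_0:
      action: std.noop
  - task_1:
      action: std.noop

# dict form: build a DAG from explicit on-success/next links
tasks:
  task_0:
    action: std.noop
    on-success:
      - task_1
  task_1:
    action: std.noop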

Defining an entry-point in the workflow itself doesn't sound interesting to me, other than maybe as the default entry-point. But specifying the entry-point via a workflow input parameter would enable targeting pieces of a workflow. Sure, someone could split a workflow into additional workflows to be able to select just a few tasks. But if a few tasks break while the rest of the (parallel) tasks complete, it would be nice to re-run just the branch of the task graph that failed. Maybe I forgot to add a key to the key-value store, or GitHub was temporarily overloaded/unreachable when that task ran, or ... Re-running with a target entry-point would be fantastic.
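
Something like this, perhaps (the entry_point parameter and all names here are purely hypothetical):

input:
  # which task(s) to start from; defaults to the normal root of the graph
  - entry_point: "my_first_task"

Re-running just the failed branch would then be an ordinary st2 run my_pack.my_workflow entry_point=the_failed_task.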

cognifloyd commented 6 years ago

:+1: to publish as a list. I can't tell you how many times I've tried this:

  publish:
    output: "{{ task('task1').result.stdout }}"
    # this errors out because output isn't available in the context yet
    hello_world: "{{ _.output + ' world' }}"

and then done approximately this:

task1:
  action: core.local
  input:
    cmd: "echo hello"
  publish:
    output: "{{ task('task1').result.stdout }}"
  on-success:
    - finish_publishing

finish_publishing:
  action: std.noop
  publish:
    hello_world: "{{ _.output + ' world' }}"

:+1: to when conditions

:100: for Orchestra ;)

m4dcoder commented 5 years ago

We will close this issue, as it was opened for general user feedback during the alpha/beta. If you have a specific feature request or problem with orquesta, please open a separate issue for each.