StackStorm / orquestaconvert

Converts Mistral workflows into Orchestra workflows
Apache License 2.0
13 stars 7 forks source link

Add retry handling #27

Closed blag closed 4 years ago

blag commented 4 years ago

Now that Orquesta has retry support, this PR adds support for converting Mistral's retry attribute.

Mistral retry:

task1:
  action: my_action
  retry:
    count: 10
    delay: 20
    # if evaluates to true, task is considered to have failed and execution proceeds to the next task
    break-on: <% $.my_var = "BREAK" %>
    # if evaluates to true, task is considered successful and execution proceeds to the next task
    continue-on: <% $.my_var = "continue" %>
  on-success: task2

Orquesta retry:

  task1:
    action: core.noop
    retry:  
      when: <% (succeeded() and (ctx().my_var = "continue")) or (failed() and not (ctx().my_var = "BREAK")) %>
      # ^^^^ optional, defaults to <% failed() %>, determines whether to retry task again
      delay: 1  # optional
      count: 3  # required
    next:
      - when: <% succeeded() %>
        do: task2

Also updates the README.md to note the new retry support.

As a bonus, this updates tests to use unittest2.TestCase.assertEqual, runs tests on Python3 (only, although the utility itself does still run on Python 2), and updates setup.py with PyPI classifiers and an explicit description.

I also uploaded version 0.1 to PyPI: https://pypi.org/project/orquestaconvert. I'm not 100% sure I got the scripts directive correct, but it's a start.

codecov-io commented 4 years ago

Codecov Report

Merging #27 into master will increase coverage by 0.06%. The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #27      +/-   ##
==========================================
+ Coverage   98.74%   98.81%   +0.06%     
==========================================
  Files          12       12              
  Lines         637      673      +36     
==========================================
+ Hits          629      665      +36     
  Misses          8        8
Impacted Files Coverage Δ
orquestaconvert/workflows/base.py 100% <100%> (ø) :arrow_up:
orquestaconvert/pack_client.py 92.47% <100%> (ø) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 42a7ba2...b5a681f. Read the comment docs.

blag commented 4 years ago

@m4dcoder I think you might be reviewing the code before I rebased, but that's close.

For both the break-on and continue-on expressions, the converter "unwraps" the expression (eg: removes the <% and %> or the {{ and }}), and then it attempts to invert the logic of the expression. For "simple" expressions, yes, it tries to break the expression into the left hand side, the operator, and the right hand side.

For more complex (YAQL-only) expressions, we can simply invert the expression with not (...). Unfortunately for complex Jinja expressions we don't have a good way to invert the expression logic. In that case we throw a NotImplementedError with a very descriptive error message. Or if the user passed in the force flag, we just emit a warning, copy the Jinja expression straight as thewhen expression and add a comment that the user needs to manually invert the expression themselves.

My understanding of Mistral retry is that continue-on and break-on apply when the task is already being retried, and determine when and how to exit the loop. If the continue-on expression evaluates to true the task is considered to have succeeded (and the next on-success transition is conducted), and if the break-on expression evaluates to true the task is considered to have failed (and the next on-error transition is conducted).

For Orquesta, I understand it to be kind of the opposite - retry.when specifies whether or not the task should be retried or not, and the task's normal next.when attribute controls whether or not the task is considered to have succeeded or not.

Is all of that correct?

m4dcoder commented 4 years ago

For Orquesta, I understand it to be kind of the opposite - retry.when specifies whether or not the task should be retried or not, and the task's normal next.when attribute controls whether or not the task is considered to have succeeded or not.

The order of precedence is evaluation of the retry element and then next. If the when condition of retry is met, then orquesta will simply retry the task without evaluating any task transitions under next.

My understanding of Mistral retry is that continue-on and break-on apply when the task is already being retried, and determine when and how to exit the loop. If the continue-on expression evaluates to true the task is considered to have succeeded (and the next on-success transition is conducted), and if the break-on expression evaluates to true the task is considered to have failed (and the next on-error transition is conducted).

The continue-on applies if it is defined and task state is SUCCEEDED. The break-on applies when task state FAILED and is evaluated before retry started. Before task is retried, the number of retries, continue-on and break-on are evaluated together first. Ref: https://github.com/StackStorm/mistral/blob/master/mistral/engine/policies.py#L388

For more complex (YAQL-only) expressions, we can simply invert the expression with not (...). Unfortunately for complex Jinja expressions we don't have a good way to invert the expression logic. In that case we throw a NotImplementedError with a very descriptive error message.

This is what I expected for YAQL. Can you give me an example of Jinja where we cannot simplify negate the break-on expression? The ideal here is to convert continue-on to <% succeeded() and continue-on expression %> and break-on to <% failed() and not break-on expression %>.

nmaludy commented 4 years ago

@m4dcoder @blag where do we stand with this?

blag commented 4 years ago

Totally dropped this. Will circle back around to it.

blag commented 4 years ago

Updated to match @m4dcoder's comments.

m4dcoder commented 4 years ago

@blag LGTM. Can you update the PR description above so the orquestra retry converted example matches the mistral retry example?

blag commented 4 years ago

@m4dcoder Good catch, thanks! Updated PR description! 👍