StackStorm / st2ci

New and improved continuous integration actions and workflows
Apache License 2.0

Parallel promotion tasks in bwc_pkg_promote_all do not fail correctly #127

Open blag opened 5 years ago

blag commented 5 years ago

Creating an issue as per this comment.

Running the promotion tasks in parallel does not work correctly: the process_completion task never runs, and the workflow simply fails after all four tasks fail.

Here is the snippet:

  promote_all:
    next:
      - do:
          - promote_bwc_enterprise
          - promote_st2_auth_ldap
          - promote_st2flow
          - promote_bwc_ui
  promote_bwc_enterprise:
    action: st2ci.st2_pkg_promote_enterprise
    input:
      package: bwc-enterprise
      distro_version: <% ctx().pkg_distro_version %>
      release: <% ctx().release %>
      version: <% ctx().version %>
    next:
      - when: <% succeeded() %>
        publish:
          - promoted_bwc_enterprise: <% ctx().version + '-' + result().output.revision %>
        do:
          - process_completion
      - when: <% failed() %>
        do:
          - process_completion
  promote_st2_auth_ldap:
    action: st2ci.st2_pkg_promote_enterprise
    input:
      package: st2-auth-ldap
      distro_version: <% ctx().pkg_distro_version %>
      release: <% ctx().release %>
      version: <% ctx().version %>
    next:
      - when: <% succeeded() %>
        publish:
          - promoted_st2_auth_ldap: <% ctx().version + '-' + result().output.revision %>
        do:
          - process_completion
      - when: <% failed() %>
        do:
          - process_completion
  promote_st2flow:
    action: st2ci.st2_pkg_promote_enterprise
    input:
      package: st2flow
      distro_version: <% ctx().pkg_distro_version %>
      release: <% ctx().release %>
      version: <% ctx().version %>
    next:
      - when: <% succeeded() %>
        publish:
          - promoted_st2flow: <% ctx().version + '-' + result().output.revision %>
        do:
          - process_completion
      - when: <% failed() %>
        do:
          - process_completion
  promote_bwc_ui:
    action: st2ci.st2_pkg_promote_enterprise
    input:
      package: bwc-ui
      distro_version: <% ctx().pkg_distro_version %>
      release: <% ctx().release %>
      version: <% ctx().version %>
    next:
      - when: <% succeeded() %>
        publish:
          - promoted_bwc_ui: <% ctx().version + '-' + result().output.revision %>
        do:
          - process_completion
      - when: <% failed() %>
        do:
          - process_completion

  process_completion:
    action: core.noop
    join: all

    next:
      - when: <% succeeded() and     (ctx().promoted_bwc_enterprise and ctx().promoted_st2_auth_ldap and ctx().promoted_st2flow and ctx().promoted_bwc_ui) %>
        publish:
          - promoted: true
        do:
          - set_notify_success
      - when: <% succeeded() and not (ctx().promoted_bwc_enterprise and ctx().promoted_st2_auth_ldap and ctx().promoted_st2flow and ctx().promoted_bwc_ui) %>
        publish:
          - promoted: false
        do:
          - set_notify_failure

m4dcoder commented 5 years ago

@blag The join doesn't work because each task transitions to process_completion both on success and on failure. The workflow engine waits for every one of those transitions to reach the join, which can never happen here because when: succeeded and when: failed are mutually exclusive. The original workflow transitions to process_completion on task completion. The equivalent in orquesta is to leave out when in the transition, so that it defaults to on-complete. The alternative is to create a separate process_completion_on_failure task for when: failed.
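
The first suggestion can be sketched in isolation as follows. This is a minimal illustration, not the st2ci workflow: the task names start, task_a, task_b and wait_for_all are hypothetical, and core.noop stands in for the real promotion action. It assumes, as described above, that a transition without when is taken on task completion regardless of outcome.

```yaml
version: 1.0

tasks:
  start:
    next:
      # Fan out: both tasks run in parallel.
      - do:
          - task_a
          - task_b

  task_a:
    action: core.noop
    next:
      # No `when` here, so (per the explanation above) this transition
      # is taken whether task_a succeeds or fails.
      - do:
          - wait_for_all

  task_b:
    action: core.noop
    next:
      - do:
          - wait_for_all

  wait_for_all:
    # Barrier: each parallel task contributes exactly one inbound
    # transition, so the join has nothing mutually exclusive to wait on.
    join: all
    action: core.noop
```

With this shape the join only has to wait for one transition per parallel task, rather than for a success transition and a failure transition that cannot both occur.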

blag commented 5 years ago

@m4dcoder I tried both of these parallel workflows and they both had the same behavior.

  promote_all:
    next:
      - do:
          - promote_bwc_enterprise
          - promote_st2_auth_ldap
          - promote_st2flow
          - promote_bwc_ui
  promote_bwc_enterprise:
    ...
    next:
      - when: <% succeeded() %>
        publish:
          - promoted_bwc_enterprise: <% ctx().version + '-' + result().output.revision %>
        do:
          - process_completion
      - when: <% failed() %>
        do:
          - process_completion_on_failure
  promote_st2_auth_ldap:
    ...
    next:
      - when: <% succeeded() %>
        publish:
          - promoted_st2_auth_ldap: <% ctx().version + '-' + result().output.revision %>
        do:
          - process_completion
      - when: <% failed() %>
        do:
          - process_completion_on_failure
  promote_st2flow:
    ...
    next:
      - when: <% succeeded() %>
        publish:
          - promoted_st2flow: <% ctx().version + '-' + result().output.revision %>
        do:
          - process_completion
      - when: <% failed() %>
        do:
          - process_completion_on_failure
  promote_bwc_ui:
    ...
    next:
      - when: <% succeeded() %>
        publish:
          - promoted_bwc_ui: <% ctx().version + '-' + result().output.revision %>
        do:
          - process_completion
      - when: <% failed() %>
        do:
          - process_completion_on_failure

  process_completion:
    action: core.noop
    join: all
    next:
      - do:
          - set_status_and_notify

  process_completion_on_failure:
    action: core.noop
    join: all
    next:
      - do:
          - set_status_and_notify

  set_status_and_notify:
    next:
      - when: <% succeeded() and     (ctx().promoted_bwc_enterprise and ctx().promoted_st2_auth_ldap and ctx().promoted_st2flow and ctx().promoted_bwc_ui) %>
        publish:
          - promoted: true
        do:
          - set_notify_success
      - when: <% succeeded() and not (ctx().promoted_bwc_enterprise and ctx().promoted_st2_auth_ldap and ctx().promoted_st2flow and ctx().promoted_bwc_ui) %>
        publish:
          - promoted: false
        do:
          - set_notify_failure

  promote_all:
    next:
      - do:
          - promote_bwc_enterprise
          - promote_st2_auth_ldap
          - promote_st2flow
          - promote_bwc_ui
  promote_bwc_enterprise:
    ...
    next:
      - when: <% succeeded() %>
        publish:
          - promoted_bwc_enterprise: <% ctx().version + '-' + result().output.revision %>
      - do:
          - process_completion
  promote_st2_auth_ldap:
    ...
    next:
      - when: <% succeeded() %>
        publish:
          - promoted_st2_auth_ldap: <% ctx().version + '-' + result().output.revision %>
      - do:
          - process_completion
  promote_st2flow:
    ...
    next:
      - when: <% succeeded() %>
        publish:
          - promoted_st2flow: <% ctx().version + '-' + result().output.revision %>
      - do:
          - process_completion
  promote_bwc_ui:
    ...
    next:
      - when: <% succeeded() %>
        publish:
          - promoted_bwc_ui: <% ctx().version + '-' + result().output.revision %>
      - do:
          - process_completion

  process_completion:
    action: core.noop
    join: all
    next:
      - when: <% succeeded() and     (ctx().promoted_bwc_enterprise and ctx().promoted_st2_auth_ldap and ctx().promoted_st2flow and ctx().promoted_bwc_ui) %>
        publish:
          - promoted: true
        do:
          - set_notify_success
      - when: <% succeeded() and not (ctx().promoted_bwc_enterprise and ctx().promoted_st2_auth_ldap and ctx().promoted_st2flow and ctx().promoted_bwc_ui) %>
        publish:
          - promoted: false
        do:
          - set_notify_failure

Both workflows failed immediately after all four parallel tasks finished - without ever running process_completion or any subsequent tasks.

Did I misunderstand what you meant?

m4dcoder commented 5 years ago

Bug identified at https://github.com/StackStorm/orquesta/issues/112

blag commented 5 years ago

This can now proceed.