Shopify / pyoozie

Library for querying and scheduling with Apache Oozie
https://py-oozie.readthedocs.io
MIT License

Add decision action #51

Closed cfournie closed 6 years ago

cfournie commented 7 years ago

This PR extends the available collections for Oozie workflow actions to include decision nodes (using the oozie-workflow-0.5.xsd schema), with error handling defined through an on_error parameter.

Discussion

This PR implements decision nodes and error handling such that we can accomplish one crucial execution pattern: checkpointing.

When a collection of actions is executed in parallel and one of them transitions to a kill node, the entire workflow is killed, parallel actions included. The other parallel actions might otherwise have completed successfully, but a single transition to kill interrupts them.

[Figures: workflow graph without checkpoint (graph4); workflow graph with checkpoint (graph3)]

Using the checkpointing pattern, instead of transitioning directly to kill within a fork, a failing action short-circuits to the join node, allowing the other parallel actions to continue. After the join, a decision node checks whether any previous action encountered an error and, if so, transitions to a kill node.
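To make the pattern concrete, here is a minimal sketch that builds the corresponding workflow XML with Python's standard library only. It does not use pyoozie's API or the on_error parameter from this PR. The function name `build_checkpoint_workflow`, the node names (`fork`, `join`, `checkpoint`, `kill`, `end`), the empty `fs` placeholder action body, and the use of the Oozie EL function `wf:lastErrorNode()` as the decision predicate are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def build_checkpoint_workflow(action_names):
    """Build a fork/join workflow whose actions never go straight to kill.

    Hypothetical helper for illustration; not part of pyoozie.
    """
    root = ET.Element(
        "workflow-app",
        {"name": "checkpoint-sketch", "xmlns": "uri:oozie:workflow:0.5"},
    )
    ET.SubElement(root, "start", {"to": "fork"})

    # Fork into one parallel path per action.
    fork = ET.SubElement(root, "fork", {"name": "fork"})
    for name in action_names:
        ET.SubElement(fork, "path", {"start": name})

    for name in action_names:
        action = ET.SubElement(root, "action", {"name": name})
        ET.SubElement(action, "fs")  # placeholder body (assumption)
        ET.SubElement(action, "ok", {"to": "join"})
        # Checkpointing: on error, short-circuit to the join, not to kill,
        # so that sibling actions are allowed to finish.
        ET.SubElement(action, "error", {"to": "join"})

    ET.SubElement(root, "join", {"name": "join", "to": "checkpoint"})

    # After the join, decide whether any earlier action failed.
    decision = ET.SubElement(root, "decision", {"name": "checkpoint"})
    switch = ET.SubElement(decision, "switch")
    case = ET.SubElement(switch, "case", {"to": "kill"})
    case.text = "${not empty wf:lastErrorNode()}"  # assumed predicate
    ET.SubElement(switch, "default", {"to": "end"})

    kill = ET.SubElement(root, "kill", {"name": "kill"})
    ET.SubElement(kill, "message").text = "Failed at [${wf:lastErrorNode()}]"
    ET.SubElement(root, "end", {"name": "end"})
    return root


if __name__ == "__main__":
    workflow = build_checkpoint_workflow(["action-a", "action-b"])
    print(ET.tostring(workflow, encoding="unicode"))
```

The only difference from the "without checkpoint" layout is the target of each action's error transition (`join` rather than `kill`) plus the decision node placed after the join.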

Notes

This PR is one of several intended to add an API for defining forking/joining workflows with error handling, as prototyped in https://github.com/Shopify/pyoozie/pull/26.