Closed blaggacao closed 2 years ago
(blocked by #6)
Retries and restarts are IMHO sufficiently handled by Nomad's restart and reschedule stanzas.
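
For reference, a minimal sketch of what those two stanzas look like in a Nomad job file. The job, group, and task names and all values here are illustrative assumptions, not taken from Cicero's actual job definitions:

```hcl
# Hypothetical batch job; names and values are placeholders.
job "example-step" {
  type = "batch"

  group "step" {
    # Local restarts on the same node when the task fails.
    restart {
      attempts = 2      # restarts allowed within `interval`
      interval = "30m"
      delay    = "15s"  # wait before each restart
      mode     = "fail" # give up once attempts are exhausted
    }

    # Placement on another node once local restarts are exhausted.
    reschedule {
      attempts       = 3
      interval       = "1h"
      delay          = "30s"
      delay_function = "exponential"
      max_delay      = "10m"
      unlimited      = false
    }

    task "run" {
      driver = "exec"

      config {
        command = "/bin/true"
      }
    }
  }
}
```

With `mode = "fail"` the allocation is marked failed after the restart attempts run out, which is what hands control over to the reschedule stanza.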
Some failure handling logic can already be expressed in workflow definitions.
One thing we do want is the ability to stop running {step,workflow} instances in the Cicero web UI.
If we allow destructive transformation of the global workflow state in order to implement recursive compensation loops, such as retries or exception flows, we also need to delegate some orchestration control to the brain, so that it can cancel those loops on timeouts, retry exhaustion, and other (potentially exogenous) interrupts.