input-output-hk / cicero

event-driven automation for Nomad
https://handbook.cicero.ci.iog.io

Compensation actions: break outs #7

Closed blaggacao closed 2 years ago

blaggacao commented 3 years ago

If we allow destructive transformation of the global workflow state in order to implement recursive compensation loops (such as retries or exception flows), we also need to delegate some orchestration control to the brain, so that it can cancel those loops on timeouts, retry exhaustion, and other (potentially exogenous) interrupts.
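To make the three break-out conditions concrete, here is a minimal sketch of a bounded retry loop in Python. It is not Cicero code: `run_with_breakouts`, `Outcome`, and all parameter names are hypothetical, and the `threading.Event` merely stands in for whatever signal the brain would use to interrupt a loop from outside.

```python
import threading
import time
from enum import Enum


class Outcome(Enum):
    SUCCEEDED = "succeeded"
    RETRIES_EXHAUSTED = "retries_exhausted"
    TIMED_OUT = "timed_out"
    CANCELLED = "cancelled"


def run_with_breakouts(step, max_attempts, timeout_s, cancel, delay_s=0.0):
    """Retry `step` until it returns True or one of the break-outs fires.

    step         -- hypothetical callable returning True on success
    max_attempts -- break out once this many attempts have failed
    timeout_s    -- break out once this much wall-clock time has passed
    cancel       -- threading.Event set by an external party (the "brain")
    delay_s      -- pause between attempts
    """
    deadline = time.monotonic() + timeout_s
    for _ in range(max_attempts):
        # Exogenous interrupt: someone outside the loop asked us to stop.
        if cancel.is_set():
            return Outcome.CANCELLED
        # Time-out: the overall budget for this loop is used up.
        if time.monotonic() >= deadline:
            return Outcome.TIMED_OUT
        if step():
            return Outcome.SUCCEEDED
        # Sleep between attempts, but wake early on cancellation
        # and never sleep past the deadline.
        remaining = max(0.0, deadline - time.monotonic())
        cancel.wait(min(delay_s, remaining))
    # Retry exhaustion: every allowed attempt failed.
    return Outcome.RETRIES_EXHAUSTED
```

The point of the sketch is that the loop itself does not decide when to give up: the attempt limit, the deadline, and the cancel signal are all handed in from outside, which is the kind of control that would have to sit with the brain.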

dermetfan commented 3 years ago

(blocked by #6)

Retries and restarts are IMHO sufficiently handled by Nomad restart and reschedule stanzas.

Some failure handling logic can already be expressed in workflow definitions.

One thing we do want is the ability to stop running step or workflow instances from the Cicero web UI.