Showcase resiliency patterns that can be configured in Prefect using metric triggers, automations, and possibly composite triggers.
Goal: automated failover of scheduled work between two environments (work pools).
Set up a pair of scheduled deployments (A and B) with distinct tags, and deploy each to its own work pool (one per environment). Initially, deployment A has an active schedule and deployment B's schedule is inactive. Deployment A should be intentionally designed to fail 50-75% of the time. (Open question: is there a cleaner way of doing this? Could flow-run overrides be used to force a flow to run on a different work pool?)
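A minimal sketch of the "fails 50-75% of the time" behaviour for deployment A. It uses only the standard library so it runs without Prefect installed; in the demo this body would presumably be wrapped in a Prefect flow, and that wiring is left out here. The names `flaky_step`, `SimulatedFailure`, and `failure_rate` are hypothetical, not taken from the original notes.

```python
import random
from typing import Optional


class SimulatedFailure(RuntimeError):
    """Raised on purpose so the run ends in a failed state."""


def flaky_step(failure_rate: float = 0.6, rng: Optional[random.Random] = None) -> str:
    """Raise SimulatedFailure with probability `failure_rate`, else return "ok".

    `failure_rate` is a hypothetical knob; 0.6 sits inside the 50-75% band
    mentioned in the notes.
    """
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    rng = rng or random.Random()
    # random() returns a float in [0.0, 1.0), so a rate of 1.0 always fails
    # and a rate of 0.0 never fails.
    if rng.random() < failure_rate:
        raise SimulatedFailure(f"simulated failure (rate={failure_rate})")
    return "ok"
```

Passing a seeded `random.Random` makes the failure pattern reproducible for a demo, while leaving `rng` unset gives a different pattern on every run.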
Set up an automation whose trigger is a metric monitoring the success rate of flow runs with a given tag (the tag for deployment A). If the average success rate falls below 50%, the automation should pause the schedule for deployment A, enable the schedule for deployment B, and send a notification about the failure. Maybe add a composite trigger to also confirm that no successful flow run events have been emitted. The automation could also declare an incident to manage and triage the failure in deployment A.
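The decision logic described above can be sketched in plain Python. This is only an illustration of the rule, not how Prefect evaluates it: in the demo the rule would be expressed as a metric trigger plus automation actions. The function name `plan_failover`, the `FailoverPlan` type, the state strings, and the action descriptions are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class FailoverPlan:
    success_rate: Optional[float]  # None when the window holds no runs
    actions: List[str]             # human-readable steps, not Prefect action types


def plan_failover(
    states: Sequence[str],
    threshold: float = 0.5,
    require_no_success: bool = False,
) -> FailoverPlan:
    """Decide whether to fail over from deployment A to deployment B.

    `states` holds the terminal states of deployment A's runs in the
    monitoring window, assumed here to be strings such as "COMPLETED"
    or "FAILED".
    """
    if not states:
        # No runs in the window: nothing to measure, so take no action.
        return FailoverPlan(success_rate=None, actions=[])

    successes = sum(1 for state in states if state == "COMPLETED")
    rate = successes / len(states)

    # Metric condition: success rate strictly below the threshold.
    breached = rate < threshold
    # Optional composite condition: additionally require zero successes.
    if require_no_success:
        breached = breached and successes == 0

    actions = (
        [
            "pause schedule for deployment A",
            "enable schedule for deployment B",
            "send failure notification",
            "declare incident for deployment A",
        ]
        if breached
        else []
    )
    return FailoverPlan(success_rate=rate, actions=actions)
```

Setting `require_no_success=True` models the composite idea from the notes: a low success rate alone is not enough, and failover happens only when the window also contains no successful run.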
Optional: show multi-cloud. If a metric or flow run fails, switch to a work pool on another cloud provider, i.e. a critical-workflow switchover to another provider. This shows the resiliency Prefect can help provide.
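One way to picture the multi-cloud switchover is as a priority-ordered list of work pools, each labelled with its provider, from which the first pool on a different provider is chosen. The sketch below is plain Python and does not call Prefect; `PoolTarget`, `next_pool`, and the pool and provider names in the usage are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class PoolTarget:
    work_pool: str  # hypothetical work pool name
    provider: str   # hypothetical provider label, e.g. "aws" or "gcp"


def next_pool(targets: Sequence[PoolTarget], failed: PoolTarget) -> Optional[PoolTarget]:
    """Return the first target, in priority order, on a different provider.

    Returns None when every target shares the failed target's provider,
    meaning there is nowhere to switch to.
    """
    for target in targets:
        if target.provider != failed.provider:
            return target
    return None


# Usage with made-up names: the primary pool fails, so pick another provider.
primary = PoolTarget(work_pool="pool-a", provider="aws")
secondary = PoolTarget(work_pool="pool-b", provider="gcp")
fallback = next_pool([primary, secondary], failed=primary)
```

Filtering on provider rather than on pool name keeps the switchover from landing on a second pool in the same cloud, which would not help if the provider itself is the cause of the failures.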