StackStorm / st2

StackStorm (aka "IFTTT for Ops") is event-driven automation for auto-remediation, incident responses, troubleshooting, deployments, and more for DevOps and SREs. Includes rules engine, workflow, 160 integration packs with 6000+ actions (see https://exchange.stackstorm.org) and ChatOps. Installer at https://docs.stackstorm.com/install/index.html
https://stackstorm.com/
Apache License 2.0

StackStorm rate-limiting feature for workflow and host level #3607

Open LindsayHill opened 7 years ago

LindsayHill commented 7 years ago

(Copied manually from https://github.com/StackStorm/st2contrib/issues/647, original reporter https://github.com/sibirajal)

We have configured multiple rules/workflows/actions for our infrastructure monitoring alerts to perform auto-remediation with ST2.

The workflows are created for different alerts, and we would like to have rate limiting at the workflow level and at the individual host level to avoid continuous remediation. Having this feature in StackStorm would help us avoid masking the real issues.

For example:

Scenario 1: host level

A disk alert appeared in the monitoring for serverX and ST2 performed the remediation at 10:00 am. The same disk alert appeared again for serverX at 10:08 am, and ST2 shouldn't perform the remediation a second time. There should be a rate-limiting feature to avoid continuous remediation of the same host.

Scenario 2: workflow level

A disk alert appeared in the monitoring for serverX and ST2 performed the remediation at 10:00 am using the disk_remediation_workflow.

The disk alert then appeared for serverY and ST2 performed the remediation at 10:15 am. The disk alert appeared again for serverZ and ST2 performed the remediation at 10:20 am. In this case we would like to have rate limiting at the workflow level, so that the same workflow does not keep executing indefinitely.
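To make the two scenarios concrete, here is a minimal sketch of the behaviour being requested. It is not StackStorm code and does not use any ST2 API: the `SlidingWindowLimiter` class, the `should_remediate` helper, and the limits (one run per host per 30 minutes, two runs per workflow per hour) are all hypothetical, chosen only so that the timeline above produces the desired outcome.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical helper: at most `max_runs` per `window_seconds` per key."""

    def __init__(self, max_runs, window_seconds):
        self.max_runs = max_runs
        self.window_seconds = window_seconds
        self._runs = defaultdict(deque)  # key -> timestamps of recorded runs

    def _prune(self, key, now):
        # Drop timestamps that have fallen out of the window.
        runs = self._runs[key]
        while runs and now - runs[0] >= self.window_seconds:
            runs.popleft()
        return runs

    def would_allow(self, key, now=None):
        now = time.time() if now is None else now
        return len(self._prune(key, now)) < self.max_runs

    def record(self, key, now=None):
        now = time.time() if now is None else now
        self._prune(key, now).append(now)


def should_remediate(host, workflow, now, host_limiter, workflow_limiter):
    """Return True and record the run only if both limits permit it."""
    host_key = (workflow, host)
    if not host_limiter.would_allow(host_key, now):
        return False  # Scenario 1: this host was remediated too recently
    if not workflow_limiter.would_allow(workflow, now):
        return False  # Scenario 2: this workflow has run too often
    host_limiter.record(host_key, now)
    workflow_limiter.record(workflow, now)
    return True


# Assumed limits, for illustration only.
host_limiter = SlidingWindowLimiter(max_runs=1, window_seconds=30 * 60)
workflow_limiter = SlidingWindowLimiter(max_runs=2, window_seconds=60 * 60)

# Timeline from the scenarios, in minutes after 10:00 am.
alerts = [("serverX", 0), ("serverX", 8), ("serverY", 15), ("serverZ", 20)]
results = [
    should_remediate(host, "disk_remediation_workflow", minute * 60,
                     host_limiter, workflow_limiter)
    for host, minute in alerts
]
print(results)  # [True, False, True, False]
```

With these assumed limits, the second serverX alert at 10:08 is skipped by the host-level limit, and the serverZ alert at 10:20 is skipped by the workflow-level limit. The check and the record are kept separate so that a run blocked by one limit is not counted against the other.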

LindsayHill commented 7 years ago

/cc @sibirajal