elastic / kibana


[Meta] Alerting-Infra requirements #76914

Open gmmorris opened 4 years ago

gmmorris commented 4 years ago

This is a meta issue for the Alerting team to keep track of the different requirements raised by Infra.

Below is a table of requirements identified in @Crazybus's POC into using Alerting for Infra's needs:

| Status | Requirement | Owning Team | Relevant Issue(s) | Notes | Infra Priority |
| --- | --- | --- | --- | --- | --- |
| Open | Alerting is not separated out per monitor in Uptime Monitor Alert Type | @elastic/uptime | https://github.com/elastic/kibana/pull/74659 | Might have been addressed by https://github.com/elastic/kibana/pull/74659 🎉 | Medium |
| Closed. Should be delivered in 7.11 | Close a PagerDuty incident when a monitor's Alert resolves | @elastic/kibana-alerting-services | https://github.com/elastic/kibana/issues/49405 https://github.com/elastic/kibana/issues/76908 https://github.com/elastic/kibana/issues/77772 | We will first have to add Open | Blocker |
| Open | Separate criteria for "down" and "recovery" in Uptime Monitor Alert Type (error rate is over 75% for an event window, recovery requires 0% for the same event window) | @elastic/uptime ? | | | High |
| Open | There is no Certificate expiry Alert Type | (@elastic/uptime ?) | | Someone will have to develop this custom Alert Type. It might be APM, but I don't want to speak for them. 🤷‍♂️ | Low |
| Open | Programmatic creation of Alerts | @elastic/kibana-alerting-services | | We do provide a full HTTP API, and we have the CLI tool @pmuellr created, which isn't officially supported by the team. We will discuss this in the Alerting team. | Blocker |
| Open | Adjust the sensitivity/severity and action configuration of alerting per monitor | @elastic/uptime & @elastic/kibana-alerting-services | | This might require some investigation and clearer requirements. It should be possible as far as the framework is concerned, but would require work on the APM side. I suspect the implementation might be blocked on https://github.com/elastic/kibana/issues/64077, but that totally depends on how APM choose to implement this. | Medium |
| Open | Alerting logic that will handle flappy alerting with different collection durations | @elastic/uptime ? | | | High |
| Open | Ability to include extra information in the PagerDuty Action: links to documentation for troubleshooting the down service, or links to other places like our Inventory | @elastic/kibana-alerting-services | https://github.com/elastic/kibana/issues/76910 | | High |
| Open | Creating alerts per multiple columns in Metric Threshold Alert Type | @elastic/uptime ? | | | Medium |
| Open | Preview in Metric Threshold Alert Type | @elastic/uptime ? | | | Low |
| Open | Index Threshold doesn't support filtering | @elastic/kibana-alerting-services | https://github.com/elastic/kibana/issues/66046 | This hasn't been prioritised for 7.x, as we're hoping APM's AlertTypes can cover your needs, but I'll raise that this has come up again and we might reprioritise. | Medium |
| Open | There is no alert type which supports Elasticsearch Query DSL combined with Painless scripting | @elastic/kibana-alerting-services | https://github.com/elastic/kibana/issues/61313 | | High |

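
To make the separate "down" and "recovery" criteria concrete, here is a rough sketch of the hysteresis the requirement describes: a monitor goes down when the error rate for an event window is over 75%, and only recovers when the error rate for a window of the same size is 0%. This is purely illustrative and is not how the Uptime Monitor Alert Type is implemented. The names `MonitorState`, `Thresholds`, `errorRate` and `nextState` are hypothetical, and the sketch assumes each check in the window is reduced to a pass/fail boolean.

```typescript
// Hypothetical types for illustration only; none of these exist in Kibana.
type MonitorState = "up" | "down";

interface Thresholds {
  // Error rate strictly above this value marks the monitor as down (e.g. 0.75).
  downAbove: number;
  // Error rate at or below this value marks the monitor as recovered (e.g. 0).
  recoverAtOrBelow: number;
}

// Fraction of failed checks in the event window (true = failed check).
function errorRate(window: boolean[]): number {
  if (window.length === 0) {
    return 0;
  }
  const failures = window.filter((failed) => failed).length;
  return failures / window.length;
}

// Compute the next state from the current state and the latest event window.
// Between the two thresholds the state is left unchanged, which is what
// keeps a partially failing monitor from flipping back and forth.
function nextState(
  current: MonitorState,
  window: boolean[],
  thresholds: Thresholds
): MonitorState {
  if (window.length === 0) {
    return current; // no data: assume nothing has changed
  }
  const rate = errorRate(window);
  if (current === "up" && rate > thresholds.downAbove) {
    return "down";
  }
  if (current === "down" && rate <= thresholds.recoverAtOrBelow) {
    return "up";
  }
  return current;
}

// The criteria from the requirement: down over 75%, recovery at 0%.
const infraThresholds: Thresholds = { downAbove: 0.75, recoverAtOrBelow: 0 };
```

With these thresholds, a monitor that is down and then sees a window with one failure out of four stays down, because recovery requires a fully clean window.
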
mikecote commented 4 years ago

@gmmorris this is great! Can we also update the team dependencies on the alerting meta issue and add a section there? We'll be using that issue as the source of truth, but it can reference this issue for further details (similar to SIEM).