In the interest of breaking down the virtual "barrier" between so-called "observability" rules and alerts and "stack" rules and alerts, I'd like to suggest we start showing all of these rules and alerts in the Observability UI, encouraging users to use rule tagging to control which alerts are shown in the resulting alerts table.
Context
The alerts table on the Observability > Alerts page uses the <AlertsStateTable> component to pull in alerts that match a passed-in list of pre-selected "feature IDs" (which, behind the scenes, are mapped to values known as "producer" and "consumer").
Note: It's unclear why kibana.alert.rule.type, which I added here to demonstrate the rule type for the given alert, does not produce any value.
The list of feature IDs that we use to filter this set of alerts is this:
The main feature IDs currently left out of this list are "MONITORING" (for explicit Stack Monitoring rule types) and "STACK_ALERTS". Including the latter would bring in alerts with a "stack_alerts" producer/consumer pair. At the moment, that pair applies to rule types registered by the Response Ops team's code, e.g. the ES Query rule, whenever a rule of that type is created in the Stack Management section of the Kibana app.
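To make the filtering behavior concrete, here is a minimal sketch of how a feature ID list could exclude "stack_alerts" alerts. Everything in it is hypothetical: the `AlertSummary` shape, the `filterByFeatureIds` helper, and the three-entry list are illustrative only (this is not the real list used by the page, and not the real <AlertsStateTable> logic). It also assumes an alert matches when either its producer or its consumer is in the list, which I have not confirmed.

```typescript
// Hypothetical, simplified alert shape; real Kibana alert documents carry many more fields.
interface AlertSummary {
  ruleName: string;
  producer: string;
  consumer: string;
}

// Illustrative subset only; NOT the actual list used by the Observability alerts page.
const includedFeatureIds: string[] = ['observability', 'infrastructure', 'logs'];

// Assumption: an alert matches when either its producer or its consumer is in the list.
function filterByFeatureIds(alerts: AlertSummary[], featureIds: string[]): AlertSummary[] {
  return alerts.filter(
    (alert) =>
      featureIds.indexOf(alert.producer) !== -1 || featureIds.indexOf(alert.consumer) !== -1
  );
}

const sampleAlerts: AlertSummary[] = [
  { ruleName: 'Metric Threshold', producer: 'infrastructure', consumer: 'infrastructure' },
  { ruleName: 'Elasticsearch Query', producer: 'stack_alerts', consumer: 'stack_alerts' },
];

// Only the Metric Threshold alert survives; the Elasticsearch Query alert is dropped.
const visibleAlerts = filterByFeatureIds(sampleAlerts, includedFeatureIds);
console.log(visibleAlerts.map((alert) => alert.ruleName));
```

Under these assumptions, adding 'stack_alerts' to the list would bring the Elasticsearch Query alert back into the table.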
Problem
The problem with the current MONITORING and STACK_ALERTS feature IDs and producer/consumer values is that they mix up two different concepts:
1. Elastic Stack Monitoring rule types - these are rule types that are meant to explicitly monitor the Elastic stack. This would include all of the rule types within the MONITORING feature ID, as well as the "Transform Health" rule type contained within the STACK_ALERTS feature ID.
2. STACK_ALERTS rule types - these are rule types that happen to be registered by the "stack", i.e. the Response Ops Kibana plugins. This category includes the above-mentioned "Transform Health" rule type, but it also includes "Elasticsearch Query", "Index Threshold", and "Tracking Containment" rule types.
In reality, I think we have three different categories of rule types that are available for our customers to use.
1. Elastic Stack Monitoring rule types (see number 1 above)
2. Generic rule types - these are rule types that allow customers to build extremely flexible rules that use Elasticsearch queries to produce complicated rule scenarios. This includes "Elasticsearch Query", "Index Threshold", and "Tracking Containment" from the STACK_ALERTS feature ID as well as "Custom Threshold" from the OBSERVABILITY feature ID and, to some degree, the "Metric Threshold" and "Log Threshold" rule types from the INFRASTRUCTURE and LOGS feature IDs, respectively.
3. Specialized observability rule types - these are the rules that have been carefully set up to query observability data, so that a customer can monitor the applications and infrastructure they are observing with the Elastic observability toolset, e.g. all of the APM rule types, all of the synthetics and uptime rule types, etc.
Because the STACK_ALERTS feature ID mixes together two of these categories (Transform Health from the first category and Elasticsearch Query, Index Threshold, and Tracking Containment from the second category), omitting all alerts created by STACK_ALERTS rule types leads to a confusing situation where alerts are omitted from the observability alerts table for seemingly no discernible reason.
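The overlap can be sketched as a small lookup table. This is only an illustration of the categorization proposed in this issue: the `Category` names, the `RuleTypeInfo` shape, and the `categoriesForFeature` helper are all hypothetical and do not exist in Kibana. Only rule types named in this issue are listed.

```typescript
// Hypothetical category labels for the three categories proposed in this issue.
type Category = 'stack-monitoring' | 'generic' | 'specialized-observability';

interface RuleTypeInfo {
  name: string;
  featureId: string; // feature ID as named in this issue
  category: Category;
}

// Only rule types named in the issue text; not an exhaustive registry.
const ruleTypes: RuleTypeInfo[] = [
  { name: 'Transform Health', featureId: 'STACK_ALERTS', category: 'stack-monitoring' },
  { name: 'Elasticsearch Query', featureId: 'STACK_ALERTS', category: 'generic' },
  { name: 'Index Threshold', featureId: 'STACK_ALERTS', category: 'generic' },
  { name: 'Tracking Containment', featureId: 'STACK_ALERTS', category: 'generic' },
  { name: 'Custom Threshold', featureId: 'OBSERVABILITY', category: 'generic' },
  // The next two are generic only "to some degree", per the list above.
  { name: 'Metric Threshold', featureId: 'INFRASTRUCTURE', category: 'generic' },
  { name: 'Log Threshold', featureId: 'LOGS', category: 'generic' },
];

// Collect the distinct categories that appear under one feature ID.
function categoriesForFeature(featureId: string): Category[] {
  const seen: Category[] = [];
  for (const ruleType of ruleTypes) {
    if (ruleType.featureId === featureId && seen.indexOf(ruleType.category) === -1) {
      seen.push(ruleType.category);
    }
  }
  return seen;
}

// STACK_ALERTS spans two categories, so excluding it wholesale drops generic rules too.
console.log(categoriesForFeature('STACK_ALERTS'));
```

In this sketch, STACK_ALERTS is the only feature ID that maps to more than one category, which is the source of the confusion described above.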
The "consumer" value as a fix
There is a partial fix for this problem today: every rule that is instantiated has both a "producer" value and a "consumer" value. The "producer" value is static per rule type and represents where this rule type is registered in Kibana code. For example, the infra app registers the "Metric Threshold" rule and the "Log Threshold" rule, and each is given a static "producer" value (INFRASTRUCTURE and LOGS, respectively).
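As a rough sketch of the "static per rule type" idea: the producer is looked up from the rule type, while the consumer is supplied at creation time. The `producerByRuleType` map, the `RuleInstance` shape, and the `createRule` helper are hypothetical names for illustration, not Kibana APIs, and the consumer strings are placeholders rather than real values.

```typescript
// Hypothetical registry: each rule type has exactly one producer, fixed at registration.
const producerByRuleType: Record<string, string> = {
  'Metric Threshold': 'INFRASTRUCTURE',
  'Log Threshold': 'LOGS',
};

// Hypothetical, simplified rule shape.
interface RuleInstance {
  ruleType: string;
  producer: string;
  consumer: string;
}

// The producer comes from the rule type; the consumer is passed in by whoever creates the rule.
function createRule(ruleType: string, consumer: string): RuleInstance {
  const producer = producerByRuleType[ruleType];
  if (producer === undefined) {
    throw new Error(`Unknown rule type: ${ruleType}`);
  }
  return { ruleType, producer, consumer };
}

// Placeholder consumer values; the producer is the same regardless of the consumer.
const ruleFromPageA = createRule('Metric Threshold', 'placeholder-consumer-a');
const ruleFromPageB = createRule('Metric Threshold', 'placeholder-consumer-b');
console.log(ruleFromPageA.producer === ruleFromPageB.producer);
```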
However, the "consumer" value for a given rule is set based on where the rule was created. In other words, if you create a metric threshold rule from the Observability rules page (/app/observability/alerts/rules), its consumer will be set to "infrastructure" (copied from its producer). However, if you create that same rule from the Stack Management rules page (/app/management/insightsAndAlerting/triggersActions/rules)
NOTE: stopped here to confirm the above and ran into some issues, will clarify
NOTE: THIS ISSUE IS A DRAFT IN PROGRESS.