[Observability] [Alert Context] Streamline the method of saving group information in alert's context

benakansara commented 4 months ago

Different Observability rules expose group information as context variables in different ways. We need to streamline how group information is saved in the alert's context across all Observability rules.

Rules that have group info in some form in a "separate" context variable

Custom threshold rule - context.group as an array of { field: field-name, value: field-value }
Metric threshold rule - context.groupByKeys as an object, context.group as a string
Log threshold rule - context.groupByKeys as an object, context.group as a string
Inventory threshold rule - context.group as a string
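To make the difference between these shapes concrete, here is a minimal TypeScript sketch. The type and function names (GroupPair, GroupByKeys, toGroupPairs, toGroupByKeys) are hypothetical and not taken from the Kibana codebase, and groupByKeys is assumed to be a flat record of field name to value, which may not match what the rules actually emit.

```typescript
// Hypothetical names; only the general shapes come from the list above.
type GroupPair = { field: string; value: string };

// Custom threshold rule: context.group as an array of field/value pairs.
type CustomThresholdGroup = GroupPair[];

// Metric and Log threshold rules: context.groupByKeys as an object.
// Assumption: a flat record keyed by field name.
type GroupByKeys = Record<string, string>;

// Convert the object shape into the array shape.
function toGroupPairs(groupByKeys: GroupByKeys): CustomThresholdGroup {
  return Object.keys(groupByKeys).map((field) => ({
    field,
    value: groupByKeys[field],
  }));
}

// Convert the array shape back into the object shape.
function toGroupByKeys(pairs: CustomThresholdGroup): GroupByKeys {
  const result: GroupByKeys = {};
  for (const { field, value } of pairs) {
    result[field] = value;
  }
  return result;
}

const pairs = toGroupPairs({ 'service.name': 'Frontend', 'host.name': 'host-1' });
console.log(pairs, toGroupByKeys(pairs));
```

Because the two shapes carry the same information, either could serve as the common structure; the string form of context.group is not covered here since its exact format is not described in this issue.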

Rules without any group info in a "separate" context variable

APM Latency threshold rule - currently available as individual context variables, e.g. context.serviceName
APM Failed transaction rate threshold rule - currently available as individual context variables, e.g. context.serviceName
APM Error count rule - currently available as individual context variables, e.g. context.serviceName
Elasticsearch query rule - need to verify how group info is currently saved
SLO burn rate rule - need to verify how group info is currently saved

Acceptance Criteria

All Observability rules use the same context variable, with the same structure, to represent group information.

elasticmachine commented 4 months ago

Pinging @elastic/obs-ux-management-team (Team:obs-ux-management)

maryam-saeidi commented 4 months ago

Previously, I created an RFC for saving group information in the custom threshold rule. I hope we will use the same approach for future rules, but I'm not sure whether it makes sense to improve this for our legacy rules as well.

jasonrhodes commented 4 months ago

sorenlouv commented 4 months ago

Out of curiosity, if every group-by field is stored at the root level identifier, what is the use-case for also storing them under kibana.alert.group? Is it important to know that an alert was created as a result of a group-by rule?

Also, I don't see why we'd suggest that users search for alerts using the kibana.alert.group.* fields. The syntax for searching them is quite clunky and non-standard, and may produce unintended results:

kibana.alert.group.field : "service.name" and kibana.alert.group.value : Frontend

versus searching top level fields which is much more straightforward and ECS compliant:

service.name: Frontend
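One way the two-clause query could return unintended results can be sketched as follows. This assumes kibana.alert.group is indexed as a plain array of objects rather than as a nested type (an assumption, not verified against the actual alert mapping). In that case Elasticsearch flattens the array into independent value lists per sub-field, so the pairing between a field and its value is lost. The function names below are hypothetical and only simulate that behavior in memory.

```typescript
type GroupPair = { field: string; value: string };

// Simulate how a non-nested array of objects is indexed: each sub-field
// becomes an independent list of values, with no link between the two lists.
function flatten(group: GroupPair[]): { fields: string[]; values: string[] } {
  return {
    fields: group.map((g) => g.field),
    values: group.map((g) => g.value),
  };
}

// Simulate:
//   kibana.alert.group.field : "service.name" and kibana.alert.group.value : Frontend
function matchesQuery(group: GroupPair[], field: string, value: string): boolean {
  const flat = flatten(group);
  return flat.fields.indexOf(field) !== -1 && flat.values.indexOf(value) !== -1;
}

// Hypothetical alert: the service is "Backend", but the host is named "Frontend".
const alertGroup: GroupPair[] = [
  { field: 'service.name', value: 'Backend' },
  { field: 'host.name', value: 'Frontend' },
];

// Both clauses match, although no single pair is service.name = Frontend.
const falsePositive = matchesQuery(alertGroup, 'service.name', 'Frontend');
console.log(falsePositive); // true
```

A top-level field such as service.name does not have this problem, because the field name and its value are stored together.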

maryam-saeidi commented 4 months ago

> Out of curiosity, if every group-by field is stored at the root level identifier, what is the use-case for also storing them under kibana.alert.group? Is it important to know that an alert was created as a result of a group-by rule?

Yes, we need to know what the group-by fields are so we can use them in places like the alert details page to show the related app context.

The main challenge is that not all of the selected group fields are ECS compliant, so we cannot save them in the AAD directly. Doing so would require dynamic mappings for these fields, which can lead to exceeding the limit on the number of fields in a mapping. (More info in the related RFC.)

As a result, we decided to save this information as kibana.alert.group.field and kibana.alert.group.value. We can still save important fields at the root level (related issue), but not all of them by default.

adamkasztenny commented 3 months ago

I think using context.groupByKeys in all the alerts would be the best option. It's easier to understand by looking at the JSON alert definition and isn't brittle like using an array (where reordering the group by fields could break the alert text).
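The brittleness of positional access can be illustrated with a small TypeScript sketch. The message functions and sample values are hypothetical; they only mimic what an action message template would do when it reads the first array element versus a named key.

```typescript
type GroupPair = { field: string; value: string };

// Positional lookup, as a template would do with an array-shaped context.group.
function messageFromArray(group: GroupPair[]): string {
  return `Service ${group[0].value} is unhealthy`;
}

// Keyed lookup, as a template would do with an object-shaped context.groupByKeys.
function messageFromKeys(groupByKeys: Record<string, string>): string {
  return `Service ${groupByKeys['service.name']} is unhealthy`;
}

const original: GroupPair[] = [
  { field: 'service.name', value: 'Frontend' },
  { field: 'host.name', value: 'host-1' },
];
// The same group-by fields, selected in a different order.
const reordered: GroupPair[] = [original[1], original[0]];

console.log(messageFromArray(original));  // Service Frontend is unhealthy
console.log(messageFromArray(reordered)); // Service host-1 is unhealthy

// The keyed form does not depend on the order of the group-by fields.
console.log(messageFromKeys({ 'host.name': 'host-1', 'service.name': 'Frontend' }));
// Service Frontend is unhealthy
```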

Regardless of what we go with, it would be good to be consistent across alert types so creating alerts as JSON is easier and doesn't require a different format for each type.

maryam-saeidi commented 3 months ago

> I think using context.groupByKeys in all the alerts would be the best option. It's easier to understand by looking at the JSON alert definition and isn't brittle like using an array (where reordering the group by fields could break the alert text).

For context variables, I think adding context.groupByKeys makes sense. The only case we need to consider is how this would be handled if ResponseOps adds a feature for using AAD fields as context variables.

> Regardless of what we go with, it would be good to be consistent across alert types so creating alerts as JSON is easier and doesn't require a different format for each type.

Agree!

maryam-saeidi commented 3 months ago

@benakansara How about considering the ES Query rule in this ticket as well?

Also, what about the SLO burn rate rule? The related SLO also has a group-by field, and I wonder how the related information is saved in this case.

benakansara commented 3 months ago

@maryam-saeidi This ticket should cover all Observability rules (wherever applicable). It's probably a good idea to list the rules in the ticket description, so I'll update it accordingly.
