elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

[Custom threshold][Alert details page] Bug in alert details page main and history charts when rule is edited #181828

Open maryam-saeidi opened 5 months ago

maryam-saeidi commented 5 months ago

Summary

When a rule definition is updated from having no group by field to having one, the new rule parameters are saved in the recovered alert document. Instead, the recovered alert should keep the rule definition that was in place when the alert fired.

Steps to reproduce:

  1. Create a custom threshold rule without a group by field and let it fire an alert.
  2. Edit the rule and add a group by field.
  3. Run the rule again. The alert generated in the first step recovers, but its charts show data split by the new groups instead of the ungrouped data the alert was based on.

image

image

elasticmachine commented 5 months ago

Pinging @elastic/obs-ux-management-team (Team:obs-ux-management)

maryam-saeidi commented 4 months ago

Related to https://github.com/elastic/kibana/issues/183111

jasonrhodes commented 4 months ago

@maryam-saeidi with that related issue, do you think these will still need two separate fixes or one PR for fixing both?

jasonrhodes commented 4 months ago

After talking to @kobelb, the @elastic/response-ops-management-experiences team will look at fixing this (for recovered, maybe untracked also?)

maryam-saeidi commented 4 months ago

> @maryam-saeidi with that related issue, do you think these will still need two separate fixes or one PR for fixing both?

We can fix them together for both rules (custom threshold and metric threshold); we still need to check if we have similar issues with other rules.

ymao1 commented 4 months ago

@maryam-saeidi We'll be tackling this for 8.16

mikecote commented 4 months ago

Adding this to 8.16 planning. The fix we believe would solve 80% of the issues is to not update kibana.alert.rule.* when an alert recovers. We'll start with that; let us know if you don't think this would work.
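A minimal sketch of that idea, assuming alert documents are flat objects keyed by dotted field names, as in the alert document posted in this issue. `AlertDoc` and `buildRecoveredAlert` are hypothetical names used for illustration, not the alerting framework's actual API, and `groupBy` is an assumed parameter name.

```typescript
// Hypothetical flat alert document keyed by dotted field names.
type AlertDoc = Record<string, unknown>;

const RULE_FIELD_PREFIX = 'kibana.alert.rule.';

// Hypothetical helper: apply the fields from the latest rule run to the
// existing alert document, but keep every kibana.alert.rule.* field as it
// was stored while the alert was active.
function buildRecoveredAlert(existing: AlertDoc, latest: AlertDoc): AlertDoc {
  const recovered: AlertDoc = { ...existing };
  for (const [field, value] of Object.entries(latest)) {
    if (field.startsWith(RULE_FIELD_PREFIX)) {
      continue; // do not overwrite the rule definition on recovery
    }
    recovered[field] = value;
  }
  recovered['kibana.alert.status'] = 'recovered';
  return recovered;
}

// Alert as stored while active, before the rule was edited.
const active: AlertDoc = {
  'kibana.alert.status': 'active',
  'kibana.alert.rule.revision': 0,
  'kibana.alert.rule.parameters': { criteria: [] },
};

// Fields produced by the run after the edit (groupBy is an assumed name).
const afterEdit: AlertDoc = {
  'kibana.alert.rule.revision': 1,
  'kibana.alert.rule.parameters': { criteria: [], groupBy: ['host.name'] },
  'kibana.alert.end': '2024-10-04T15:04:33.521Z',
};

const recovered = buildRecoveredAlert(active, afterEdit);
```

Note that a prefix match on kibana.alert.rule.* also skips fields such as kibana.alert.rule.execution.uuid, so a real fix may need to treat execution fields separately.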

jasonrhodes commented 1 month ago

@mikecote just wanted to confirm if this is still planned as a fix for 8.16? if so, we'll remove from our board

mikecote commented 1 month ago

> @mikecote just wanted to confirm if this is still planned as a fix for 8.16? if so, we'll remove from our board

@jasonrhodes we have it as a should in our 8.16 plans, feel free to remove from your board 👍 if we don't get to it by 8.16 we'll keep track of the issue.

ymao1 commented 1 day ago

@maryam-saeidi I made a quick update to not copy the rule parameters to the recovered alert. The rule parameters on my recovered alert no longer include the group by field, but the chart still shows grouping. Not sure how the chart gets populated?

Alert metadata, including rule params without a group by: Image

Chart on alert details page still has grouping? Image

maryam-saeidi commented 1 day ago

@ymao1 Can this be related to the kibana.alert.group field?

Here is the code related to this chart.

ymao1 commented 1 day ago

Here is the alert document for the recovered alert:

{
  "_index": ".internal.alerts-observability.threshold.alerts-default-000001",
  "_id": "5d847a45-df5a-4522-b222-b8cbe52f6ab9",
  "_score": 1,
  "_source": {
    "kibana.alert.reason": "Document count is 33, above the threshold of 0. (duration: 1 hr, data view: event log)",
    "kibana.alert.evaluation.values": [
      33
    ],
    "kibana.alert.evaluation.threshold": [
      0
    ],
    "tags": [],
    "kibana.alert.rule.category": "Custom threshold",
    "kibana.alert.rule.consumer": "logs",
    "kibana.alert.rule.execution.uuid": "99269e93-2611-4e62-98f6-863e5d8658af",
    "kibana.alert.rule.name": "test",
    "kibana.alert.rule.parameters": {
      "criteria": [
        {
          "comparator": ">",
          "metrics": [
            {
              "name": "A",
              "aggType": "count"
            }
          ],
          "threshold": [
            0
          ],
          "timeSize": 1,
          "timeUnit": "h"
        }
      ],
      "alertOnNoData": false,
      "alertOnGroupDisappear": false,
      "searchConfiguration": {
        "query": {
          "query": "",
          "language": "kuery"
        },
        "index": "6df1535c-05a4-4f10-9815-7da255391588"
      }
    },
    "kibana.alert.rule.producer": "observability",
    "kibana.alert.rule.revision": 0,
    "kibana.alert.rule.rule_type_id": "observability.rules.custom_threshold",
    "kibana.alert.rule.tags": [],
    "kibana.alert.rule.uuid": "f0df21ca-3215-49c1-a062-9de5419213b7",
    "kibana.space_ids": [
      "default"
    ],
    "@timestamp": "2024-10-04T15:04:33.521Z",
    "event.action": "close",
    "event.kind": "signal",
    "kibana.alert.rule.execution.timestamp": "2024-10-04T15:04:33.521Z",
    "kibana.alert.action_group": "recovered",
    "kibana.alert.flapping": false,
    "kibana.alert.flapping_history": [
      true,
      true
    ],
    "kibana.alert.instance.id": "*",
    "kibana.alert.maintenance_window_ids": [],
    "kibana.alert.consecutive_matches": 0,
    "kibana.alert.status": "recovered",
    "kibana.alert.uuid": "5d847a45-df5a-4522-b222-b8cbe52f6ab9",
    "kibana.alert.severity_improving": true,
    "kibana.alert.workflow_status": "open",
    "kibana.alert.duration.us": 36555000,
    "kibana.alert.start": "2024-10-04T15:03:56.966Z",
    "kibana.alert.time_range": {
      "gte": "2024-10-04T15:03:56.966Z",
      "lte": "2024-10-04T15:04:33.521Z"
    },
    "kibana.version": "9.0.0",
    "kibana.alert.previous_action_group": "custom_threshold.fired",
    "kibana.alert.end": "2024-10-04T15:04:33.521Z"
  }
}

I don't see a kibana.alert.group field?
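To double-check a document like the one above, a small helper can list where grouping information appears in an alert's `_source`. This is only a sketch: `findGroupingSources` is a hypothetical helper, `groupBy` as the rule parameter name is an assumption, and `kibana.alert.group` is the field mentioned earlier in this thread.

```typescript
// Hypothetical flat alert source keyed by dotted field names.
type AlertDoc = Record<string, unknown>;

// Hypothetical helper: report which fields of an alert document carry
// grouping information. The groupBy parameter name is an assumption.
function findGroupingSources(doc: AlertDoc): string[] {
  const sources: string[] = [];
  if (doc['kibana.alert.group'] !== undefined) {
    sources.push('kibana.alert.group');
  }
  const params = doc['kibana.alert.rule.parameters'];
  if (typeof params === 'object' && params !== null && 'groupBy' in params) {
    sources.push('kibana.alert.rule.parameters.groupBy');
  }
  return sources;
}

// Trimmed copy of the recovered alert's _source from this comment.
const recoveredDoc: AlertDoc = {
  'kibana.alert.rule.parameters': {
    criteria: [{ comparator: '>', threshold: [0], timeSize: 1, timeUnit: 'h' }],
    alertOnNoData: false,
    alertOnGroupDisappear: false,
  },
  'kibana.alert.instance.id': '*',
  'kibana.alert.status': 'recovered',
};

const groupingSources = findGroupingSources(recoveredDoc);
```

For the trimmed document, neither field is present and the list is empty, so the grouping shown in the chart would have to come from somewhere other than this alert document.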