elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

Rule Details Page - Phase 2 #129878

Closed emma-raffenne closed 2 years ago

emma-raffenne commented 2 years ago

This is the continuation of the work started in #129777

Feature Description

The rule definition page currently shows some data to help users understand a rule and its execution. For users to fully understand the rule, its execution history, its impact, and its noisiness, and to manage it more effectively over time, further details need to be added to that page.

The following needs exist for the different personas using the rule details page.

Sub-personas or Elastic roles to keep in mind as the audience for this view:

  1. Rule creator/editor
  2. System administrator
  3. Alert consumer with view-only access to rule details view

Acceptance Criteria

  1. Clear indication of failed alerts, exceptions, and warnings. Where possible, include trends to indicate patterns of failures (and, in the future, include a high-level summary reason for errors)
  2. History of rule edits and changes (including muting and disabling)
  3. Trend of the number of alerts generated per rule execution and within a given time period, to give a quick indication of the "load" this rule places on users. It is important to see this from the user's perspective: how noisy is this alert for the user?
  4. Ability to clone the rule
  5. Ability to trigger the rule manually / ad hoc. Note that this is different from a "preview" feature, which lets a user preview the results of a rule execution and is useful when the user is creating or editing the rule
  6. Ability to zoom in to a specific time range to look into generated alerts for that rule - To be confirmed by @vinaychandrasekhar
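
To make criteria 3 and 6 concrete, here is a minimal sketch of how the per-execution alert trend for a selected time range could be computed. The `RuleExecution` shape, its field names, and `summarizeAlertLoad` are hypothetical and for illustration only; they are not taken from the Kibana codebase or its event log schema.

```typescript
// Hypothetical shape: one record per rule execution. Field names are
// illustrative and are not taken from Kibana's event log schema.
interface RuleExecution {
  timestamp: string; // ISO 8601 start time of the execution
  alertsGenerated: number; // number of alerts created by this execution
}

interface AlertLoadSummary {
  executions: number;
  totalAlerts: number;
  averagePerExecution: number;
  // Chronological series, suitable for plotting a trend line.
  perExecution: RuleExecution[];
}

// Summarize the alert "load" of a rule within [from, to] (inclusive).
function summarizeAlertLoad(
  history: RuleExecution[],
  from: Date,
  to: Date
): AlertLoadSummary {
  const perExecution = history
    .filter((e) => {
      const t = Date.parse(e.timestamp);
      return t >= from.getTime() && t <= to.getTime();
    })
    .sort((a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp));

  const totalAlerts = perExecution.reduce(
    (sum, e) => sum + e.alertsGenerated,
    0
  );

  return {
    executions: perExecution.length,
    totalAlerts,
    // Avoid dividing by zero when no execution falls in the range.
    averagePerExecution:
      perExecution.length === 0 ? 0 : totalAlerts / perExecution.length,
    perExecution,
  };
}
```

Zooming in to a time range would then amount to calling `summarizeAlertLoad` again with a narrower `from`/`to` window.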

See design at elastic/observability-design#137

Implementation

fkanout commented 2 years ago

The new design Figma file: https://www.figma.com/file/EE8RQphgSXcUqpvpPCybFu/Alerts-in-context---Rules-detail-page?node-id=806%3A412229

vinaychandrasekhar commented 2 years ago

Adding a couple of scenario tests. When we meet next, we can discuss how useful they are and how to make them better.

cc @emma-raffenne @fkanout @maryam-saeidi @maciejforcone

S1: A user is getting too many alerts from a rule someone else created. This user does not have write access to the rule. From one of those alert flyouts, they click to see the rule details.

Expected outcome: Upon entering the rule details page, the user must be able to see:

  1. Rule definition
  2. Frequency of execution
  3. Number of alerts created at each execution. Trends over time would be great!
  4. Clear indication of "continuous" alerting vs. alerting only on state change
  5. Indication of when this rule was last edited and by whom

S2: An Elastic admin is getting complaints from users that a rule is not detecting/firing alerts when it is supposed to. The admin user clicks on the alerts view, filters, and sees that there are in fact no alerts from that rule in the time period. The admin user clicks "Manage Rules" at the top and, in the rules view, filters to get to the rule of interest in the table, then selects "view rule details".

Expected outcome: Upon entering the rule details page, the user must be able to see:

  1. Rule definition, including frequency of execution
  2. Is the rule currently muted or snoozed? If so, show when the mute period started, who initiated it, and when it will lapse
  3. Rule health summary - should show the status of the last run, including when it last ran, how long it took, how many alerts it generated, and any indication of execution errors
  4. Number of alerts created at each execution. Trends over time would be great!
  5. Rule execution history for a selected time range. History should show: Do errors occur during execution? Do they occur always or only sometimes? Are the errors descriptive and actionable?
  6. Indication of when this rule was last edited and by whom
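
As a rough illustration of items 3 and 5 in S2, the sketch below derives a rule health summary from a list of execution records and classifies whether errors happen never, sometimes, or always. All names here (`ExecutionRecord`, `RuleHealthSummary`, `summarizeRuleHealth`) are hypothetical and assumed for the example; they do not describe an existing Kibana API.

```typescript
// Hypothetical execution record; field names are assumptions for this sketch.
interface ExecutionRecord {
  startedAt: string; // ISO 8601 timestamp
  durationMs: number;
  alertsGenerated: number;
  errorMessage?: string; // present only when the execution failed
}

type ErrorFrequency = "never" | "sometimes" | "always";

interface RuleHealthSummary {
  lastRunAt: string | null;
  lastRunDurationMs: number | null;
  lastRunAlerts: number | null;
  lastRunError: string | null;
  errorFrequency: ErrorFrequency;
}

function summarizeRuleHealth(records: ExecutionRecord[]): RuleHealthSummary {
  if (records.length === 0) {
    // No executions in the selected range: nothing to report.
    return {
      lastRunAt: null,
      lastRunDurationMs: null,
      lastRunAlerts: null,
      lastRunError: null,
      errorFrequency: "never",
    };
  }

  // Pick the most recent execution regardless of input order.
  const last = records.reduce((latest, r) =>
    Date.parse(r.startedAt) > Date.parse(latest.startedAt) ? r : latest
  );

  const failed = records.filter((r) => r.errorMessage !== undefined).length;
  const errorFrequency: ErrorFrequency =
    failed === 0 ? "never" : failed === records.length ? "always" : "sometimes";

  return {
    lastRunAt: last.startedAt,
    lastRunDurationMs: last.durationMs,
    lastRunAlerts: last.alertsGenerated,
    lastRunError: last.errorMessage ?? null,
    errorFrequency,
  };
}
```

Passing only the records that fall inside the selected time range gives the "always or sometimes" answer for that range.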

fkanout commented 2 years ago

Done