canonical / kratos-operator

Charmed Ory Kratos
https://charmhub.io/kratos
Apache License 2.0

Misleading availability alerts #154

Closed nsklikas closed 4 months ago

nsklikas commented 1 year ago

Enhancement Proposal

The KratosUnavailable-multiple alert rule (https://github.com/canonical/kratos-operator/blob/main/src/prometheus_alert_rules/kratos_unavailable.rule#L11) is misleading because it is triggered whenever more than one unit is unavailable.

IMHO the granularity of our availability alert rules (we have 4 levels of availability) does not make much sense, and I fail to see what they tell us about deployment availability. After discussing it with @shipperizer, I tend to agree that our alerts should be based on percentages of availability. Having 1 out of 3 units available is very different from having 1 out of 20 units available, and likewise having 1 out of 3 units unavailable is very different from having 1 out of 20 unavailable. The current alerts cannot tell these cases apart: both yield the same alerts. If our alerts were based on percentages, a single alert would give a much clearer view of the problem.
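To make the difference concrete, here is a minimal Python sketch (not the charm's actual rule logic) that contrasts a count-based check with a percentage-based one. The function names and the 50% threshold are hypothetical and chosen only for illustration.

```python
def unavailable_fraction(total_units: int, available_units: int) -> float:
    """Return the fraction of units that are unavailable (0.0 to 1.0)."""
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    if not 0 <= available_units <= total_units:
        raise ValueError("available_units must be between 0 and total_units")
    return (total_units - available_units) / total_units


def count_based_alert(total_units: int, available_units: int) -> bool:
    """Hypothetical stand-in for a rule that fires when more than one unit is down."""
    return (total_units - available_units) > 1


def percentage_based_alert(
    total_units: int, available_units: int, threshold: float = 0.5
) -> bool:
    """Fire when at least `threshold` of the units are down (threshold is an assumption)."""
    return unavailable_fraction(total_units, available_units) >= threshold


# Two deployments, each with two units down.
small = (3, 1)    # 3 units, 1 available
large = (20, 18)  # 20 units, 18 available

for total, available in (small, large):
    print(
        f"{available}/{total} available:",
        f"count-based={count_based_alert(total, available)},",
        f"percentage-based={percentage_based_alert(total, available)}",
    )
```

With two units down in each case, the count-based check fires for both deployments, while the percentage-based check fires only for the 3-unit deployment, where two thirds of the units are gone.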

syncronize-issues-to-jira[bot] commented 1 year ago

Thank you for reporting your feedback to us!

The internal ticket has been created: https://warthogs.atlassian.net/browse/IAM-488.

This message was autogenerated