Open fourstepper opened 1 month ago
I think it would make sense to convert a semantic severity to the Opsgenie or PagerDuty representation of the severity. E.g. always define it in critical
, high
, low
etc. You could then map critical
to SEV-1
for PagerDuty and P1
for OpsGenie
That's an interesting idea, although I'm a bit worried about keeping OSKO up to date with all available providers and their severity levels.
To make it more clear for an outside perspective - we are trying to implement the multi-window, multi-burn-rate alerting here in an automatic fashion based on an existing SLO. In theory (and in our practice) having this as a config option would be good enough, but other usecases might differ.
I would use the defaults as critical
, high-fast
, high-slow
, low
, and no_slo
. This is taken from the SRE book from the alerting chapter. They are using it to sort the SLOs, but we can also use it as a default severities.
Do we think we'll ever need to use more than one alerting tool at a time? 🤔
The defaults are one thing, but how to configure these is a whole other thing that I think is the bigger issue here 🤔
Regarding your question, that's really hard to say. I would focus on supporting one at a time for now. If we can stay backwards compatible with supporting one, setting the various severities in the config, for example, we can always add an option of multiple ones down the line.
something like this?
type AlertSeverities struct {
Critical string
HighFast string
HighSlow string
Low string
NoSlo string
}
alertingTools := []AlertingTool{
{
Name: "opsgenie",
SeverityMap: map[string]string{
"critical": "P1",
"high-fast": "P2",
"high-slow": "P3",
"low": "P4",
"no-slo": "P5",
},
},
{
Name: "pagerduty",
SeverityMap: map[string]string{
"critical": "SEV-1",
"high-fast": "SEV-2",
"high-slow": "SEV-3",
"low": "SEV-4",
"no-slo": "SEV-5",
},
},
}
Eh, I mean I guess we could do that :) Alternatively, we could just set this in the internal/config/config.go
Yes, I copy-pasted it from that file :D
Yeah, makes sense. Just not sure if we want to map these out directly.
Alternatively, we could have some "custom" type for those not using either of the ones we support (like opsgenie and pagerduty from the above example)
we can allow configuring it through the env vars, but it would not cover the cases where there is a need for multiple alerting tools for one triggered alert.
I wouldn't cover that for now, tbh
For example, in automatically generated rules, we already create two different severities in one alerting PrometheusRule
How to decide on what those should be? Opsgenie expects P1 - P4, PagerDuty might expect something else