Open De4dGho5t opened 1 week ago
Hi! 👋 There are two errors in the template here:
If an alert is missing the severity label then the severity will be empty. This happens because the template expects one to be present and doesn't set a default if the label is missing.
You are writing all the severities for all alerts in a single Pagerduty incident, for example criticalcriticalcritical, but Pagerduty only accepts one severity per incident.
Thank you for the quick answers.
For Ad. 1, I could add a default value to severity: '{{ range .Alerts }}{{ .Labels.severity | toLower }}{{ end }}'
so that some severity is always set. That seems like a good way to resolve it.
But for Ad. 2, do you have a suggestion for how I can fix it?
You should be able to use something like this https://github.com/prometheus/alertmanager/pull/3847#issuecomment-2133108415. You'll need to adapt it a little for Pagerduty, but otherwise it should solve the issue.
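As a rough illustration of the idea (not the content of the linked comment), here is a minimal sketch of a severity template that emits exactly one value per incident. It assumes the alerts in a group share a severity label, so that it appears in .CommonLabels, and it falls back to a fixed value when the label is missing, differs between alerts, or is not one of the four values PagerDuty accepts. The routing key is a placeholder, and the fallback of error is an arbitrary choice.

```yaml
receivers:
  - name: pagerduty-notifications        # receiver name as it appears in the error log
    pagerduty_configs:
      - routing_key: <your-routing-key>  # placeholder, not a real key
        # Read the severity shared by all alerts in the group (empty if the
        # label is missing or differs between alerts), lower-case it, and
        # pass it through only when PagerDuty would accept it.
        severity: '{{ $s := .CommonLabels.severity | toLower }}{{ if or (eq $s "critical") (eq $s "warning") (eq $s "error") (eq $s "info") }}{{ $s }}{{ else }}error{{ end }}'
```

If groups can contain alerts with different severities, adding severity to group_by in the route would keep .CommonLabels.severity populated. Otherwise those groups would always get the fallback value.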
I got these errors for some of the alerts in Alertmanager:
ts=2024-06-26T14:45:32.988Z caller=dispatch.go:353 level=error component=dispatcher msg="Notify for alerts failed" num_alerts=3 err="pagerduty-notifications/pagerduty[0]: notify retry canceled due to unrecoverable error after 1 attempts: unexpected status code 400: Event object is invalid: 'payload.severity' is invalid (must be one of the following: 'critical', 'warning', 'error' or 'info')"
Because the log doesn't say which alerts are problematic, here are screenshots from PagerDuty and the Alertmanager UI for comparison.
As you can see, CPUThrottlingHigh and KubePersistentVolumeFillingUp didn't show up in PD.
These are all the alerts from the Alertmanager API: alerts.json
The log says that the severity is invalid, but all alerts have correct severity values.
Currently I'm using kube-prometheus-stack version 60.2.0, so the Alertmanager version is:
quay.io/prometheus/alertmanager:v0.27.0
Alertmanager logs with debug enabled: alertmanager.log
My alertmanager config: