prometheus / prometheus

The Prometheus monitoring system and time series database.
https://prometheus.io/
Apache License 2.0

Configure the EndsAt Value in an Alert #14830

Open mmikitka-arcticwolf opened 2 months ago

mmikitka-arcticwolf commented 2 months ago

Proposal

High-level request: Configure the EndsAt property of an Alert via a field on a PrometheusRule K8s resource to allow for removing or overriding the EndsAt value. If the EndsAt value is not set, then the Alertmanager --resolve_timeout will be used.

The EndsAt field of an alert is currently hard-coded to 4x the resend-delay (see code).

juliusv commented 2 months ago

@mmikitka-arcticwolf could you clarify the use case / motivation for this?

mmikitka-arcticwolf commented 6 days ago

@juliusv The "EndsAt" value is coupled with the Alertmanager "--resolve_timeout" parameter, and we need to co-ordinate these values to reduce the chance of premature alert expiration, notably during maintenance windows.

Since the "EndsAt" property is part of a public interface (i.e., the alert payload), we would prefer more flexibility in this field so that we can better align the Prometheus and Alertmanager configurations.