grafana / cortex-jsonnet

Deprecated: see https://github.com/grafana/mimir/tree/main/operations/mimir instead
Apache License 2.0

Improved alert messages with Cortex cluster #351

Closed pracucci closed 3 years ago

pracucci commented 3 years ago

What this PR does: We don't reference the Cortex cluster consistently in alert messages. Sometimes we use the job label, sometimes the namespace, and other times we don't reference it at all. However, the mixin already supports configuring the labels used to "group metrics by cluster" via cluster_labels (or the deprecated alert_aggregation_labels).

In this PR I'm defining alert_aggregation_variables (based on cluster_labels/alert_aggregation_labels), which contains the Prometheus templating variables for the labels used to group by cluster, and using it in alert messages. For example, the diff for an alert is:

[Screenshot: diff for an example alert, 2021-07-02 16:39:51]
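
Since the diff is only available as a screenshot, here is a rough sketch of the idea described above. It is not the PR's actual code: the alert name, the message text, the label values, and the way alert_aggregation_variables is built (joining one `{{ $labels.<name> }}` template per cluster label with `/`) are all assumptions made for illustration. Only the names cluster_labels and alert_aggregation_variables come from the description.

```jsonnet
// Hypothetical sketch, not the actual mixin code.
{
  _config:: {
    // Labels that identify a Cortex cluster (example values, assumed).
    cluster_labels: ['cluster', 'namespace'],

    // One Prometheus template variable per cluster label, joined with '/'.
    // With the labels above this is assumed to produce:
    //   {{ $labels.cluster }}/{{ $labels.namespace }}
    alert_aggregation_variables: std.join(
      '/',
      ['{{ $labels.%s }}' % label for label in self.cluster_labels]
    ),
  },

  // Illustrative alert; the name and message are made up.
  example_alert: {
    alert: 'ExampleCortexAlert',
    annotations: {
      // The message references the cluster the same way in every alert,
      // instead of hard-coding job or namespace.
      message: 'Something is wrong in %(alert_aggregation_variables)s.' % $._config,
    },
  },
}
```

With this shape, changing cluster_labels in one place would change how every alert message identifies the cluster.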

I've checked the whole diff against our infra and it should be good.

Which issue(s) this PR fixes: N/A


pracucci commented 3 years ago

Were all those % broken before?

No. Once you use string interpolation on a string in jsonnet (the % operator), any literal % in that string has to be escaped as %%. I've manually checked the compiled output for all messages and it should be good 🤞
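
A minimal sketch of the escaping rule, using made-up field names and a made-up threshold value: a plain string keeps % as-is, while a string passed through the % formatting operator needs %% to emit a single literal %.

```jsonnet
// Hypothetical example; the field names and the threshold are illustrative.
{
  local config = { threshold: 80 },

  // No interpolation: a single % is a literal percent sign.
  plain: 'Usage is above 80%.',

  // Interpolation via the % operator: a literal percent sign is written as %%.
  formatted: 'Usage is above %(threshold)d%%.' % config,
}
```

Both fields are expected to compile to the same text, which is why messages that gained interpolation in this PR also gained doubled percent signs.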