What this PR does:
We don't reference a Cortex cluster consistently in alert messages. Sometimes we use the job label, sometimes the namespace, and other times we don't reference it at all. However, the mixin supports configuring the labels used to "group metrics by cluster" via cluster_labels (or the deprecated alert_aggregation_labels).
In this PR I'm defining alert_aggregation_variables (based on cluster_labels/alert_aggregation_labels), which contains the Prometheus templating variables for the labels used to group by cluster, and using it in alert messages. For example, the diff for an alert is:
I've checked the whole diff in our infra and it should be good.
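For illustration only (this is not the diff from this PR), a minimal sketch of how such a variable could be derived from the configured labels. It assumes cluster_labels is an array of label names; the example values, the '/' separator, and the alert message are hypothetical, and the real field type and defaults in the mixin may differ.

```jsonnet
// Hypothetical sketch: field names follow the PR description, bodies are assumed.
local config = {
  // Labels that together identify one cluster (example values, assumed).
  cluster_labels: ['cluster', 'namespace'],

  // One Prometheus template variable per label, joined with '/'.
  alert_aggregation_variables: std.join('/', [
    '{{ $labels.%s }}' % label
    for label in self.cluster_labels
  ]),
};

{
  // Hypothetical alert message that references the cluster consistently.
  message: 'Cortex cluster %s is failing requests.' % config.alert_aggregation_variables,
}
```

With these example labels, the message references the cluster as `{{ $labels.cluster }}/{{ $labels.namespace }}`, which Prometheus expands when the alert fires.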
Which issue(s) this PR fixes:
N/A
Checklist
[x] CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]
No. If you enable string interpolation in jsonnet, you have to escape % as %%. I've manually checked the compiled output for all messages and it should be good 🤞
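A small sketch of the escaping rule, using a hypothetical message rather than one from the mixin. The value of vars below is an assumed example of what alert_aggregation_variables could contain.

```jsonnet
// Assumed example value for the cluster template variables.
local vars = '{{ $labels.cluster }}/{{ $labels.namespace }}';

{
  // With the % operator, '%%' is emitted as a literal '%' and '%s' is replaced by vars.
  message: '%s is experiencing {{ printf "%%.2f" $value }}%% errors.' % vars,
}
```

In the compiled output, the Prometheus template keeps a single % in both `printf "%.2f"` and the trailing percent sign.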