Open sed-i opened 6 months ago
We don't think anyone is still relating cos-proxy to telegraf or filebeat. So while this is a bug, we might not need to fix it. We should ask known users if this bug is relevant to them and close it if it does not affect anyone.
Since MetricsEndpointAggregator is only used in cos-proxy, we should probably do this. We should also check whether the same behavior happens with nrpe; if it doesn't, we should close this ticket with no action.
Bug Description
In #409 we changed the following signature:
Note how the new signature expects a unit name instead of an app name. That makes sense, because group names must be unique within a rules file; otherwise promtool complains.
So iiuc, we need to correct all the usages of that function to pass in a unit name instead of an app name:
https://github.com/canonical/prometheus-k8s-operator/blob/4ca83b670f72f964e60c0ce69bd0f66b47016aaa/lib/charms/prometheus_k8s/v0/prometheus_scrape.py#L1858
https://github.com/canonical/prometheus-k8s-operator/blob/4ca83b670f72f964e60c0ce69bd0f66b47016aaa/lib/charms/prometheus_k8s/v0/prometheus_scrape.py#L2120
https://github.com/canonical/prometheus-k8s-operator/blob/4ca83b670f72f964e60c0ce69bd0f66b47016aaa/lib/charms/prometheus_k8s/v0/prometheus_scrape.py#L2144
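To illustrate why passing an app name can collide where a unit name does not, here is a minimal sketch. The helpers `group_name` and `duplicate_group_names` are hypothetical and only written for this illustration; they are not the library's functions, and the real group name format may differ.

```python
from collections import Counter


def group_name(identifier: str, suffix: str = "alerts") -> str:
    """Hypothetical helper: derive a rule group name from an app or unit name."""
    # Normalise characters that appear in Juju unit names ("/" and "-").
    return f"{identifier.replace('/', '_').replace('-', '_')}_{suffix}"


def duplicate_group_names(rules: dict) -> list:
    """Return the group names that occur more than once in a rules-file dict."""
    counts = Counter(group["name"] for group in rules.get("groups", []))
    return sorted(name for name, n in counts.items() if n > 1)


units = ["telegraf/0", "telegraf/1"]

# Passing the app name: both units end up with the same group name.
by_app = {
    "groups": [{"name": group_name(u.split("/")[0]), "rules": []} for u in units]
}

# Passing the unit name: each unit gets its own group name.
by_unit = {"groups": [{"name": group_name(u), "rules": []} for u in units]}

print(duplicate_group_names(by_app))   # ['telegraf_alerts']
print(duplicate_group_names(by_unit))  # []
```

A check like this could be run over the rules dict before it is written to disk, assuming the rules are available in the usual `groups` / `rules` layout.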
To Reproduce
Deploy prom + offer:
Deploy cos-proxy + peripheral charms:
Now, with the following in place:
tg - ub
cp - prom
run juju show-unit prom/0. You'll see that there are two duplicated group names with the same content. Confirm duplication with juju debug-log -i unit-prom-0 --replay | grep "Validating".
Environment
Relevant log output
Additional context
Rules files from cos-proxy are named in the prometheus workload container after the upstream unit. For example, the rules file for telegraf - cos-proxy - prometheus would be named after telegraf:
It's not enough for an alert group to be named differently: each alert definition must be unique on its own within the rule file, regardless of how it is nested under group names. Uniqueness is derived from alert name + alert labels. So the same alert name won't be flagged as duplicated if the juju_unit label in each is different.
Convert alert rules in relation data to yaml rule files:
The above would correspond to a single file on disk in the prometheus workload container.
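The uniqueness rule described earlier (alert name + alert labels, regardless of group nesting) can be sketched as a small check. This is an illustration under the assumption that the rules are held as a dict in the usual `groups` / `rules` layout; `alert_identity` and `find_duplicate_alerts` are hypothetical names, not part of promtool or the charm library.

```python
from collections import defaultdict


def alert_identity(rule: dict) -> tuple:
    """Identity of an alert: its name plus its sorted label pairs."""
    return (rule["alert"], tuple(sorted(rule.get("labels", {}).items())))


def find_duplicate_alerts(rules: dict) -> dict:
    """Map each duplicated alert identity to the groups it appears in."""
    seen = defaultdict(list)
    for group in rules.get("groups", []):
        for rule in group.get("rules", []):
            if "alert" in rule:  # ignore recording rules
                seen[alert_identity(rule)].append(group["name"])
    return {ident: groups for ident, groups in seen.items() if len(groups) > 1}


def make_rule(unit: str) -> dict:
    """Build an example alert rule labelled with the given juju_unit."""
    return {"alert": "HostDown", "expr": "up < 1", "labels": {"juju_unit": unit}}


# Different group names, identical alert name and labels: still a duplicate.
same_labels = {
    "groups": [
        {"name": "group_a", "rules": [make_rule("telegraf/0")]},
        {"name": "group_b", "rules": [make_rule("telegraf/0")]},
    ]
}

# Same alert name, different juju_unit label: not a duplicate.
different_units = {
    "groups": [
        {"name": "group_a", "rules": [make_rule("telegraf/0")]},
        {"name": "group_b", "rules": [make_rule("telegraf/1")]},
    ]
}

print(find_duplicate_alerts(same_labels))
print(find_duplicate_alerts(different_units))  # {}
```

In the first case the duplicate is reported even though the group names differ, which matches the point that renaming groups alone does not resolve the duplication.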