Libraries are not up to date with their remote counterparts. If this was not intentional, run `charmcraft fetch-libs` and commit the updated libs to your PR branch.
Bumped LIBPATCH to prep for Grafana Agent, but... this also handles deduplication as a bonus, and that's scary.

It "handles" it because, with the shortened topology identifier (so remote write matches up with direct relations), it just clobbers the file here. It's technically correct, which is the worst sort of correct, and it smells like a bug waiting to happen:
local = {
    'alerts_937d05a2_tester': {
        'groups': [{
            'name': 'alerts_937d05a2_tester_cpu_overuse_alerts',
            'rules': [{
                'alert': 'CPUOverUse',
                'annotations': {
                    'description': '{{$labels.instance}} of job{ $labels.job }} has used too much CPU.',
                    'summary': 'Instance {{ $labels.instance }} CPU overuse',
                },
                'expr': 'process_cpu_seconds_total{juju_application="tester",juju_charm="prometheus-tester",juju_model="alerts",juju_model_uuid="937d05a2-9412-46af-829b-c99c43fee856"} > 0.12',
                'for': '0m',
                'labels': {
                    'juju_application': 'tester',
                    'juju_charm': 'prometheus-tester',
                    'juju_model': 'alerts',
                    'juju_model_uuid': '937d05a2-9412-46af-829b-c99c43fee856',
                    'severity': 'Low',
                },
            }],
        }],
    },
}
remote = {
    'alerts_937d05a2_tester': {
        'groups': [{
            'name': 'alerts_937d05a2-9412-46af-829b-c99c43fee856_grafana-agent_grafana-agent-k8s_alerts_937d05a2_tester_cpu_overuse_alerts_alerts',
            'rules': [{
                'alert': 'CPUOverUse',
                'annotations': {
                    'description': '{{$labels.instance}} of job{ $labels.job }} has used too much CPU.',
                    'summary': 'Instance {{ $labels.instance }} CPU overuse',
                },
                'expr': 'process_cpu_seconds_total{juju_application="tester",juju_charm="prometheus-tester",juju_model="alerts",juju_model_uuid="937d05a2-9412-46af-829b-c99c43fee856"} > 0.12',
                'for': '0m',
                'labels': {
                    'juju_application': 'tester',
                    'juju_charm': 'prometheus-tester',
                    'juju_model': 'alerts',
                    'juju_model_uuid': '937d05a2-9412-46af-829b-c99c43fee856',
                    'severity': 'Low',
                },
            }],
        }],
    },
}
Are these the same? No, they are not, and they won't be until Grafana Agent gets the new remote write which uses the "short" topology identifier: `{identifier}["groups"][0]["name"]` differs. Should the filename/identifier be the same then? Probably not.
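For what it's worth, a quick way to see that the two payloads above differ only in the group name (everything else is byte-for-byte identical) is something like the following. `rules_equal_ignoring_name` is a throwaway helper for illustration, not anything in the library:

```python
# Throwaway helper (not library code): compare the two rule payloads
# while ignoring the topology-derived group names.
import copy


def rules_equal_ignoring_name(left: dict, right: dict) -> bool:
    """True if two rule-file dicts match once group names are dropped."""
    left, right = copy.deepcopy(left), copy.deepcopy(right)
    for payload in (left, right):
        for rule_file in payload.values():
            for group in rule_file["groups"]:
                group.pop("name", None)
    return left == right


# With the `local` and `remote` dicts above this returns True: the rules
# are identical, only {identifier}["groups"][0]["name"] differs.
```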
We should dedupe, but we should also think through a pattern for not accidentally clobbering files. While I can't think of a set of relations right now which would lead to `{juju_model}_{juju_model_uuid_short}_{juju_application}` as a filename with different contents, I'm sure someone smarter than me can contrive a topology where a "direct" relation and an indirect one (cos-config -> prometheus-scrape-* -> grafana-agent -> prometheus, maybe) yield different rules under the same identifier.
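As a starting point for that follow-up issue, the guard I have in mind is roughly the sketch below. It is hypothetical and hand-wavy; `write_rule_file`, the filename pattern, and the JSON layout are made up for illustration, not what the charm does today:

```python
# Hand-wavy sketch of a clobber guard, not actual charm code.
import json
from pathlib import Path


def write_rule_file(rules_dir: Path, identifier: str, content: dict) -> None:
    path = rules_dir / f"juju_{identifier}.rules"
    if path.exists():
        existing = json.loads(path.read_text())
        if existing != content:
            # Two relation paths produced the same identifier but
            # different rules -- the clobbering case described above.
            raise ValueError(f"refusing to overwrite {path} with different rules")
    path.write_text(json.dumps(content, indent=2))
```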
I want to open a new issue to fix that along with the dedupe, but I cannot think of a way to make it happen, so maybe it's just paranoia and this is actually ok. @balbirthomas @sed-i @dstathis @simskij, can one of you contrive a way to bust it?
Issue
In moving to `observability_libs.juju_topology` in #297, `topology.render` was also removed. The code was stubbed out with `CosTool.inject_label_matchers`, but since `Provider` and `Consumer` are inverted in `prometheus_remote_write`, the label injection never actually happened, which the existing integration/unit tests did not catch.
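For anyone skimming: "label injection" here means scoping every rule expression with the charm's Juju topology labels. The real code path goes through `CosTool.inject_label_matchers` (which wraps the cos-tool binary); the toy helper below only exists to show the intent, and it only handles the trivial case of a bare metric name:

```python
# Toy illustration of label injection, NOT the real CosTool code path.
# It only handles a bare metric name; cos-tool rewrites arbitrary PromQL.
def inject_topology_matchers(expr: str, topology: dict) -> str:
    matchers = ",".join(f'{k}="{v}"' for k, v in sorted(topology.items()))
    return f"{expr}{{{matchers}}}" if "{" not in expr else expr


topology = {
    "juju_model": "alerts",
    "juju_model_uuid": "937d05a2-9412-46af-829b-c99c43fee856",
    "juju_application": "tester",
}
print(inject_topology_matchers("up", topology))
# up{juju_application="tester",juju_model="alerts",juju_model_uuid="937d05a2-9412-46af-829b-c99c43fee856"}
```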
Solution

Use the new `JujuTopology` for the identifier, and add label injection to the "Provider" (Prometheus) side of the library, where we know it will be present. Add an integration test, since this cannot be reliably detected in unit tests.
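For reference, the "short" identifier is (as far as I understand `JujuTopology.identifier`, and matching the `alerts_937d05a2_tester` key above) just the model name, the first eight characters of the model UUID, and the application name. A rough equivalent, shown only to make the key format explicit:

```python
# Rough equivalent of the "short" identifier; the real value comes from
# JujuTopology.identifier, this is only to make the format explicit.
def short_identifier(model: str, model_uuid: str, application: str) -> str:
    return "{}_{}_{}".format(model, model_uuid[:8], application)


print(short_identifier("alerts", "937d05a2-9412-46af-829b-c99c43fee856", "tester"))
# alerts_937d05a2_tester
```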
Release Notes

Fix label injection in `prometheus_remote_write`