Open dstathis opened 11 months ago
First, we should validate the scrape jobs with promtool
; if we find one that's malformed, we should set the charm to Blocked
. We don't want to stop Prometheus, because having Blocked
is better than an outage.
If schema validation fails for the scrape jobs coming from one relation, we omit those scrape jobs from the final configuration, and we set the charm to Blocked
.
We need to add the same behavior in Grafana Agent, because that can also scrape metrics. We should probably have some helper function in the Prometheus library to handle that.
Bug Description
When a certain set of scrape jobs are deployed, Our scrape job validation is "fooled" and the scrape jobs are written to disk causing Prometheus to fail.
To Reproduce
Deploy the attached bundle and relate to cos. (adjust saas section as needed) machine_model_bundle.txt Here is the charm used in the bundle in case the branch goes away. (remove
.txt
file extension) grafana-agent_ubuntu-22.04-amd64.charm.txtEnvironment
Relevant log output
Additional context
No response