awoods187 opened this issue 4 years ago
Doesn’t Prometheus already add a label to indicate the service from which a metric was scraped?
I’d love to better understand the surrounding context. I don’t know if such prefixes are actually common or standard. It also seems like a non-starter to change our metric names at this point; unfortunately, that would break all of our published dashboards. Perhaps we could offer a flag to optionally prefix the metric names.
In my limited experience, these prefixes are near-universal: many client libraries call them "namespaces" and require them. See https://prometheus.io/docs/practices/naming/.
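To make the "namespace" convention concrete, here is a stdlib-only sketch of how Prometheus client libraries typically assemble a fully qualified metric name from namespace, subsystem, and name parts (this mirrors the convention, not any library's actual code; the `cockroach`/`sql`/`conns` values are illustrative):

```go
package main

import (
	"fmt"
	"strings"
)

// buildFQName joins the non-empty parts of (namespace, subsystem, name)
// with underscores, the way Prometheus client libraries conventionally
// build a fully qualified metric name. Illustrative sketch only.
func buildFQName(namespace, subsystem, name string) string {
	if name == "" {
		return ""
	}
	var parts []string
	for _, p := range []string{namespace, subsystem, name} {
		if p != "" {
			parts = append(parts, p)
		}
	}
	return strings.Join(parts, "_")
}

func main() {
	fmt.Println(buildFQName("cockroach", "sql", "conns")) // cockroach_sql_conns
}
```

With a namespace in place, every metric sorts together in the expression browser instead of being scattered through the alphabet.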
> Doesn’t Prometheus already add a label to indicate the service from which a metric was scraped?
I haven't run a Prometheus server of my own in a while, but my recollection is that job labels are part of the scraping configuration - it's the operator's responsibility to assign meaningful job names to pools of scraped instances. As more cloud-hosted, Prometheus-compatible systems emerge, it might be safer not to rely exclusively on careful operator configuration.
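For context, this is what that operator-side configuration looks like; a minimal sketch (the `job_name` and target are illustrative, and `/_status/vars` is assumed to be CockroachDB's metrics endpoint). The `job` label on every scraped series comes from the operator's choice of `job_name`, not from the application itself:

```yaml
scrape_configs:
  - job_name: cockroachdb        # operator-chosen; becomes the job label
    metrics_path: /_status/vars  # CockroachDB's metrics endpoint (assumed)
    static_configs:
      - targets: ['localhost:8080']
```

A careless or default `job_name` here leaves the scraped metrics with no useful indication of where they came from, which is the failure mode a baked-in prefix would guard against.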
> It also seems like a non-starter to change our metric names at this point. Perhaps we could offer a flag to optionally prefix the metric names. Unfortunately it would break all of our published dashboards.
Changing metric names would break backward compatibility, but it's certainly possible for CockroachDB to add a `service: cockroach` label (or something similar) to all exported metrics.
That's all fine (original requester here!), but in the Prometheus "UI" your metrics are scattered all over the place and very hard to identify. Adding a label won't fix that, because you'd have to inspect every metric to see whether it carries the label.
You can see from the attached screenshots how Prometheus groups its own metrics (prometheus and node) whereas Cockroach metrics are first, last and everywhere in-between.
In fact, it occurs to me that some of the metrics in the node and prometheus groups could actually be Cockroach metrics and I'd never know!
> I’d love to better understand the surrounding context. I don’t know if such prefixes are actually common or standard. It also seems like a non-starter to change our metric names at this point. Perhaps we could offer a flag to optionally prefix the metric names. Unfortunately it would break all of our published dashboards.
Grafana Dashboards or what? That's just JSON, isn't it? Quick bit of sed and Bob's your aunty's live-in lover.
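The kind of one-liner meant here would look roughly like this (a sketch: `dashboard.json` and the metric name `sql_conns` are made-up placeholders, and `cockroach_` is the hypothetical prefix):

```shell
# Rewrite a bare metric name to its prefixed form throughout an exported
# Grafana dashboard. Placeholders: sql_conns, dashboard.json.
sed -i 's/\bsql_conns\b/cockroach_sql_conns/g' dashboard.json
```

In practice you'd run one such substitution per renamed metric, or generate them from the full metric list, and `\b` (a GNU sed word boundary) keeps already-prefixed names from being rewritten twice.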
As someone who was part of an (un)civil war about naming conventions, I can assure you that they are standard and, it turns out, quite useful.
> Grafana Dashboards or what? That's just JSON, isn't it? Quick bit of sed and Bob's your aunty's live-in lover.
I’m not concerned about the dashboards for new deployments, I’m concerned about breaking dashboards and alerting for pre-existing clusters when they upgrade. We’ll need to be careful about the migration.
My proposal is that we add a flag and environment variable which will allow users to add a prefix to all metrics. That way we won’t break existing deployments but we’ll have a viable workaround for new deployments. Thoughts?
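The proposal above could be sketched like this (everything here is hypothetical: the `COCKROACH_METRIC_PREFIX` variable name and the `metricName` helper are illustrative, not an actual CockroachDB API):

```go
package main

import (
	"fmt"
	"os"
)

// metricName prepends an optional operator-supplied prefix to a metric
// name. An empty prefix leaves names unchanged, so existing deployments,
// dashboards, and alerts keep working by default.
func metricName(prefix, name string) string {
	if prefix == "" {
		return name
	}
	return prefix + "_" + name
}

func main() {
	// Hypothetical env var; a command-line flag could feed the same value.
	prefix := os.Getenv("COCKROACH_METRIC_PREFIX")
	fmt.Println(metricName(prefix, "sql_conns"))
}
```

The key design point is the default: with no flag or env var set, output is byte-for-byte what it is today, so the migration risk falls only on operators who opt in.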
Sounds good to me.
We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!
This remains a thing.
A Slack user requested:
Jira issue: CRDB-5250