Open bwplotka opened 4 years ago
It is perfectly valid to have a counter without _total in the Prometheus exposition format, though not encouraged.
There is already some validation of metric names. And there is an established way of reporting errors. (They are reported during registration time of a metric, not during construction time. This might not be the most intuitive way, but error handling has organically grown here. It can be rethought for v2, but right now, we cannot really change it in a breaking way.)
Having said that, because the _total
suffix is just a convention, not a requirement, introducing the requirement now would be a breaking change.
Things are a bit different on OpenMetrics. For that, there is #683 (which I still have to complete with my thoughts how to implement this without breaking a lot of current users, but input of others is welcome).
For v2, we can go wild, of course, and we could elect to be more restrictive for naming. For the current world order, I think the approach with promlint
is the right way to test for legal but discouraged metric names.
I'd suggest to move this issue to the v2 milestone as it cannot really be implemented within v1. And I'd also word it as "Consider to…" as it still needs to be discussed if we want to be more restrictive in naming than what the exposition format allows. (Even OpenMetrics doesn't enforce seconds
in the metric name of a histogram of durations.)
Note that there is also #426 which deals with formal validations that are currently not even possible during registration time (but will happen during scrape time).
I adjusted the issue according to my suggestions above.
I guess if something is just a convention, it would be bad if an instrumentation library enforce it. But perhaps we can have an option somewhere in v2 where a user can ask the library to check/enforce certain conventions.
We don't have to decide that now. v2 needs more fundamental design decisions first. (I wanted to turn my Gophercon EU talk into a straw poll about opinions in the Go community, but oh well, the conference got postponed…)
Somehow related: https://github.com/thanos-io/thanos/issues/2430
Hi,
I think it makes sense to have this verification on client-side in official Prometheus library - some rules are must-have like
_total
. I guess the problem is how to error... probably panic is the only option?Alternatively, we can implement it on our (e.g Thanos side). Plus we can be even more opinionated. Some wrapper that will panic if:
_total
suffix_seconds
andNewSizeHistogram
invoked without_bytes
(we only have two types).Alternatively (3rd)
NewCounter
could always add_total
if not specified, but that's surprising. Sometimes we discover metrics from code.Use case: 10000 bugs where we forgot to ensure
_total
of counter suffix in PR review time...