prometheus / client_golang

Prometheus instrumentation library for Go applications
https://pkg.go.dev/github.com/prometheus/client_golang
Apache License 2.0
5.21k stars 1.15k forks source link

Consider (optionally?) enforcing certain metric naming conventions #725

Open bwplotka opened 4 years ago

bwplotka commented 4 years ago

Hi,

I think it makes sense to have this verification on client-side in official Prometheus library - some rules are must-have like _total. I guess the problem is how to error... probably panic is the only option?

Alternatively, we can implement it on our (e.g Thanos side). Plus we can be even more opinionated. Some wrapper that will panic if:

Alternatively (3rd) NewCounter could always add _total if not specified, but that's surprising. Sometimes we discover metrics from code.

Use case: 10000 bugs where we forgot to ensure _total of counter suffix in PR review time...

brian-brazil commented 4 years ago

It is perfectly valid to have a counter without _total in the Prometheus exposition format, though not encouraged.

beorn7 commented 4 years ago

There is already some validation of metric names. And there is an established way of reporting errors. (They are reported during registration time of a metric, not during construction time. This might not be the most intuitive way, but error handling has organically grown here. It can be rethought for v2, but right now, we cannot really change it in a breaking way.)

Having said that, because the _total suffix is just a convention, not a requirement, introducing the requirement now would be a breaking change.

Things are a bit different on OpenMetrics. For that, there is #683 (which I still have to complete with my thoughts how to implement this without breaking a lot of current users, but input of others is welcome).

For v2, we can go wild, of course, and we could elect to be more restrictive for naming. For the current world order, I think the approach with promlint is the right way to test for legal but discouraged metric names.

I'd suggest to move this issue to the v2 milestone as it cannot really be implemented within v1. And I'd also word it as "Consider to…" as it still needs to be discussed if we want to be more restrictive in naming than what the exposition format allows. (Even OpenMetrics doesn't enforce seconds in the metric name of a histogram of durations.)

Note that there is also #426 which deals with formal validations that are currently not even possible during registration time (but will happen during scrape time).

beorn7 commented 4 years ago

I adjusted the issue according to my suggestions above.

I guess if something is just a convention, it would be bad if an instrumentation library enforce it. But perhaps we can have an option somewhere in v2 where a user can ask the library to check/enforce certain conventions.

We don't have to decide that now. v2 needs more fundamental design decisions first. (I wanted to turn my Gophercon EU talk into a straw poll about opinions in the Go community, but oh well, the conference got postponed…)

bwplotka commented 4 years ago

Somehow related: https://github.com/thanos-io/thanos/issues/2430