prometheus / client_golang

Prometheus instrumentation library for Go applications
https://pkg.go.dev/github.com/prometheus/client_golang
Apache License 2.0
5.43k stars 1.18k forks source link

Feature: Expose a request path label in the `http_request_*` metric by default #491

Closed zephinzer closed 6 years ago

zephinzer commented 6 years ago

Currently, metrics are exposed as:

http_request_duration_microseconds{handler="users",quantile="0.5"} 401.237
...
http_request_duration_microseconds{handler="users",quantile="0.5"} 652.793
...
http_request_duration_microseconds{handler="users",quantile="0.5"} 652.793

I would like to add on to this by also allowing for a path label to be added so that each route does not need to be manually instrumented with a handler name. I'm proposing that the default collection should result in:

http_request_duration_microseconds{handler="users",path="/users/logout",quantile="0.5"} 401.237
...
http_request_duration_microseconds{handler="users",path="/users/login",quantile="0.5"} 652.793
...
http_request_duration_microseconds{handler="users",path="/users/status",quantile="0.5"} 652.793
...
# other examples I think would be useful
http_requests_total{code="200",handler="users",path="/users/status",method="get"} 3
promhttp_metric_handler_requests_total{path="/some/other/endpoint",code="200"} 2
promhttp_metric_handler_requests_total{path="/another/endpoint",code="500"} 0
promhttp_metric_handler_requests_total{path="/yet/another/endpoint",code="503"} 0

This could be done by adding r.URL.Path to the labels creation as a path label.

I'm new to Golang but would be happy to attempt this with some guidance!

Any opinions?

beorn7 commented 6 years ago

Yeah, this would be pretty straight-forward to add.

IIRC, when implementing the middlewares, we were concerned that it's very easy to create cardinality explosions. @stuartnelson3 do you remember any of such concerns?

stuartnelson3 commented 6 years ago

That was our concern. Some users could be putting user IDs or some other non-bounded set of data in the path, so we decided to avoid potentially destroying our users' prometheus servers.

beorn7 commented 6 years ago

@stuartnelson3 and i discussed this over lunch. Most importantly, my memory served my right: Back then, we deliberately did not include this feature. We indeed had the concerns mentioned above.

In particular, the requested feature only makes sense with handlers that handle multiple paths. (If a handler only handles one path, you can curry the metric vector with a path label before using it with the instrumentation middleware.) But that's exactly the case where you run into the risk of unintentionally creating many different label values. For example, a malicious (or unintentionally broken) client could bombard your services with random URLs and thus create a new time series with each request. To protect against this, one might try to exclude 404s from the metric. But then we are getting fairly specific, which I would see out of scope of the instrumentation middlewares. Those middlewares are meant to be convenience function to handle standard situation in an easy way. If things get specific, it might just be best to instrument in the conventional way, as explained here. This way is not that complicated, so that I would be wary of adding to much power features to the convenience middlewares in promhttp.

Or in yet other words: The convenience middlewares in promhttp are provided so that it is easy to instrument an http.Handler in simple scenarios, even if you don't know a lot about instrumentation. Adding an automatic path label is however an easy way shoot yourself into the foot, especially for naive users. It would be useful for more involved users, but for those, it would also be easy to use the direct instrumentation as described in the linked talk.

Thus, I'm on the fence if that feature would be a net gain.

zephinzer commented 6 years ago

Agree with the potential behaviour of having URLs - URL fuzzing by some common pentesting tools might cause this too. However, (correct me if i'm wrong, but) I believe this can be mitigated during the metrics extraction from Prometheus to a dashboard (Grafana in our case) via promql?

For example in our setup, we are monitoring the duration and status codes by endpoints which we've identified as critical for the business front-end. The exact endpoints tend to change based on business direction and it's useful for us to have a path label so that the monitoring and code is de-coupled.

But to summarise, I think you might be recommending that this be a third party instrumentation package instead of being in the core code? I noticed that such functionality is also not included in the core of the NodeJS but as a promclient (https://github.com/siimon/prom-client) package, so it could be in line with your fundamentals.

beorn7 commented 6 years ago

Prometheus itself would suffer. The TSDB of Prometheus has a very low price per sample (usually between one and two bytes) but a much much higher price per serios (kilobytes). Thus, your server would explode.

I'm not suggesting to use a 3rd party package. All the tools you need are included in client_golang. You should just not use the convenience-middlewares but the usual instrumentation primitives as described in the linked video.

zephinzer commented 6 years ago

cool, thanks for the advice on the potential effects on tsdb @beorn7! I gave it a watch and have managed to get up my own middleware to do something like what I was requesting for, am closing this-

but if anyone else is looking for such a convenience middleware despite the above warnings, I've published it at https://github.com/zephinzer/ezpromhttp!

pavelpatrin commented 1 year ago

I have a case, when to provide request path as a label is enough useful: gRPC Gateway server which is registered by RegisterXxxHandlerServer() call. It is impossible to produce metrics with gRPC middlewares, they are not called.