akoutmos / prom_ex

An Elixir Prometheus metrics collection library built on top of Telemetry with accompanying Grafana dashboards
MIT License

[Question] Is it possible to attach context to Phoenix metrics? #219

Closed by elvanja 11 months ago

elvanja commented 11 months ago

The goal here is to compute the average number of requests to a given route per logged-in user. The total number per route is already available, e.g. round(sum(increase(myapp_prom_ex_phoenix_http_requests_total{path="/api/v1/route_of_interest", method="GET", status="200"}[24h]))).

Do I need to write a custom plugin for this? I expect it would be little more than a copy of PromEx.Plugins.Phoenix with a user ID added in get_conn_tags. Or is there a way to reuse the existing plugin?

akoutmos commented 11 months ago

Hey Vanja!

Typically with Prometheus you try to keep the cardinality of your metrics as low and as bounded as possible. That is, values that can take a large number of distinct values (a system may typically have hundreds or thousands of user_ids) should not be added as labels, because Prometheus stores each unique combination of label values as its own time series. If your application only has tens of users, you can definitely add that label, but you will need to create your own Phoenix plugin to do so (you can easily fork the Phoenix plugin that ships with PromEx to achieve this).

Hope that helps!

PS: Here is a great blog post on label cardinality as it relates to Prometheus: https://www.robustperception.io/cardinality-is-key/
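For readers who do want the extra label, here is a minimal sketch of what such a plugin could look like. It is not the code of PromEx.Plugins.Phoenix, just a small standalone plugin that counts the same Phoenix router event with a user_id tag. The module name, the metric name, and the assumption that the logged-in user lives in conn.assigns.current_user are all hypothetical and need adapting to the application.

```elixir
defmodule MyApp.PromEx.UserRequests do
  # Hypothetical plugin: counts requests per route, method, and user.
  use PromEx.Plugin

  @impl true
  def event_metrics(_opts) do
    Event.build(
      :myapp_user_request_event_metrics,
      [
        counter(
          # Exposed as myapp_prom_ex_phoenix_http_user_requests_total
          [:myapp, :prom_ex, :phoenix, :http, :user_requests, :total],
          event_name: [:phoenix, :router_dispatch, :stop],
          description: "Number of requests per route and logged-in user.",
          tags: [:path, :method, :user_id],
          tag_values: &__MODULE__.user_tags/1
        )
      ]
    )
  end

  # Builds the label values from the telemetry event metadata.
  # Assumption: authentication stores the user in conn.assigns.current_user.
  def user_tags(%{conn: conn} = metadata) do
    user_id =
      case conn.assigns do
        %{current_user: %{id: id}} -> to_string(id)
        _ -> "anonymous"
      end

    %{
      path: Map.get(metadata, :route, "unknown"),
      method: conn.method,
      user_id: user_id
    }
  end
end
```

The plugin would then be listed in the plugins/0 callback of the application's PromEx module alongside the stock Phoenix plugin. Keep in mind that every distinct user_id creates a new time series for each route and method combination, so this is only reasonable when the number of users is small and bounded.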

elvanja commented 11 months ago

I am not sure cardinality would be an issue here, since we are dealing with about 200 users. The number is unlikely to go much higher; let's say 1000 users at most. More to the point, there will likely never be more than 50 active ones, and I think that is the number that matters, since inactive users will not have any metrics exposed.

That being said, it does make more sense to calculate an average locally and expose just a single *_average metric. That way there is always only one metric and cardinality is not an issue. We don't need a per-user breakdown anyway.

Thank you for your time and effort! ❤️
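The "calculate the average locally" idea can be sketched roughly as follows. This is only an illustration under assumptions: the module name is hypothetical, and it presumes the application already keeps per-user request counts for the window of interest in a map of user ID to count. How those counts are collected is not covered in the thread.

```elixir
defmodule MyApp.RequestAverages do
  # Hypothetical helper: average requests per active user.
  # `counts` maps a user ID to that user's request count in the window.

  # No active users: report 0.0 instead of dividing by zero.
  def average_per_active_user(counts) when map_size(counts) == 0, do: 0.0

  def average_per_active_user(counts) when is_map(counts) do
    total = counts |> Map.values() |> Enum.sum()
    total / map_size(counts)
  end
end

# Three active users with 10, 20 and 30 requests average 20.0.
MyApp.RequestAverages.average_per_active_user(%{"u1" => 10, "u2" => 20, "u3" => 30})
```

The resulting number could then be published as a single gauge, for example from a PromEx polling plugin using a last_value metric, so Prometheus sees one time series per route regardless of how many users exist.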