deadtrickster / prometheus.erl

Prometheus.io client in Erlang
MIT License

Floats or integers? #151

Open eproxus opened 1 year ago

eproxus commented 1 year ago

I'm super confused regarding float vs integer values. The Prometheus documentation itself recommends using the base unit, seconds, for durations (not milliseconds, for example).

When recording duration histogram values in seconds I see very strange values in Prometheus, even though the data I log is quite sensible (e.g. a request duration of 1.371114 seconds shows up as 1e-09 in the metric http_client_total_duration_seconds_sum).

Are floats even supported in prometheus_histogram? The documentation says:

"Raises {invalid_value, Value, Message} if Value isn't an integer."

Histograms seem to accept floats anyway, but I'm not sure they do the right thing with them.

eproxus commented 1 year ago

Ok, this was a rabbit hole. It all boils down to this comment in a code example (!) in the prometheus_histogram module documentation:

%% Time must be in native units, otherwise duration_unit must be false

It turns out that the unit of observed values should not match the histogram bucket unit! Instead it should be "native" (which I can only assume is the Erlang native time unit, but it is not clear).

So, there is some magic that figures out the unit of the metric itself (I guess it checks if the name ends with _seconds etc.) and stores this. Observations then have to be made in native units, which the library converts into the actual unit when formatting.
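To make the unit mismatch concrete, here is a minimal sketch using only standard erlang BIFs, so nothing in it is the library's own API. The module name native_units and its three functions are hypothetical helpers written for this example. It assumes "native" means the Erlang native time unit, which is the same guess as above.

```erlang
-module(native_units).
-export([seconds_to_native/1, native_to_seconds/1, timed/1]).

%% Number of native time units in one second on this runtime
%% (commonly 1000000000, i.e. nanoseconds, but not guaranteed).
units_per_second() ->
    erlang:convert_time_unit(1, second, native).

%% Convert a duration in (possibly fractional) seconds into an
%% integer number of native time units.
seconds_to_native(Seconds) when is_number(Seconds) ->
    round(Seconds * units_per_second()).

%% Convert a duration in native time units back to float seconds.
native_to_seconds(Native) when is_number(Native) ->
    Native / units_per_second().

%% Run Fun and return {Result, ElapsedNative}, where ElapsedNative is
%% an integer duration in native units taken from the monotonic clock.
timed(Fun) when is_function(Fun, 0) ->
    Start = erlang:monotonic_time(),
    Result = Fun(),
    {Result, erlang:monotonic_time() - Start}.
```

If the native unit is nanoseconds, native_units:native_to_seconds(1.371114) is roughly 1.37e-9. That would be consistent with the 1e-09 I saw when passing seconds directly, although I have not confirmed that this is exactly what the library computes.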

This makes sense considering that ets:update_counter/3 and ets:update_counter/4 are used under the hood, and those only accept integer increments, but this needs to be documented much more prominently in my opinion.
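For reference, a small sketch of the ets behaviour I mean. It uses only the standard ets module; counter_demo and the table layout are made up for illustration and are not how the library stores its data.

```erlang
-module(counter_demo).
-export([run/0]).

%% Shows that ets:update_counter/3 accepts an integer increment but
%% rejects a float increment with a badarg error.
run() ->
    Table = ets:new(demo, [set]),
    true = ets:insert(Table, {sum, 0}),
    %% An integer increment (e.g. a duration in native units) works
    %% and returns the new counter value.
    IntResult = ets:update_counter(Table, sum, 1371114000),
    %% A float increment (e.g. a duration in seconds) is rejected.
    FloatResult =
        try
            ets:update_counter(Table, sum, 1.371114)
        catch
            error:badarg -> badarg
        end,
    true = ets:delete(Table),
    {IntResult, FloatResult}.
```

So integer observations in native units fit the counter-based storage naturally, while float seconds would need some other code path.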