Open bboreham opened 1 year ago
Hm, I don't have a strong opinion, other than making sure that when we adjust the buckets we don't account observations with value > 24 && value <= 25 to wrong bucket (:
The bucket coming from the runtime has an upper exclusive boundary of 25 (or, equivalently, a lower inclusive boundary of 25). An observation at value 25 will therefore end up one bucket "up" from the bucket currently labeled as le="24.999999999999996"
. That won't change by labeling it le="24"
. @bboreham's argument above is that the runtime will never observe fractional values of bytes, so observations of e.g. 24.5 will not happen, and changing the label from le="24.999999999999996"
to le="24"
won't change the count value in the bucket.
Fair 👍🏽
This would be a breaking change for anyone relying on the exact values, but overall a benefit.
I'm curious how we handle such situations... With no data other than my experience using Prometheus so far, histograms are used mostly as just another counter and we often use histogram_quantile(X, sum(rate(...
.
I've seen alerts built on top of a particular bucket for things like HTTP request durations, but never for go metrics.
Would a change like this require client_golang v2?
I've seen alerts built on top of a particular bucket for things like HTTP request durations, but never for go metrics.
Actually, with this same line of thought, does having a bucket called 24.9999 cause any other problems besides not being good to read?
Making it v2 would mean you can link both versions of the library, so we would also need to rename the metrics so both sets can be collected.
My proposal is we don’t. Just change it and assume nobody is affected.
any other problem
It’s much bigger, which has a cost in storage, in remote-write, etc.
I'm in favor, making sure we do some good communication around this change
I looked at
go_gc_heap_allocs_by_size_bytes_bucket
, and found it strange that the buckets includele="24.999999999999996"
andle="64.99999999999999"
.With help from @beorn7 I believe this is because the bucket is defined as "less than 25", "less than 65", etc., modified by
math.NextAfter()
: https://github.com/prometheus/client_golang/blob/a09a1d34cbc74daa8ed70234b99467a30b020a40/prometheus/go_collector_latest.go#L555I propose we take advantage of the knowledge that fractional values do not show up in byte counts, and adjust the buckets as
le="24"
,le="64"
, ...This would be a breaking change for anyone relying on the exact values, but overall a benefit.
I'm using v1.15.1.