Closed by tetianakravchenko 1 year ago
Pinging @elastic/es-analytics-geo (Team:Analytics)
As you can see from the example, the first bucket holds the counts for values less than or equal to 0. If I understand correctly the way you decided to handle negative bucket values, you should accumulate the counts from the first two buckets so as to have
le=0.005, counts = 59997430461 + 2726772
Anyway, this shouldn't overflow because our histogram type supports double values and long counts.
If I understand correctly the way you decided to handle negative bucket values you should accumulate the counts from the first two buckets so to have
not sure why you are referring to negative bucket value handling; here is a Prometheus histogram metric:
apiserver_flowcontrol_priority_level_request_utilization_bucket{phase="waiting",priority_level="workload-low",le="0"} 5.02948066088585e+14
apiserver_flowcontrol_priority_level_request_utilization_bucket{phase="waiting",priority_level="workload-low",le="0.001"} 5.02961865361956e+14
apiserver_flowcontrol_priority_level_request_utilization_bucket{phase="waiting",priority_level="workload-low",le="0.003"} 5.02962348792884e+14
apiserver_flowcontrol_priority_level_request_utilization_bucket{phase="waiting",priority_level="workload-low",le="0.01"} 5.02962348792884e+14
apiserver_flowcontrol_priority_level_request_utilization_bucket{phase="waiting",priority_level="workload-low",le="0.03"} 5.02962348792884e+14
apiserver_flowcontrol_priority_level_request_utilization_bucket{phase="waiting",priority_level="workload-low",le="0.1"} 5.02962348792884e+14
apiserver_flowcontrol_priority_level_request_utilization_bucket{phase="waiting",priority_level="workload-low",le="0.25"} 5.02962348792884e+14
apiserver_flowcontrol_priority_level_request_utilization_bucket{phase="waiting",priority_level="workload-low",le="0.5"} 5.02962348792884e+14
apiserver_flowcontrol_priority_level_request_utilization_bucket{phase="waiting",priority_level="workload-low",le="0.75"} 5.02962348792884e+14
apiserver_flowcontrol_priority_level_request_utilization_bucket{phase="waiting",priority_level="workload-low",le="1"} 5.02962348792884e+14
apiserver_flowcontrol_priority_level_request_utilization_bucket{phase="waiting",priority_level="workload-low",le="+Inf"} 5.02962348792884e+14
apiserver_flowcontrol_priority_level_request_utilization_sum{phase="waiting",priority_level="workload-low"} 3.3128167345312536e+06
apiserver_flowcontrol_priority_level_request_utilization_count{phase="waiting",priority_level="workload-low"} 5.02962348792884e+14
there are no negative buckets.
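As a sanity check on the magnitudes above: the cumulative count of even the first (le="0") bucket is already far beyond the 32-bit int range. A quick illustrative snippet (not code from beats or Elasticsearch):

```java
public class FirstBucketCheck {
    public static void main(String[] args) {
        // Cumulative count of the le="0" bucket from the metric output above.
        long firstBucket = (long) 5.02948066088585e14; // 502948066088585
        // Integer.MAX_VALUE is 2147483647, so this single bucket
        // already exceeds what an int-typed counts field can hold.
        System.out.println(firstBucket > Integer.MAX_VALUE); // prints true
    }
}
```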
you should accumulate the counts from the first two buckets so to have le=0.005, counts= 59997430461 + 2726772
sorry, I didn't understand this. Each element of counts
is a deaccumulated value; here is a test file with some explanations of how counts are calculated - https://github.com/elastic/beats/blob/243e6c0f3450282bb73067a98a30d5c718274803/x-pack/metricbeat/module/prometheus/collector/histogram_test.go#L392-L447
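The deaccumulation the linked test file describes can be sketched as follows. This is a minimal illustration of the idea behind PromHistogramToES, not the actual beats implementation (which is Go), using the bucket numbers from this thread:

```java
import java.util.Arrays;

public class Deaccumulate {
    // Turn cumulative Prometheus bucket counts into per-bucket counts
    // by subtracting each bucket's predecessor. Note that the first
    // element is left as-is (predecessor is 0), so it can be very large.
    static long[] deaccumulate(long[] cumulative) {
        long[] counts = new long[cumulative.length];
        long previous = 0;
        for (int i = 0; i < cumulative.length; i++) {
            counts[i] = cumulative[i] - previous;
            previous = cumulative[i];
        }
        return counts;
    }

    public static void main(String[] args) {
        // 60000157233 = 59997430461 + 2726772, the two counts discussed above.
        long[] cumulative = {59997430461L, 60000157233L, 60000157233L};
        System.out.println(Arrays.toString(deaccumulate(cumulative)));
        // prints [59997430461, 2726772, 0]
    }
}
```

The point is that deaccumulation only shrinks the later buckets; the first deaccumulated count equals the raw cumulative value.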
Anyway, this shouldn't overflow because our histogram type supports double values and long counts.
so the documentation is not correct when it says
A corresponding counts array of [integer](https://www.elastic.co/guide/en/elasticsearch/reference/current/number.html) numbers, representing how many values fall into each bucket
?
Hey @kkrik-es, is there any reason not to use an unsigned_long type for the counts? The type used for the Prometheus histogram is uint64. As I understand, counts must be positive or zero anyway.
I believe we already check programmatically that count values are not negative.
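The kind of programmatic non-negativity check mentioned here might look like the following. This is a hypothetical sketch for illustration only, not the actual Elasticsearch mapper code (the method name validateCounts is invented):

```java
import java.util.List;

public class CountValidation {
    // Hypothetical check: reject any negative count before indexing.
    static void validateCounts(List<Long> counts) {
        for (long c : counts) {
            if (c < 0) {
                throw new IllegalArgumentException("counts must be >= 0, got " + c);
            }
        }
    }

    public static void main(String[] args) {
        validateCounts(List.of(59997430461L, 2726772L, 0L)); // passes silently
        try {
            validateCounts(List.of(-1L));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints counts must be >= 0, got -1
        }
    }
}
```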
Hey @kkrik-es, is there any reason not to use an unsigned_long type for the counts? The type used for the Prometheus histogram is uint64. As I understand, counts must be positive or zero anyway.
Elasticsearch is written in Java...and unfortunately Java does not have unsigned types (unless we do something special to treat signed numbers as unsigned).
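For context, the "something special" here presumably means reinterpreting the bit pattern of a signed long. Since Java 8, the standard library ships helpers on Long for exactly that (illustrative snippet; whether Elasticsearch's unsigned_long uses these particular methods internally is an assumption):

```java
public class UnsignedDemo {
    public static void main(String[] args) {
        // Java has no uint64, but a signed long's 64 bits can be
        // *interpreted* as unsigned via helper methods on Long.
        long value = -1L; // same bit pattern as uint64 max (2^64 - 1)
        System.out.println(Long.toUnsignedString(value));        // prints 18446744073709551615
        System.out.println(Long.compareUnsigned(value, 1L) > 0); // prints true
        System.out.println(Long.divideUnsigned(value, 2L));      // prints 9223372036854775807
    }
}
```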
Hey @kkrik-es, is there any reason not to use an unsigned_long type for the counts? The type used for the Prometheus histogram is uint64. As I understand, counts must be positive or zero anyway.
Elasticsearch is written in Java...and unfortunately Java does not have unsigned types (unless we do something special to treat signed numbers as unsigned).
I was referring to this page - https://www.elastic.co/guide/en/elasticsearch/reference/current/number.html#number - and as I see, unsigned_long
is listed among the supported numeric field types, so from my understanding something must already be implemented to treat signed numbers as unsigned. Anyway, it is more of a curiosity question and an attempt to align with the Prometheus histogram type.
Kindly note that the submitted fix merely addresses the shortcoming of ES histograms with regard to supporting large count values per inserted point. Other incompatibilities with Prom histograms, e.g. around the use of cumulative values and different internal implementation for percentiles, persist.
Prometheus histograms are transformed into Elasticsearch histograms using
PromHistogramToES
: the counts
array is composed by undoing the counter accumulation for each bucket. But there still might be situations where the difference between two long values doesn't fit into the Elasticsearch counts type - int - which causes such an error:
and the document will be dropped.
Full message:
The problem is with the first value (I've checked other dropped events - they show the same behavior).
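That the first value is always the problem follows from the deaccumulation: the first per-bucket count equals the raw cumulative value, which for these metrics is around 5e14. Assuming the count is narrowed to a 32-bit int on the Elasticsearch side, the failure can be reproduced in isolation (illustrative snippet; Math.toIntExact merely stands in for whatever narrowing happens during mapping):

```java
public class CountsOverflow {
    public static void main(String[] args) {
        // First deaccumulated bucket == raw cumulative count of the
        // le="0" bucket, ~5.03e14 -- far beyond Integer.MAX_VALUE.
        long firstCount = 502948066088585L;
        try {
            // Math.toIntExact throws if the value does not fit in an int.
            int narrowed = Math.toIntExact(firstCount);
            System.out.println(narrowed);
        } catch (ArithmeticException e) {
            System.out.println("overflow: " + e.getMessage());
        }
    }
}
```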