elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Histogram field type is not compatible with the prometheus histograms #99820

Closed tetianakravchenko closed 1 year ago

tetianakravchenko commented 1 year ago

Prometheus histograms are transformed to the Elasticsearch histograms using PromHistogramToES

The counts array is composed by undoing the counter accumulation for each bucket.

However, there can still be situations where the difference between two long values doesn't fit into Elasticsearch's counts type (int), which causes an error such as:

{"type":"x_content_parse_exception","reason":"[1:2070] Numeric value (59997430461) out of range of int (-2147483648 - 2147483647)\n

and the document will be dropped.
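The failure mode can be sketched in Go (the language of the beats collector). The helper name below is hypothetical; the 32-bit bound comes from the error message above:

```go
package main

import (
	"fmt"
	"math"
)

// fitsInESCount reports whether a deaccumulated bucket count fits in the
// 32-bit int range that Elasticsearch's histogram field enforces when
// parsing the counts array.
func fitsInESCount(count int64) bool {
	return count >= math.MinInt32 && count <= math.MaxInt32
}

func main() {
	// First counts entry from the dropped event above: 59997430461 is a
	// perfectly valid int64/uint64 bucket delta, but it exceeds
	// math.MaxInt32 (2147483647), so indexing fails.
	fmt.Println(fitsInESCount(59997430461)) // false: the document is dropped
}
```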

Full message:

Cannot index event publisher.Event{Content:beat.Event{Timestamp:time.Date(2023, time.September, 22, 12, 19, 26, 601000000, time.Local), Meta:{"input_id":"prometheus/metrics-prometheus-635762b7-7446-4dbd-8b2e-0219e661c290","raw_index":"metrics-prometheus.remote_write-default","stream_id":"prometheus/metrics-prometheus.remote_write-635762b7-7446-4dbd-8b2e-0219e661c290"}, Fields:{"agent":{"ephemeral_id":"839cd17d-903e-4b98-b2d9-24f85a96705a","id":"24474773-c1c5-4dc7-ba53-bb66e56726c6","name":"gke-tetianakravchenko-pr-default-pool-0e95df92-btpw","type":"metricbeat","version":"8.9.2"},"cloud":{"account":{"id":"elastic-observability"},"availability_zone":"us-central1-c","instance":{"id":"7106047343117859434","name":"gke-tetianakravchenko-pr-default-pool-0e95df92-btpw"},"machine":{"type":"e2-medium"},"project":{"id":"elastic-observability"},"provider":"gcp","region":"us-central1","service":{"name":"GCE"}},"data_stream":{"dataset":"prometheus.remote_write","namespace":"default","type":"metrics"},"ecs":{"version":"8.0.0"},"elastic_agent":{"id":"24474773-c1c5-4dc7-ba53-bb66e56726c6","snapshot":false,"version":"8.9.2"},"event":{"dataset":"prometheus.remote_write","module":"prometheus"},"host":{"architecture":"x86_64","containerized":false,"hostname":"gke-tetianakravchenko-pr-default-pool-0e95df92-btpw","id":"7cf9ef1a908c8254d6aebf6c7bba6827","ip":["10.128.0.28","fe80::4001:aff:fe80:1c","169.254.123.1","10.104.2.1","fe80::4c1e:a0ff:fe4b:cd4e","10.104.2.1","fe80::982f:49ff:fe2d:c8f","10.104.2.1","fe80::4ef:e7ff:fe90:222e","10.104.2.1","fe80::1003:60ff:feee:1e87","10.104.2.1","fe80::907e:adff:fe38:8d5c","10.104.2.1","fe80::1451:b3ff:fee9:eb00","10.104.2.1","fe80::648c:17ff:fe38:988"],"mac":["02-42-30-42-7A-BB","06-EF-E7-90-22-2E","12-03-60-EE-1E-87","16-51-B3-E9-EB-00","42-01-0A-80-00-1C","4E-1E-A0-4B-CD-4E","66-8C-17-38-09-88","92-7E-AD-38-8D-5C","9A-2F-49-2D-0C-8F"],"name":"gke-tetianakravchenko-pr-default-pool-0e95df92-btpw","os":{"codename":"focal","family":"debian","kernel":
"5.15.109+","name":"Ubuntu","platform":"ubuntu","type":"linux","version":"20.04.6 LTS (Focal Fossa)"}},"metricset":{"name":"remote_write"},"orchestrator":{"cluster":{"name":"tetianakravchenko-prometheus-test","url":"https://10.128.0.18"}},"prometheus":{"apiserver_flowcontrol_priority_level_request_utilization":{"histogram":{"counts":[59997430461,2726772,0,0,0,0,0,0,0,0,0],"values":[0,0.0005,0.002,0.006500000000000001,0.019999999999999997,0.065,0.175,0.375,0.625,0.875,1.25]}},"apiserver_flowcontrol_priority_level_request_utilization_count":{"counter":257267325323388},"apiserver_flowcontrol_priority_level_request_utilization_sum":{"counter":3146017.521250039},"labels":{"instance":"10.128.0.18:443","job":"kubernetes-apiservers","phase":"waiting","priority_level":"workload-high"}},"service":{"type":"prometheus"}}, Private:interface {}(nil), TimeSeries:true}, Flags:0x0, Cache:publisher.EventCache{m:mapstr.M(nil)}} (status=400): {"type":"document_parsing_exception","reason":"[1:2059] failed to parse field [prometheus.apiserver_flowcontrol_priority_level_request_utilization.histogram] of type [histogram]","caused_by":{"type":"x_content_parse_exception","reason":"[1:2070] Numeric value (59997430461) out of range of int (-2147483648 - 2147483647)\n at [Source: (byte[])\"{\"agent\":{\"name\":\"gke-tetianakravchenko-pr-default-pool-0e95df92-btpw\",\"id\":\"24474773-c1c5-4dc7-ba53-bb66e56726c6\",\"type\":\"metricbeat\",\"ephemeral_id\":\"839cd17d-903e-4b98-b2d9-24f85a96705a\",\"version\":\"8.9.2\"},\"elastic_agent\":{\"id\":\"24474773-c1c5-4dc7-ba53-bb66e56726c6\",\"version\":\"8.9.2\",\"snapshot\":false},\"cloud\":{\"availability_zone\":\"us-central1-c\",\"instance\":{\"name\":\"gke-tetianakravchenko-pr-default-pool-0e95df92-btpw\",\"id\":\"7106047343117859434\"},\"provider\":\"gcp\",\"service\":{\"name\":\"GCE\"},\"machin\"[truncated 2100 bytes]; line: 1, column: 2070]","caused_by":{"type":"input_coercion_exception","reason":"Numeric value (59997430461) out of range of int 
(-2147483648 - 2147483647)\n at [Source: (byte[])\"{\"agent\":{\"name\":\"gke-tetianakravchenko-pr-default-pool-0e95df92-btpw\",\"id\":\"24474773-c1c5-4dc7-ba53-bb66e56726c6\",\"type\":\"metricbeat\",\"ephemeral_id\":\"839cd17d-903e-4b98-b2d9-24f85a96705a\",\"version\":\"8.9.2\"},\"elastic_agent\":{\"id\":\"24474773-c1c5-4dc7-ba53-bb66e56726c6\",\"version\":\"8.9.2\",\"snapshot\":false},\"cloud\":{\"availability_zone\":\"us-central1-c\",\"instance\":{\"name\":\"gke-tetianakravchenko-pr-default-pool-0e95df92-btpw\",\"id\":\"7106047343117859434\"},\"provider\":\"gcp\",\"service\":{\"name\":\"GCE\"},\"machin\"[truncated 2100 bytes]; line: 1, column: 2070]"}}}, dropping event!
"prometheus":{
    "apiserver_flowcontrol_priority_level_request_utilization":{
        "histogram":{
            "counts":[
                59997430461,
                2726772,
                0,
                0,
                0,
                0,
                0,
                0,
                0,
                0,
                0
            ],
            "values":[
                0,
                0.0005,
                0.002,
                0.006500000000000001,
                0.019999999999999997,
                0.065,
                0.175,
                0.375,
                0.625,
                0.875,
                1.25
            ]
        }
    }
}

The problem is with the first value (I've checked other dropped events; they show the same behavior).

elasticsearchmachine commented 1 year ago

Pinging @elastic/es-analytics-geo (Team:Analytics)

salvatore-campagna commented 1 year ago

As you can see from the example, the first bucket holds the counts for values less than or equal to 0. If I understand correctly the way you decided to handle negative bucket values, you should accumulate the counts from the first two buckets, so as to have

le=0.005, counts= 59997430461 + 2726772

Anyway, this shouldn't overflow because our histogram type supports double values and long counts.

tetianakravchenko commented 1 year ago

If I understand correctly the way you decided to handle negative bucket values, you should accumulate the counts from the first two buckets, so as to have

I'm not sure why you are referring to negative bucket value handling; here is a Prometheus histogram metric:

apiserver_flowcontrol_priority_level_request_utilization_bucket{phase="waiting",priority_level="workload-low",le="0"} 5.02948066088585e+14
apiserver_flowcontrol_priority_level_request_utilization_bucket{phase="waiting",priority_level="workload-low",le="0.001"} 5.02961865361956e+14
apiserver_flowcontrol_priority_level_request_utilization_bucket{phase="waiting",priority_level="workload-low",le="0.003"} 5.02962348792884e+14
apiserver_flowcontrol_priority_level_request_utilization_bucket{phase="waiting",priority_level="workload-low",le="0.01"} 5.02962348792884e+14
apiserver_flowcontrol_priority_level_request_utilization_bucket{phase="waiting",priority_level="workload-low",le="0.03"} 5.02962348792884e+14
apiserver_flowcontrol_priority_level_request_utilization_bucket{phase="waiting",priority_level="workload-low",le="0.1"} 5.02962348792884e+14
apiserver_flowcontrol_priority_level_request_utilization_bucket{phase="waiting",priority_level="workload-low",le="0.25"} 5.02962348792884e+14
apiserver_flowcontrol_priority_level_request_utilization_bucket{phase="waiting",priority_level="workload-low",le="0.5"} 5.02962348792884e+14
apiserver_flowcontrol_priority_level_request_utilization_bucket{phase="waiting",priority_level="workload-low",le="0.75"} 5.02962348792884e+14
apiserver_flowcontrol_priority_level_request_utilization_bucket{phase="waiting",priority_level="workload-low",le="1"} 5.02962348792884e+14
apiserver_flowcontrol_priority_level_request_utilization_bucket{phase="waiting",priority_level="workload-low",le="+Inf"} 5.02962348792884e+14
apiserver_flowcontrol_priority_level_request_utilization_sum{phase="waiting",priority_level="workload-low"} 3.3128167345312536e+06
apiserver_flowcontrol_priority_level_request_utilization_count{phase="waiting",priority_level="workload-low"} 5.02962348792884e+14

There are no negative buckets.

you should accumulate the counts from the first two buckets so to have le=0.005, counts= 59997430461 + 2726772

Sorry, I didn't understand this. Each element of counts is a deaccumulated value; here is a test file with some explanation of how counts are calculated: https://github.com/elastic/beats/blob/243e6c0f3450282bb73067a98a30d5c718274803/x-pack/metricbeat/module/prometheus/collector/histogram_test.go#L392-L447
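The deaccumulation can be sketched in Go as follows (a hypothetical simplified version of what the collector does, not the actual PromHistogramToES code), using the first three cumulative le buckets from the metric sample above:

```go
package main

import "fmt"

// deaccumulate undoes Prometheus's cumulative bucket counters:
// counts[0] = bucket[0], and counts[i] = bucket[i] - bucket[i-1] for i > 0.
func deaccumulate(cumulative []uint64) []uint64 {
	counts := make([]uint64, len(cumulative))
	prev := uint64(0)
	for i, c := range cumulative {
		counts[i] = c - prev
		prev = c
	}
	return counts
}

func main() {
	// Cumulative buckets for le="0", le="0.001", le="0.003" from the
	// sample above (5.02948066088585e+14, ...), written as integers.
	buckets := []uint64{502948066088585, 502961865361956, 502962348792884}

	// Both the first count (the le="0" bucket itself) and the first
	// delta (13799273371) exceed the int32 range, even though the
	// cumulative inputs are monotonically increasing and non-negative.
	fmt.Println(deaccumulate(buckets))
}
```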

Anyway, this shouldn't overflow because our histogram type supports double values and long counts.

So the documentation is not correct in saying "A corresponding counts array of [integer](https://www.elastic.co/guide/en/elasticsearch/reference/current/number.html) numbers, representing how many values fall into each bucket"?

kkrik-es commented 1 year ago

It seems like count is an int?

https://github.com/elastic/elasticsearch/blob/main/server/src/main/java/org/elasticsearch/index/fielddata/HistogramValue.java#L35

tetianakravchenko commented 1 year ago

Hey @kkrik-es

Is there any reason not to use an unsigned_long type for the counts? The type used for the Prometheus histogram is uint64. As I understand it, counts must be positive or zero anyway.

kkrik-es commented 1 year ago

I believe we already check programmatically that count values are not negative.

salvatore-campagna commented 1 year ago

Hey @kkrik-es

Is there any reason not to use an unsigned_long type for the counts? The type used for the Prometheus histogram is uint64. As I understand it, counts must be positive or zero anyway.

Elasticsearch is written in Java...and unfortunately Java does not have unsigned types (unless we do something special to treat signed numbers as unsigned).
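For illustration only, here is a Go sketch of that "something special": keep the uint64 bit pattern in a signed 64-bit slot and reinterpret it on the way out. Whether Elasticsearch's unsigned_long does exactly this internally is not stated in this thread:

```go
package main

import "fmt"

// asSignedBits reinterprets a uint64 as a signed 64-bit value without
// changing the bit pattern; uint64(asSignedBits(u)) round-trips exactly.
func asSignedBits(u uint64) int64 {
	return int64(u)
}

func main() {
	u := uint64(18446744073709551615) // math.MaxUint64: doesn't fit in int64
	s := asSignedBits(u)
	// Prints the signed interpretation (-1) and confirms the round trip.
	fmt.Println(s, uint64(s) == u)
}
```

Comparisons and aggregations over such values then need unsigned-aware logic, which is where the extra work in a signed-only language comes from.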

tetianakravchenko commented 1 year ago

Hey @kkrik-es

Is there any reason not to use an unsigned_long type for the counts? The type used for the Prometheus histogram is uint64. As I understand it, counts must be positive or zero anyway.

Elasticsearch is written in Java...and unfortunately Java does not have unsigned types (unless we do something special to treat signed numbers as unsigned).

I was referring to this page - https://www.elastic.co/guide/en/elasticsearch/reference/current/number.html#number - and as I see, unsigned_long is listed among the supported numeric field types, so my understanding is that something is already implemented to treat signed numbers as unsigned. Anyway, it is more of a curiosity question and an attempt to align with the Prometheus histogram type.

kkrik-es commented 1 year ago

Kindly note that the submitted fix merely addresses the shortcoming of ES histograms with regard to supporting large count values per inserted point. Other incompatibilities with Prometheus histograms, e.g. around the use of cumulative values and different internal implementations for percentiles, persist.