jacksontj / promxy

An aggregating proxy to enable HA prometheus
MIT License
1.14k stars 128 forks

Native histogram support #637

Open callumj opened 7 months ago

callumj commented 7 months ago

I have some native histograms that I cannot seem to read in Promxy.

Running this query returns no results:

histogram_quantile(0.99, sum(rate(zuul_filters_pre_latency_ms{namespace="foo"}[1m])))

I can see the query that gets sent to the backend Prometheus (I have remote read disabled) is this:

sum(rate(zuul_filters_pre_latency_ms{namespace="foo"}[1m]))

Meaning that Promxy is still going to need to support native histograms internally to be able to compute the histogram_quantile function. Is there a way to avoid this and let the server do it all?

jacksontj commented 7 months ago

From my local testing (and previous usage) histogram_quantile seems to work properly. I believe the issue is with your query.

So to walk through this using a publicly available endpoint, let's look at a basic example.

In here we see a histogram_quantile over a rate -- which works as expected (we got a value).

If we look at a sum of that same rate (mirroring your query) we get an error saying PromQL warning: bucket label "le" is missing or has a malformed value of "" for metric name "" (1:26) -- this is because the histogram_quantile function expects an le label (which is the bucket that it covers).

So if we adjust the sum to include by (le), this works as well.
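
For reference, the three query shapes walked through above can be sketched as follows. The metric name `http_request_duration_seconds_bucket` is a hypothetical classic histogram chosen for illustration; it is not taken from the linked examples.

```promql
# Works: each series still carries its le (bucket upper bound) label.
histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[1m]))

# Fails for a classic histogram: a bare sum() aggregates away every label,
# including le, so histogram_quantile has no bucket boundaries to use.
histogram_quantile(0.99, sum(rate(http_request_duration_seconds_bucket[1m])))

# Works: the sum keeps le, so the buckets survive the aggregation.
histogram_quantile(0.99, sum by (le) (rate(http_request_duration_seconds_bucket[1m])))
```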

So I believe your issue is query related not promxy related; but if that is not the case please provide some more context (ideally a query that I can reproduce locally or against a public endpoint).

jacksontj commented 6 months ago

It's been roughly a month with no response, so I'm going to assume the issue was resolved -- if not, please feel free to re-open or create a new issue :)

callumj commented 1 month ago

> If we look at a sum of that same rate (mirroring your query) we get an error saying PromQL warning: bucket label "le" is missing or has a malformed value of "" for metric name "" (1:26) -- this is because the histogram_quantile function expects an le label (which is the bucket that it covers).

Native histograms no longer have le labels in their metrics; they have an entirely new data format (that requires a later Prometheus engine).

It also introduces histogram_count and histogram_avg.
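
As a rough sketch of what native histogram queries look like, assuming the metric from the original report is stored as a native histogram and the backend Prometheus is recent enough to provide these functions:

```promql
# Native histogram: one series per label set, with no _bucket suffix and
# no le label, so a plain sum() is a valid input to histogram_quantile.
histogram_quantile(0.99, sum(rate(zuul_filters_pre_latency_ms{namespace="foo"}[1m])))

# Per-second rate of observations over the window.
histogram_count(sum(rate(zuul_filters_pre_latency_ms{namespace="foo"}[1m])))

# Average observed value over the window.
histogram_avg(sum(rate(zuul_filters_pre_latency_ms{namespace="foo"}[1m])))
```

This is why the "missing le label" warning does not apply here: the bucket layout travels inside each histogram sample rather than in labels.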

jacksontj commented 3 weeks ago

OIC (ran into this on another issue) -- I missed that there is a new thing called native histograms (I didn't distinguish that as new). As of today these aren't supported in promxy (looks like they are [experimental for now in upstream prom](https://prometheus.io/docs/concepts/metric_types/#histogram)). I'll leave this open as a feature request; on the next rebase I can look into adding support -- depending on the diff I can either add it or wait for it to "graduate" upstream.