influxdata / flux

Flux is a lightweight scripting language for querying databases (like InfluxDB) and working with data. It's part of InfluxDB 1.7 and 2.0, but can be run independently of those.
https://influxdata.com
MIT License

The histogramQuantile function returns an incorrect value when there are no observations in the histogram #5415

Closed bw972 closed 1 year ago

bw972 commented 1 year ago

Description

The Flux histogramQuantile() function returns an incorrect result for an empty histogram. Specifically, when it calculates a quantile for a histogram that has received no new observations (so the total count of observations across the buckets is zero), it returns a value instead of the expected empty result.

Steps to Reproduce

  1. Have a Prometheus histogram in InfluxDB (v2 mapping) that is not currently being incremented because there is no traffic.
  2. Plot a quantile of the histogram using the histogramQuantile() function.
  3. Notice that the plot shows the value of the highest bucket boundary in the histogram, even though there have been no observations in this period.

Expected Behaviour

The histogramQuantile() function should return nothing if the total count of the buckets is zero.

Actual Behaviour

The histogramQuantile() function returns the upper bound of the highest bucket. This happens because rankIdx always ends up pointing at the highest bucket: when rank := t.spec.Quantile * totalCount is zero and every b.count is also zero, the check rank >= b.count is true for every bucket. The switch on rankIdx then assumes that the quantile lies above the highest upper bound and returns that bound as the result.
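To make the failure mode concrete, here is a minimal Go sketch of the rank search described above. It is not the code from histogram_quantile.go: the bucket type and the findRankIdx helper are hypothetical names, and only the rank >= b.count comparison is taken from the description.

```go
package main

import "fmt"

// bucket is a hypothetical stand-in for one cumulative histogram bucket.
type bucket struct {
	upperBound float64
	count      float64 // cumulative count of observations <= upperBound
}

// findRankIdx (hypothetical helper) returns the index of the last bucket
// whose cumulative count does not exceed the rank, or -1 if there is none.
// It assumes cdf is non-empty.
func findRankIdx(cdf []bucket, quantile float64) int {
	totalCount := cdf[len(cdf)-1].count
	rank := quantile * totalCount
	rankIdx := -1
	for i, b := range cdf {
		// With totalCount == 0, rank is 0 and every count is 0,
		// so this comparison is true for every bucket.
		if rank >= b.count {
			rankIdx = i
		}
	}
	return rankIdx
}

func main() {
	empty := []bucket{{0.1, 0}, {0.5, 0}, {1, 0}}
	idx := findRankIdx(empty, 0.99)
	// The index lands on the last bucket, so its upper bound is reported.
	fmt.Println(idx, empty[idx].upperBound) // prints "2 1"
}
```

With a populated histogram such as {0.1, 5}, {0.5, 8}, {1, 10} and a quantile of 0.5, the same loop stops at index 0, so the "last bucket" outcome is specific to the all-zero case.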

Possible Solution

Add a return condition to the computeQuantile() function in histogram_quantile.go that checks if the totalCount is zero.

if totalCount == 0 {
    return 0, false, nil
}

wolffcm commented 1 year ago

Thanks for filing this. Rather than make it produce a zero, I made it produce a null value in this PR, since we can't compute a quantile at all in the absence of any observations: https://github.com/influxdata/flux/pull/5419
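As a rough illustration of the "no value" idea, the sketch below assumes that the boolean in a (float64, bool, error) return is an "ok" flag that the caller maps to null when it is false. That reading is an assumption, and the code is not taken from the PR: histBucket and quantileOrNull are hypothetical names, and the interpolation is deliberately simplified (the first bucket's lower bound is taken to be 0).

```go
package main

import (
	"errors"
	"fmt"
)

// histBucket is a hypothetical cumulative histogram bucket.
type histBucket struct {
	upperBound float64
	count      float64 // cumulative count of observations <= upperBound
}

// quantileOrNull (hypothetical) returns (value, ok, err).
// ok == false with a nil error means "no value", i.e. null.
func quantileOrNull(cdf []histBucket, q float64) (float64, bool, error) {
	if len(cdf) == 0 {
		return 0, false, errors.New("histogram is empty")
	}
	totalCount := cdf[len(cdf)-1].count
	if totalCount == 0 {
		// No observations: a quantile is undefined, so report "no value".
		return 0, false, nil
	}
	rank := q * totalCount
	lowerBound, lowerCount := 0.0, 0.0
	for _, b := range cdf {
		if b.count >= rank {
			if b.count == lowerCount {
				return b.upperBound, true, nil
			}
			// Simplified linear interpolation inside the bucket.
			frac := (rank - lowerCount) / (b.count - lowerCount)
			return lowerBound + (b.upperBound-lowerBound)*frac, true, nil
		}
		lowerBound, lowerCount = b.upperBound, b.count
	}
	return cdf[len(cdf)-1].upperBound, true, nil
}

func main() {
	v, ok, err := quantileOrNull([]histBucket{{1, 0}, {2, 0}}, 0.99)
	fmt.Println(v, ok, err) // prints "0 false <nil>"
}
```

A caller would check ok before using the value, and emit null rather than the numeric zero when ok is false.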