Open bwplotka opened 6 months ago
I tested the equivalent in the standard math/big
library and still can't get it to round up:
big.ParseFloat("0.00500000000000000006", 10, 53, big.ToNearestAway) // 0.005
That value cannot be expressed as a float64 (as most values really can't).
What's fascinating is that the Golang client library does not validate the buckets, and other clients probably don't either. The restrictions are by contract in the options field comments. I wonder if these buckets are being created programmatically because I would be shocked if someone created buckets with that type of precision manually!
Further, how does Prometheus itself deal with these? If Prometheus simply stores these as string-typed labels, you would still be able to query from PromQL.
The Golang client uses [sort.SearchFloat64s](https://pkg.go.dev/sort#SearchFloat64s)
to pick the bucket, which is just binary search. This means that whether it goes into "0.00500000000000000006" or "0.005" will be random for different bucket sets but always consistent, meaning that one of these will be used and one will not! We may be able to take advantage of the fact that one is never used when sending data to CMP and just ignore the unused one.
But, easier said than done.
Or maybe a simpler solution since the user can't send this metric anyway is to just ignore it as invalid and log a warning.
A big usability improvement would be know which metric is causing the problem.
FWIW, my org's software had a bug where it was emitting the same metrics twice and it triggered these errors. Our collectors in GKE are using prometheus:v2.45.3-gmp.5-gke.0 (sha256:aa631b2f63f54f350887000fb9f1fc2cf1a4820652140e458751f27858bdf081)
We got one Cx getting error from the CMP:
One or more TimeSeries could not be written: Field timeSeries[0].points[0].distributionValue had an invalid value: Distribution |explicit_buckets.bounds| entry 1 has a value of 0.005 which is less than the value of entry 0 which is 0.005
. This indicates that either:A. Client exposes classic histogram series with two series for the same
le
value. This is impossible as duplicated series will be handled by scraping logic. B. Our conversion from string le labels to float buckets ended up as the same0.005
buckets. This can totally happen on extreme bucketle
values like"0.00500000000000000006"
and"0.005"
in subsequent classic series. See repro. Furthermore check does not validate for "0" diffs, only negative diffs on it. C. Rounding error no CMP side. Unlikely as CMP accepts double value as well.Action Items