onflow / flow-evm-gateway

FlowEVM Gateway implements an Ethereum-equivalent JSON-RPC API for EVM clients to use
https://developers.flow.com/evm/about
Apache License 2.0
Histogram metrics emitted for api_request_duration_seconds are not correct #441

Closed: franklywatson closed this issue 1 month ago

franklywatson commented 1 month ago

Problem

One or more things are off in the way the api_request_duration metric summary is calculated or handled. The resulting histogram values block the creation of monitoring graphs for this metric.

The first issue is a possible rounding problem in the metric value calculation. Given the prometheus.DefBuckets definition that we apply to the metric, we would expect a datapoint with a value of 0.052 seconds to carry an le tag of 0.01. In the first screenshot the metrics have various smaller le tags, but all of them have a value of 0, suggesting that values <1 are being rounded to 0.

I haven't quite figured out the next issue, which relates to the second screenshot. The two metrics shown there have values of 39 and 18, so they should have been bucketed under the 100 le tag. It isn't clear to me why they show up as Infinity instead. The docs state that Infinity is reserved for values outside the specified bucket ranges (so it should apply only to values >100), but for some reason that's not happening.

Examples

Screenshot 2024-08-15 at 1 19 55 PM
Screenshot 2024-08-15 at 1 28 07 PM
metrics.json

Acceptance Criteria

Context

Reference: https://medium.com/mercari-engineering/have-you-been-using-histogram-metrics-correctly-730c9547a7a9