pragmaticivan / nestjs-otel

OpenTelemetry (Tracing + Metrics) module for Nest framework (node.js) 🔭
Apache License 2.0
535 stars 49 forks source link

Latency buckets are not matching the expected time unit in milliseconds #449

Closed djluis closed 1 year ago

djluis commented 1 year ago

The metric that stores the http latency http.server.duration is storing the units in seconds, because of the division by 1000 as shown by the following code: https://github.com/pragmaticivan/nestjs-otel/blob/b599f90ae512010297255927615c04f416598949/src/middleware/api-metrics.middleware.ts#L121

So for example: 50 ms / 1000 = 0.05 150 ms / 1000 = 0.15 500 ms / 1000 = 0.5

We are expecting the bucket size too be distributed in seconds too, however the buckets are distributed in milliseconds as shown by the open telemetry documentation, and that means every value under 5 seconds will be stored inside the first bucket due to mismatch of unit type between seconds and milliseconds.

The OpenTelemetry agents have a default set of buckets defined. For e.g. the Java agent uses these buckets by default: 0, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1_000, 2_500, 5_000, 7_500, 10_000, ∞. These values correspond to measurements in milliseconds when used for latency.

Can you please advice an workaround for this?

pragmaticivan commented 1 year ago

Hey @djluis it used to be seconds before, and I think that's a bug wit the latest refactoring to comply with open telemetry specs

pragmaticivan commented 1 year ago

responseTime((req, res, time) => {

^ time already comes in milliseconds.

djluis commented 1 year ago

responseTime((req, res, time) => {

^ time already comes in milliseconds.

Thanks a lot :) for the fix (that was fast) 👍 👍 desejo de um excelente dia.

pragmaticivan commented 1 year ago

Released :D https://github.com/pragmaticivan/nestjs-otel/releases/tag/v5.1.5