Open JamesNK opened 1 year ago
related: #316
I am suggesting [0, 0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1, 2, 5, 10, 30, 60, 120, 300] in https://github.com/open-telemetry/opentelemetry-dotnet/issues/4922
It uses an approx 2x escalation for each bucket with alignment to minutes at the end.
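As a quick sanity check on the "approx 2x" claim, here is a minimal Python sketch that computes the step ratio between consecutive boundaries of the proposed list. The helper name `step_ratios` is made up for illustration, and the zero boundary is skipped because a ratio against zero is undefined.

```python
# Boundaries proposed in the comment above, in seconds.
BOUNDARIES = [0, 0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1, 2, 5, 10, 30, 60, 120, 300]


def step_ratios(boundaries):
    """Return the ratio between each pair of consecutive non-zero boundaries."""
    positive = [b for b in boundaries if b > 0]
    # Round to hide floating-point noise such as 2.5000000000000004.
    return [round(hi / lo, 2) for lo, hi in zip(positive, positive[1:])]


ratios = step_ratios(BOUNDARIES)
print(ratios)
print("smallest step:", min(ratios), "largest step:", max(ratios))
```

Every step lands between 2x and 3x, with the single 3x step at 10 to 30 seconds, which is where the list switches over to minute-aligned values.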
It doesn't go up to hours, but the main benefit of longer connections is that you don't pay the connection setup cost on each request. Once the connection duration is on the order of minutes, the incremental benefit of longer connections rapidly diminishes. This should be a good balance.
Up to 300 seconds is much better than 10 seconds. I think there are situations where a connection could live quite a long time. For example, web sockets in the browser (e.g. SignalR) and server-to-server scenarios where a client is reused for a long time.
I removed some of the smaller values and added capacity for up to an hour.
Before: [0, 0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1, 2, 5, 10, 30, 60, 120, 300]
After: [0, 0.01, 0.05, 0.1, 0.5, 1, 5, 10, 30, 60, 120, 300, 600, 1200, 3600]
TBH I'm not sure exactly where most connection lifetimes end up. I would be ok with tracking up to 300 seconds and then adjusting if needed.
Update:
ASP.NET Core Kestrel is using: [0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1, 2, 5, 10, 30, 60, 120, 300]
Is there any update on consideration of this? It matters particularly when these get converted to Prometheus histograms, which do not track min and max. As far as I can tell, we can never query the metrics to show a request duration longer than 10 seconds.
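To illustrate the concern, here is a rough Python sketch of a bucket-based quantile estimate. It is a simplification, not Prometheus's actual implementation. It follows the documented behaviour of `histogram_quantile`, which returns the upper bound of the second highest bucket when the quantile falls in the highest (+Inf) bucket. The function names and the short boundary list are hypothetical; only the 10 second top boundary comes from this thread.

```python
import bisect

# Hypothetical, shortened boundary list (seconds) whose top finite bound is 10.
BOUNDS = [0.1, 1, 5, 10]


def bucket_counts(bounds, observations):
    """Count observations per bucket; the last slot is the +Inf overflow bucket."""
    counts = [0] * (len(bounds) + 1)
    for value in observations:
        # bisect_left gives the first bound >= value, i.e. "le" bucket semantics.
        counts[bisect.bisect_left(bounds, value)] += 1
    return counts


def estimate_quantile(bounds, counts, q):
    """Estimate quantile q by linear interpolation inside the matching bucket."""
    rank = q * sum(counts)
    cumulative = 0
    for i, count in enumerate(counts):
        previous = cumulative
        cumulative += count
        if count > 0 and cumulative >= rank:
            if i == len(bounds):
                # Overflow bucket: no upper bound is known, so clamp.
                return bounds[-1]
            lower = bounds[i - 1] if i > 0 else 0.0
            return lower + (bounds[i] - lower) * (rank - previous) / count
    return bounds[-1]


# Ten requests that each took 45 seconds all land in the overflow bucket.
slow = bucket_counts(BOUNDS, [45.0] * 10)
print(estimate_quantile(BOUNDS, slow, 0.99))
```

With every observation in the overflow bucket, the estimate can only report the top finite boundary, so the 45 second requests show up as 10 seconds.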
10 seconds is a shockingly low amount of time for the highest request duration bucket to record.
Hi @francoposa, this issue is about connection duration (across multiple requests) as opposed to request duration.
You may want to consider using a metric view to configure longer request duration buckets for your use case.
The http.server.request.duration histogram recommends bucket sizes: https://github.com/open-telemetry/semantic-conventions/blob/203691d99612452df0c951640b04521e34969628/docs/http/http-metrics.md?plain=1#L67-L68
A server library has a histogram to track HTTP connection duration. It should have defined bucket sizes, but I'm unsure what values to set. The HTTP request duration buckets are too short (a connection could last minutes, hours, or even days).
Is there any agreement in the OTEL ecosystem about what good histogram buckets are for HTTP connection duration? (or longer running tasks in general)