tokio-rs / tokio


The H2 log algorithm for the poll time histogram is too granular for prometheus export #6960

Open drbrain opened 2 days ago

drbrain commented 2 days ago

In #6896/#6897 the old log algorithm for recording poll time metrics was replaced by the H2 log algorithm. I have not been able to configure this histogram to produce a small number of buckets that still covers a reasonable range for export to Prometheus.

With the legacy algorithm I had this configuration (requires the tokio_unstable cfg flag):

use std::time::Duration;

let mut rt = tokio::runtime::Builder::new_multi_thread();
rt.enable_metrics_poll_count_histogram()
    .metrics_poll_count_histogram_scale(tokio::runtime::HistogramScale::Log)
    .metrics_poll_count_histogram_resolution(Duration::from_micros(10))
    .metrics_poll_count_histogram_buckets(7);

This gave me the following buckets when exported to Prometheus:

0.000016384
0.000032768
0.000065536
0.000131072
0.000262144
0.000524288
0.001048576
0.002097152
0.004194304
+Inf

Distribution of poll times in these buckets was reasonable for the daily traffic pattern for our application.

With the new algorithm I'm unable to cover the same range (~10µs–~4ms) with as few buckets. The best I've been able to configure is 17 buckets, which is too much cardinality for Prometheus when combined with the total number of labels a full deployment of the application uses.
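
For reference, the configuration I've been experimenting with looks roughly like the sketch below. It assumes the LogHistogram builder and HistogramConfiguration types added in #6896; the exact method names have been in flux across tokio versions, so treat this as illustrative rather than exact:

use std::time::Duration;
use tokio::runtime::{Builder, HistogramConfiguration, LogHistogram};

let mut rt = Builder::new_multi_thread();
rt.enable_metrics_poll_count_histogram()
    .metrics_poll_count_histogram_configuration(HistogramConfiguration::log(
        // Parameter values here are illustrative targets for the
        // ~10µs..~4ms range, not a recommendation.
        LogHistogram::builder()
            .min_value(Duration::from_micros(10))
            .max_value(Duration::from_millis(4))
            .max_error(0.25)
            .build(),
    ));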

To fix this I'd like the legacy algorithm to be un-deprecated and renamed appropriately.

The alternatives are:

drbrain commented 2 days ago

I suppose a third alternative would be to allow arbitrary buckets that are determined outside tokio.

Darksonn commented 1 day ago

cc @hds @rcoh

rcoh commented 1 day ago

Rather than a label per bucket, you could re-export the histogram by computing summaries on the client and exporting those.

That's how metrics.rs treats histograms in any case.
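
A minimal sketch of that idea, using the unstable RuntimeMetrics bucket accessors (needs tokio_unstable and the histogram enabled on the builder, method names per the poll_count era of the API): approximate a quantile by the upper bound of the bucket where the cumulative count crosses the target.

use std::time::Duration;
use tokio::runtime::RuntimeMetrics;

// Approximate the q-quantile (e.g. q = 0.99) of poll times by summing
// bucket counts across workers and returning the upper bound of the
// bucket where the cumulative count reaches the target.
fn approx_poll_time_quantile(metrics: &RuntimeMetrics, q: f64) -> Option<Duration> {
    let num_buckets = metrics.poll_count_histogram_num_buckets();
    let counts: Vec<u64> = (0..num_buckets)
        .map(|b| {
            (0..metrics.num_workers())
                .map(|w| metrics.poll_count_histogram_bucket_count(w, b))
                .sum()
        })
        .collect();
    let total: u64 = counts.iter().sum();
    if total == 0 {
        return None;
    }
    let target = (total as f64 * q).ceil() as u64;
    let mut seen = 0u64;
    for (b, count) in counts.iter().enumerate() {
        seen += count;
        if seen >= target {
            // Note: the last bucket's range extends to u64::MAX ns.
            return Some(metrics.poll_count_histogram_bucket_range(b).end);
        }
    }
    None
}

Exporting, say, p50/p90/p99 computed this way costs three series per label set instead of one per bucket.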

I'll explore whether there's a way to give the H2 histogram an equivalent setting to the old one, or, failing that, expose a non-deprecated way to get the old behavior.

wo401555661 commented 1 day ago

That is a smart idea

rcoh commented 8 hours ago

I'm working on #6963 which, among other things, will enable setting p=0; there is no inherent reason to restrict it to >=2. Here's what the buckets will look like in that case:

bucket 0: 0ns..16.384µs (size: 16.384µs)
bucket 1: 16.384µs..32.768µs (size: 16.384µs)
bucket 2: 32.768µs..65.536µs (size: 32.768µs)
bucket 3: 65.536µs..131.072µs (size: 65.536µs)
bucket 4: 131.072µs..262.144µs (size: 131.072µs)
bucket 5: 262.144µs..524.288µs (size: 262.144µs)
bucket 6: 524.288µs..1.048576ms (size: 524.288µs)
bucket 7: 1.048576ms..2.097152ms (size: 1.048576ms)
bucket 8: 2.097152ms..4.194304ms (size: 2.097152ms)
bucket 9: 4.194304ms..18446744073.709551615s (size: 18446744073.705357311s)
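
For intuition: with p=0 the boundaries above are pure powers of two starting at 2^14 ns = 16.384µs, which matches the legacy Log-scale layout exactly. A small standalone sketch (my own illustration, not tokio internals) of the bucket indexing this layout implies:

// Bucket index for the p=0 layout above: bucket 0 is [0, 2^14) ns,
// and every later bucket doubles in width.
fn bucket_index(value_ns: u64, num_buckets: usize) -> usize {
    const MIN_BITS: u32 = 14; // first boundary: 2^14 ns = 16.384µs
    if value_ns < (1u64 << MIN_BITS) {
        return 0;
    }
    // floor(log2(value)) - 14 + 1, clamped into the final catch-all bucket.
    let idx = (63 - value_ns.leading_zeros() - MIN_BITS + 1) as usize;
    idx.min(num_buckets - 1)
}

fn main() {
    assert_eq!(bucket_index(10_000, 10), 0); // 10µs -> bucket 0
    assert_eq!(bucket_index(16_384, 10), 1); // 16.384µs starts bucket 1
    assert_eq!(bucket_index(3_000_000, 10), 8); // 3ms -> 2.097152ms..4.194304ms
    assert_eq!(bucket_index(1_000_000_000, 10), 9); // 1s lands in the catch-all
}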