SuperQ / smokeping_prober

Prometheus style smokeping
Apache License 2.0

added a summary (in addition to the histogram) #32

Closed. ConradWood closed this 4 years ago

ConradWood commented 4 years ago

This allows for single-line graphs and answers the question "At what latency were 95% of pings answered?" It is useful in addition to the histogram because it shows jitter on the network. Someone may also have a use case where ping latencies are critical; the summary makes it easy to set up alerts for a host (or group of hosts).

SuperQ commented 4 years ago

Summary is unnecessary when you have histogram data.

You can get the same graph with histogram_quantile(0.95, rate(smokeping_response_duration_seconds_bucket[1m])).
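For readers following the thread, here is a minimal Python sketch of the kind of estimate `histogram_quantile` produces: it locates the bucket that contains the requested rank and interpolates linearly inside it. This is a simplified illustration, not Prometheus's actual implementation; it assumes positive, sorted bucket bounds ending in `+Inf`, and the bucket counts below are invented, not real smokeping_prober output.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    Simplified sketch: assumes buckets are sorted by upper bound, all finite
    bounds are positive, and the last bucket is +Inf.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0  # lowest bucket is assumed to start at 0
    for upper, count in buckets:
        if count >= rank:
            if math.isinf(upper):
                # Rank falls in the +Inf bucket: fall back to the highest finite bound.
                return buckets[-2][0]
            in_bucket = count - prev_count
            # Linear interpolation between the bucket's lower and upper bound.
            return prev_bound + (upper - prev_bound) * (rank - prev_count) / in_bucket
        prev_bound, prev_count = upper, count
    return buckets[-2][0]


# Hypothetical cumulative bucket counts (bounds in seconds).
example_buckets = [(0.01, 50), (0.02, 90), (0.05, 98), (0.1, 100), (math.inf, 100)]
p95 = histogram_quantile(0.95, example_buckets)
print(f"estimated p95: {p95:.5f}s")
```

The result always lies inside the bucket that contains the rank, so the estimate can only be as precise as the bucket is narrow.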

ConradWood commented 4 years ago

I respectfully disagree that it is unnecessary. See https://prometheus.io/docs/practices/histograms/

Especially: "If you use a summary, you control the error in the dimension of φ. If you use a histogram, you control the error in the dimension of the observed value (via choosing the appropriate bucket layout). With a broad distribution, small changes in φ result in large deviations in the observed value. With a sharp distribution, a small interval of observed values covers a large interval of φ."
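To make the quoted point about bucket layout concrete, here is a small Python sketch. It computes an exact nearest-rank quantile from raw samples and then finds the histogram bucket that encloses it; the width of that bucket bounds how far a histogram-based estimate can be from the true value. The sample latencies and both bucket layouts are made up for illustration and are not smokeping_prober defaults.

```python
import bisect
import math


def exact_quantile(samples, q):
    """Nearest-rank quantile computed from the raw samples."""
    ordered = sorted(samples)
    idx = max(0, math.ceil(q * len(ordered)) - 1)
    return ordered[idx]


def enclosing_bucket(value, bounds):
    """Return (lower, upper) of the bucket holding value.

    bounds are sorted finite upper bounds with "less than or equal" semantics;
    values above the last bound fall into an implicit +Inf bucket.
    """
    i = bisect.bisect_left(bounds, value)  # first bound >= value
    lower = bounds[i - 1] if i > 0 else 0.0
    upper = bounds[i] if i < len(bounds) else math.inf
    return lower, upper


# Hypothetical ping latencies in seconds, with one slow outlier.
samples = [0.011, 0.012, 0.012, 0.013, 0.013, 0.014, 0.014, 0.015, 0.016, 0.048]
true_p95 = exact_quantile(samples, 0.95)

coarse = enclosing_bucket(true_p95, [0.01, 0.05, 0.1])
fine = enclosing_bucket(true_p95, [0.01, 0.02, 0.03, 0.04, 0.05])

print("true p95:", true_p95)
print("coarse layout bucket:", coarse)  # estimate may land anywhere in this range
print("fine layout bucket:", fine)
```

With the coarse layout the estimate can fall anywhere in a 40 ms range; the finer layout narrows that to 10 ms, which is the sense in which a histogram controls error through its bucket boundaries.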

On Sun, 2020-01-12 at 11:05 -0800, Ben Kochie wrote:

Summary is unnecessary when you have histogram data. You can get the same graph with histogram_quantile(0.95, rate(smokeping_response_duration_seconds_bucket[1m])).

SuperQ commented 4 years ago

We have a good broad distribution, and the ability to customize the distribution. Summary is considered an obsolete method in Prometheus. Having both makes no sense and is considered a violation of best practices.