Open tobz opened 3 days ago
In my mind, the most reasonable design is one that involves adding a new variant to MetricValues, called Histogram, that holds a DDSketch. I believe this is the most reasonable design for a few reasons:
DDSketch can give us all of the default aggregations (min, max, sum, count) and default percentile(s) (p95) needed for histograms, rather than having to write new/more code to do so over an array of f64 samples
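To make the proposed shape concrete, here is a minimal sketch of a value enum with a Histogram variant. Everything in it is an assumption for illustration: the other variants, the `SketchStandIn` type, and its method names are hypothetical and are not Saluki's or any DDSketch crate's actual API. The stand-in keeps exact samples so the example is self-contained, whereas a real DDSketch buckets values and answers quantiles approximately.

```rust
/// Hypothetical stand-in for a DDSketch. It stores exact samples so the
/// example is self-contained; a real sketch buckets values instead.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct SketchStandIn {
    samples: Vec<f64>,
}

impl SketchStandIn {
    pub fn insert(&mut self, value: f64) {
        self.samples.push(value);
    }

    pub fn count(&self) -> usize {
        self.samples.len()
    }

    pub fn sum(&self) -> f64 {
        self.samples.iter().sum()
    }

    pub fn min(&self) -> Option<f64> {
        self.samples.iter().copied().reduce(f64::min)
    }

    pub fn max(&self) -> Option<f64> {
        self.samples.iter().copied().reduce(f64::max)
    }

    /// Nearest-rank quantile for `q` in [0, 1]; `None` when empty.
    pub fn quantile(&self, q: f64) -> Option<f64> {
        if self.samples.is_empty() {
            return None;
        }
        let mut sorted = self.samples.clone();
        sorted.sort_by(|a, b| a.total_cmp(b));
        let rank = (q.clamp(0.0, 1.0) * sorted.len() as f64).ceil() as usize;
        Some(sorted[rank.max(1) - 1])
    }
}

/// Simplified, hypothetical shape of the value enum with the proposed variant.
#[derive(Debug, Clone, PartialEq)]
pub enum MetricValues {
    Counter(f64),
    Gauge(f64),
    /// Sent to the backend as a whole sketch.
    Distribution(SketchStandIn),
    /// Same storage, but meant to be expanded into individual metrics.
    Histogram(SketchStandIn),
}
```

Under these assumptions, Histogram and Distribution share the same storage, and only the variant tells downstream components whether to forward the sketch or expand it into aggregates.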
Context
Currently, our DogStatsD parser, and indeed Saluki's event model, only allow for the possibility of storing distributions: a histogram-like data structure (in the mathematical sense) that can answer arbitrary quantile queries against a set of samples. This happens for all histogram-like input types: timers, histograms, and distributions. We did this because it was a reasonable approximation of the average behavior of these metric types: being able to answer questions about the distribution of values observed.
However, in the Datadog Agent, histograms are treated separately by default: pre-determined aggregations/quantiles are calculated and emitted as individual metrics, rather than sending the entire distribution to the Datadog backend for generic querying.
We need to be able to match this behavior such that we can emit histograms in the same way for compatibility with the Datadog Agent.
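As a rough illustration of the Agent-style behavior described above, the sketch below expands one histogram into individual named series. It uses the aggregation set listed in this issue (min, max, sum, count, p95) and works over a plain slice of samples rather than a sketch. The function name `flatten_histogram` and the metric suffixes are hypothetical and are not taken from the Agent, whose actual aggregates and naming are configurable.

```rust
/// Expands one histogram into individual (name, value) series.
/// Suffixes are illustrative only, not the Agent's actual naming.
pub fn flatten_histogram(name: &str, samples: &[f64]) -> Vec<(String, f64)> {
    // Nothing observed in this window: emit nothing.
    if samples.is_empty() {
        return Vec::new();
    }

    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));

    let count = sorted.len();
    let sum: f64 = sorted.iter().sum();

    // Nearest-rank p95: the smallest sample with at least 95% of all
    // samples at or below it.
    let rank = ((0.95 * count as f64).ceil() as usize).clamp(1, count);

    vec![
        (format!("{}.min", name), sorted[0]),
        (format!("{}.max", name), sorted[count - 1]),
        (format!("{}.sum", name), sum),
        (format!("{}.count", name), count as f64),
        (format!("{}.p95", name), sorted[rank - 1]),
    ]
}
```

A distribution, by contrast, would skip this step and be forwarded as a whole sketch, so the backend can answer arbitrary quantile queries later.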