open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

[processor/transform] Add Function to convert Exponential Histograms to normal Histograms #33827

Open daidokoro opened 2 weeks ago

daidokoro commented 2 weeks ago

Component(s)

processor/transform

Is your feature request related to a problem? Please describe.

The Coralogix platform does not currently support ingesting metrics in the form of Exponential Histograms. We have clients facing this limitation while ingesting metrics from receivers that only support generating Exponential Histograms, for example the statsdreceiver.

Describe the solution you'd like

We have created a solution which adds a custom conversion function to the transform processor, which handles converting exponential histograms to normal histograms.

A brief description of the key features of this function:

Describe alternatives you've considered

We considered addressing the issue in the statsdreceiver by adding support for normal histograms there; however, this would only fix the issue for one receiver.

Having a dedicated function in the transform processor allows us to mitigate the issue for all receivers and external metric sources.

Additional context

We've created a PR for this potential change: #33824

github-actions[bot] commented 2 weeks ago

Pinging code owners:

kentquirk commented 2 weeks ago

I think some people will find this very useful, although I think it should be covered in warnings that the conversion is lossy and should only be used when there is no alternative. The results will not be identical.

I took a quick look at the draft PR, and it seems plausible but it needs:

crobert-1 commented 2 weeks ago

Removing needs triage based on code owner's response.

daidokoro commented 2 weeks ago

Hey @kentquirk

Thanks for your response and for having an initial look at the draft.

I'm currently working on adding more testing cases. I've also updated the transform processor README.md in the draft to reflect your recommendations for adding a usage warning.

To clarify the approach/algorithm:

Buckets are calculated based on a combination of the Explicit Boundaries that are passed to the function and the upper boundary of each exponential bucket.

The calculateBucketCounts function calculates the bucket counts for a given exponential histogram data point. The algorithm is inspired by the logExponentialHistogramDataPoints function used to print Exponential Histograms in OTel.
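For anyone following along, here is a rough sketch of how the upper boundary of a positive exponential bucket can be derived from the data point's scale and the bucket index. It assumes the OpenTelemetry data model definition, where base = 2^(2^-scale) and bucket `index` covers (base^index, base^(index+1)]. The name `upperBound` is hypothetical and is not taken from the PR.

```go
package main

import (
	"fmt"
	"math"
)

// upperBound is a hypothetical helper (not the PR's code). It returns the
// upper boundary of the positive exponential bucket at `index` for a given
// `scale`, i.e. base^(index+1) with base = 2^(2^-scale).
func upperBound(scale int32, index int) float64 {
	// 2^-scale, computed exactly for both positive and negative scales.
	factor := math.Ldexp(1, -int(scale))
	// base^(index+1) == 2^((index+1) * 2^-scale)
	return math.Exp2(float64(index+1) * factor)
}

func main() {
	// Print the first few upper boundaries at two different scales.
	for _, scale := range []int32{0, 1} {
		for index := 0; index < 4; index++ {
			fmt.Printf("scale=%d index=%d upper=%g\n", scale, index, upperBound(scale, index))
		}
	}
}
```

A higher scale gives narrower buckets, so the upper boundaries sit closer together and the later mapping onto explicit boundaries loses less information.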

At this point we know that the upper bound represents the highest value that can be in this bucket, so we take the upper bound and compare it to each of the explicit boundaries provided by the user until we find a boundary that fits, that is, the first instance where upper bound <= explicit boundary.

For example:

If we have explicit boundaries of [0, 10, 20, 30] and an upper bound of 11, the count is added to the bucket bounded by 20, as 20 is the first explicit boundary for which upper bound <= explicit boundary.
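The matching rule in this example can be sketched as below. This is a minimal illustration and not the code from the PR: `bucketIndex` and `redistribute` are hypothetical names, and sending counts whose upper bound exceeds every explicit boundary into a final overflow bucket is an assumption, based on explicit histograms carrying one more bucket than they have boundaries.

```go
package main

import "fmt"

// bucketIndex is a hypothetical helper. It returns the index of the first
// explicit boundary b for which upper <= b. If no boundary fits, it returns
// len(bounds), which is assumed here to be the overflow bucket.
func bucketIndex(upper float64, bounds []float64) int {
	for i, b := range bounds {
		if upper <= b {
			return i
		}
	}
	return len(bounds)
}

// redistribute adds each exponential bucket's count to the explicit bucket
// selected by that exponential bucket's upper boundary.
func redistribute(uppers []float64, counts []uint64, bounds []float64) []uint64 {
	out := make([]uint64, len(bounds)+1)
	for i, u := range uppers {
		out[bucketIndex(u, bounds)] += counts[i]
	}
	return out
}

func main() {
	bounds := []float64{0, 10, 20, 30}

	// An upper bound of 11 lands on index 2, the boundary at 20.
	fmt.Println(bucketIndex(11, bounds))

	// Five exponential buckets with one count each, upper bounds 2..32.
	uppers := []float64{2, 4, 8, 16, 32}
	counts := []uint64{1, 1, 1, 1, 1}
	fmt.Println(redistribute(uppers, counts, bounds))
}
```

Because the whole count of an exponential bucket moves to a single explicit bucket, a wide exponential bucket that straddles an explicit boundary is always attributed to the higher side.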

Technically, the explicit values of the histogram are never known in this conversion; we only calculate the upper boundaries and use them to determine the bucket based on the Explicit Boundaries defined by the user.

If the user provides Explicit Boundaries that do not fit the datapoints, this will result in imprecise conversions.

daidokoro commented 1 week ago

/label processor/transform needs-triage

Hey @kentquirk,

I've done the following:

The PR has been set to active from draft.

Let me know if anything else is required.

Thanks.

github-actions[bot] commented 1 week ago

Pinging code owners for processor/transform: @TylerHelmuth @kentquirk @bogdandrutu @evan-bradley. See Adding Labels via Comments if you do not have permissions to add labels yourself.