open-telemetry / opentelemetry-go

OpenTelemetry Go API and SDK
https://opentelemetry.io/docs/languages/go
Apache License 2.0

Prometheus - Exemplar label name is invalid when filtering attributes in a custom view #5936

Closed StarpTech closed 1 week ago

StarpTech commented 1 month ago

Description

After upgrading go.opentelemetry.io/otel/exporters/prometheus from v0.50.0 to v0.53.0, I see the following error: exemplar label name "wg.operation.hash" is invalid.

We use a custom view to filter out metric attributes with high cardinality. Although the attribute is definitely dropped in the view, the data still seems to be exported by the Prometheus exporter.

It is necessary to check out https://github.com/open-telemetry/opentelemetry-go/commit/664a075380b9eb155512adb1213ca688bd1ed611 to avoid running into a different Prometheus exemplar issue.

var opts []sdkmetric.Option

    // Exclude attributes from metrics

    attributeFilter := func(value attribute.KeyValue) bool {
        if isKeyInSlice(value.Key, defaultExcludedOtelKeys) {
            return false
        }
        name := sanitizeName(string(value.Key))
        for _, re := range c.Prometheus.ExcludeMetricLabels {
            if re.MatchString(name) {
                return false
            }
        }
        return true
    }

    msBucketHistogram := sdkmetric.AggregationExplicitBucketHistogram{
        Boundaries: msBucketsBounds,
    }
    bytesBucketHistogram := sdkmetric.AggregationExplicitBucketHistogram{
        Boundaries: bytesBucketBounds,
    }

    var view sdkmetric.View = func(i sdkmetric.Instrument) (sdkmetric.Stream, bool) {
        // In a custom View function, we need to explicitly copy the name, description, and unit.
        s := sdkmetric.Stream{Name: i.Name, Description: i.Description, Unit: i.Unit}

        // Use different histogram buckets for PrometheusConfig
        if i.Unit == unitBytes && i.Kind == sdkmetric.InstrumentKindHistogram {
            s.Aggregation = bytesBucketHistogram
        } else if i.Unit == unitMilliseconds && i.Kind == sdkmetric.InstrumentKindHistogram {
            s.Aggregation = msBucketHistogram
        }

        // Filter out metrics that match the excludeMetrics regexes
        for _, re := range c.Prometheus.ExcludeMetrics {
            promName := sanitizeName(i.Name)
            if re.MatchString(promName) {
                // Drop the metric
                s.Aggregation = sdkmetric.AggregationDrop{}
                return s, true
            }
        }

        // Filter out attributes that match the excludeMetricAttributes regexes
        s.AttributeFilter = attributeFilter

        return s, true
    }

    opts = append(opts, sdkmetric.WithView(view))

Stacktrace of the OTEL error

16:34:58 PM ERROR trace/meter.go:239 otel error {"hostname": "dustins-MacBook-Pro.local", "pid": 79745, "component": "@wundergraph/router", "service_version": "dev", "error": "exemplar label name \"wg.operation.hash\" is invalid"}
github.com/wundergraph/cosmo/router/pkg/trace.NewTracerProvider.func3
        /Users/starptech/p/wundergraph/cosmo/router/pkg/trace/meter.go:239
go.opentelemetry.io/otel.ErrorHandlerFunc.Handle
        /Users/starptech/go/pkg/mod/go.opentelemetry.io/otel@v1.31.1-0.20241030054014-4f94b1e661e7/error_handler.go:26
go.opentelemetry.io/otel.Handle
        /Users/starptech/go/pkg/mod/go.opentelemetry.io/otel@v1.31.1-0.20241030054014-4f94b1e661e7/handler.go:33
go.opentelemetry.io/otel/exporters/prometheus.addExemplars[...]
        /Users/starptech/go/pkg/mod/go.opentelemetry.io/otel/exporters/prometheus@v0.53.1-0.20241030054014-4f94b1e661e7/exporter.go:535
go.opentelemetry.io/otel/exporters/prometheus.addHistogramMetric[...]
        /Users/starptech/go/pkg/mod/go.opentelemetry.io/otel/exporters/prometheus@v0.53.1-0.20241030054014-4f94b1e661e7/exporter.go:260
go.opentelemetry.io/otel/exporters/prometheus.(*collector).Collect
        /Users/starptech/go/pkg/mod/go.opentelemetry.io/otel/exporters/prometheus@v0.53.1-0.20241030054014-4f94b1e661e7/exporter.go:229
github.com/prometheus/client_golang/prometheus.(*Registry).Gather.func1
        /Users/starptech/go/pkg/mod/github.com/prometheus/client_golang@v1.20.5/prometheus/registry.go:458

Environment

Expected behavior

It should be possible to filter attributes as before, including when the Prometheus exporter is used.

StarpTech commented 1 month ago

Workaround: disable exemplars entirely.

sdkmetric.WithExemplarFilter(exemplar.AlwaysOffFilter)

MrAlias commented 1 week ago

cc @dashpole

dashpole commented 1 week ago

Related: https://github.com/prometheus/prometheus/issues/15259

To fix this, we need to sanitize exemplar keys here: https://github.com/open-telemetry/opentelemetry-go/blob/99c3c661e0ec3e484f9faf7bc7aa23f8c7b7f1d6/exporters/prometheus/exporter.go#L550

We should use the same approach used for label keys: https://github.com/open-telemetry/opentelemetry-go/blob/99c3c661e0ec3e484f9faf7bc7aa23f8c7b7f1d6/exporters/prometheus/exporter.go#L335
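
For illustration, here is a minimal sketch of what sanitizing an exemplar key could look like. It assumes the legacy Prometheus label-name pattern `[a-zA-Z_][a-zA-Z0-9_]*`. `sanitizeLabelName` is a hypothetical helper written for this example and is not the exporter's actual code; the real fix may differ in details such as how a leading digit is handled.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeLabelName is a hypothetical helper (not part of the exporter).
// It maps an attribute key to a name matching [a-zA-Z_][a-zA-Z0-9_]*:
// every character outside [a-zA-Z0-9_] becomes '_', and a leading digit
// is prefixed with '_' because a label name may not start with a digit.
func sanitizeLabelName(key string) string {
	if key == "" {
		return "_"
	}
	var b strings.Builder
	for i, r := range key {
		isLetter := (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z')
		isDigit := r >= '0' && r <= '9'
		switch {
		case isLetter || r == '_':
			b.WriteRune(r)
		case isDigit:
			if i == 0 {
				b.WriteByte('_')
			}
			b.WriteRune(r)
		default:
			b.WriteByte('_')
		}
	}
	return b.String()
}

func main() {
	// The key from the reported error becomes a valid label name.
	fmt.Println(sanitizeLabelName("wg.operation.hash")) // wg_operation_hash
}
```

Note that two distinct keys (for example `a.b` and `a_b`) map to the same sanitized name, so a real implementation also has to decide what to do with collisions.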

dashpole commented 1 week ago

Fix out: #5995. @StarpTech are you able to verify that it fixes your issue?

StarpTech commented 1 week ago

@dashpole I'm gonna give it a try next week.