open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0
2.96k stars 2.29k forks source link

[translator/prometheus] Label translation breaks for labels with trailing `__` #30760

Closed bryan-aguilar closed 8 months ago

bryan-aguilar commented 8 months ago

Component(s)

No response

What happened?

Description

Prometheus label normalization does not work properly for labels with trailing double underscores such as __replica__.

Steps to Reproduce

func TestSanitize(t *testing.T) {

    defer testutil.SetFeatureGateForTest(t, dropSanitizationGate, false)()
    assert.Equal(t, "__test__", NormalizeLabel("__test__))"))
    assert.Equal(t, "test__", NormalizeLabel("test__))"))
}

Expected Result

Passing test

Actual Result

--- FAIL: TestSanitize (0.00s)
    /Users/bryaag/repos/opentelemetry-collector-contrib/pkg/translator/prometheus/normalize_label_test.go:18: 
            Error Trace:    /Users/bryaag/repos/opentelemetry-collector-contrib/pkg/translator/prometheus/normalize_label_test.go:18
            Error:          Not equal: 
                            expected: "__test__"
                            actual  : "__test____"

                            Diff:
                            --- Expected
                            +++ Actual
                            @@ -1 +1 @@
                            -__test__
                            +__test____
            Test:           TestSanitize
    /Users/bryaag/repos/opentelemetry-collector-contrib/pkg/translator/prometheus/normalize_label_test.go:19: 
            Error Trace:    /Users/bryaag/repos/opentelemetry-collector-contrib/pkg/translator/prometheus/normalize_label_test.go:19
            Error:          Not equal: 
                            expected: "test__"
                            actual  : "test____"

                            Diff:
                            --- Expected
                            +++ Actual
                            @@ -1 +1 @@
                            -test__
                            +test____
            Test:           TestSanitize
FAIL
FAIL    github.com/open-telemetry/opentelemetry-collector-contrib/pkg/translator/prometheus 0.476s
FAIL

Collector version

latest

Environment information

Environment

Compiler(if manually compiled): (e.g., "go 14.2") go 1.21.6

OpenTelemetry Collector configuration

No response

Log output

No response

Additional context

I did a quick exploration of this and the root cause did not jump out. I suspect it could be an issue with Map() not performing as expected. See top comment inside function declaration.

github-actions[bot] commented 8 months ago

Pinging code owners for pkg/translator/prometheus: @dashpole @bertysentry. See Adding Labels via Comments if you do not have permissions to add labels yourself.

dashpole commented 8 months ago

Interesting. So it is adding an extra underscore?

bryan-aguilar commented 8 months ago

I think it may be adding two extra. I can confirm in a bit.

jeromeinsf commented 8 months ago

actual : "__test____" looks like 2 extra

bertysentry commented 8 months ago

I really don't see how this is possible, given how the code works: replace every non-alphanumeric rune with an underscore. It's pretty basic. How can it append 2 extra underscore?

bryan-aguilar commented 8 months ago

I don't know either....I was stumped. The comment from strings.Map() is a little concerning though.

    // In the worst case, the string can grow when mapped, making
    // things unpleasant. But it's so rare we barge in assuming it's
    // fine. It could also shrink but that falls out naturally.
bryan-aguilar commented 8 months ago

If someone can sanity check our implementation that would be great. Although I think @bertysentry just did. This may be worth of an issue against go. I'll see if I can get a reproducible example in the playground.

bryan-aguilar commented 8 months ago

Well, now I am even more stumped. Minimal playground reproduction shows that it working as intended.

bryan-aguilar commented 8 months ago

@bertysentry can you reproduce this unit test in your local environment to verify?

bryan-aguilar commented 8 months ago
go version
go version go1.21.6 darwin/arm64
bryan-aguilar commented 8 months ago

A coworker ran on amd64 and the tests did not fail. My employer has access to some extra compute resources so I wanted to verify this test on a separate arm64 machine.

OS: Amazon Linux 2 aarch64 5.10 Kernel Host: m6g.2xlarge

go version
go version go1.21.6 linux/arm64

test results

--- FAIL: TestSanitize (0.00s)
    normalize_label_test.go:18: 
                Error Trace:    /home/bryaag/repos/opentelemetry-collector-contrib/pkg/translator/prometheus/normalize_label_test.go:18
                Error:          Not equal: 
                                expected: "__test__"
                                actual  : "__test____"

                                Diff:
                                --- Expected
                                +++ Actual
                                @@ -1 +1 @@
                                -__test__
                                +__test____
                Test:           TestSanitize
    normalize_label_test.go:19: 
                Error Trace:    /home/bryaag/repos/opentelemetry-collector-contrib/pkg/translator/prometheus/normalize_label_test.go:19
                Error:          Not equal: 
                                expected: "test__"
                                actual  : "test____"

                                Diff:
                                --- Expected
                                +++ Actual
                                @@ -1 +1 @@
                                -test__
                                +test____
                Test:           TestSanitize
FAIL
FAIL    github.com/open-telemetry/opentelemetry-collector-contrib/pkg/translator/prometheus     0.006s
FAIL

diff

index f001839097..a80f26e4a0 100644
--- a/pkg/translator/prometheus/normalize_label_test.go
+++ b/pkg/translator/prometheus/normalize_label_test.go
@@ -6,6 +6,7 @@ package prometheus // import "github.com/open-telemetry/opentelemetry-collector-
 import (
        "testing"

+       "github.com/stretchr/testify/assert"
        "github.com/stretchr/testify/require"

        "github.com/open-telemetry/opentelemetry-collector-contrib/internal/common/testutil"
@@ -14,13 +15,8 @@ import (
 func TestSanitize(t *testing.T) {

        defer testutil.SetFeatureGateForTest(t, dropSanitizationGate, false)()
-
-       require.Equal(t, "", NormalizeLabel(""), "")
-       require.Equal(t, "key_test", NormalizeLabel("_test"))
-       require.Equal(t, "key_0test", NormalizeLabel("0test"))
-       require.Equal(t, "test", NormalizeLabel("test"))
-       require.Equal(t, "test__", NormalizeLabel("test_/"))
-       require.Equal(t, "__test", NormalizeLabel("__test"))
+       assert.Equal(t, "__test__", NormalizeLabel("__test__))"))
+       assert.Equal(t, "test__", NormalizeLabel("test__))"))
 }

 func TestSanitizeDropSanitization(t *testing.T) {
@@ -32,4 +28,5 @@ func TestSanitizeDropSanitization(t *testing.T) {
        require.Equal(t, "key_0test", NormalizeLabel("0test"))
        require.Equal(t, "test", NormalizeLabel("test"))
        require.Equal(t, "__test", NormalizeLabel("__test"))
+       require.Equal(t, "__test__", NormalizeLabel("__test__"))
 }
bertysentry commented 8 months ago

I tested this successfully in my environment (Go 1.21.6 on Windows):

    require.Equal(t, "test__", NormalizeLabel("test__"))
    require.Equal(t, "__test__", NormalizeLabel("__test__"))

Result:

=== RUN   TestSanitize
--- PASS: TestSanitize (0.00s)
PASS
ok      github.com/open-telemetry/opentelemetry-collector-contrib/pkg/translator/prometheus     0.149s
bryan-aguilar commented 8 months ago

I tried running the playground example linked above on my local machine and did not see the error. I am back to being stumped on why this is happening.

bryan-aguilar commented 8 months ago

Welp, sorry for wasting everyones time. The test was faulty and we didn't catch it.

assert.Equal(t, "__test__", NormalizeLabel("__test__))"))
assert.Equal(t, "test__", NormalizeLabel("test__))"))

Notice the two extra parens within the doublequote.

Should be

assert.Equal(t, "__test__", NormalizeLabel("__test__"))
assert.Equal(t, "test__", NormalizeLabel("test__"))
bertysentry commented 8 months ago

"EVERYBODY, CALM DOWN!!" 😂