open-telemetry / opentelemetry-go-contrib

Collection of extensions for OpenTelemetry-Go.
https://opentelemetry.io/
Apache License 2.0
1.15k stars 545 forks source link

Obtaining sampling rules may silently fail #5717

Open jaedle opened 3 months ago

jaedle commented 3 months ago

Description

The X-Ray remote sampler may silently fail obtaining the sampling rules from the awsproxy.

We are running the OpenTelemetry Collector behind a reverse proxy as we have about a dozen services ingesting traces which are forwarded to AWS X-Ray. As this adds forward headers (i.e. x-forwarded-*), the request signing of the awsproxy was not working correctly.

Even though the response of AWS results in a client error, there are no signs of an error. The JSON response body is unmarshaled and treated as empty ruleset.

The missing check on status code of the response should be the problem in this case.

The only response code for a successful request should be 200/OK according the documentation.

Environment

Should not be relevant for the error.

Steps To Reproduce

The simplest way to reproduce may be a test case for client_test.go:

func TestRespondsWithErrorStatusCode(t *testing.T) {
    testServer := httptest.NewServer(http.HandlerFunc(func(res http.ResponseWriter, _ *http.Request) {
        res.WriteHeader(http.StatusForbidden)
        _, err := res.Write([]byte(`{}`))
        require.NoError(t, err)
    }))
    t.Cleanup(testServer.Close)

    u, err := url.Parse(testServer.URL)
    require.NoError(t, err)

    client, err := newClient(*u)
    require.NoError(t, err)

    samplingRules, err := client.getSamplingRules(context.Background())
    require.Error(t, err)
    require.Nil(t, samplingRules)
}

Current behavior

GIVEN sampling rules should be obtained from the xray proxy
WHEN the response code incidcates an error 4xx/5xx
THEN an empty manifest is created
AND no error is indicated

Expected behavior

GIVEN sampling rules should be obtained from the xray proxy
WHEN the response code incidcates an error 4xx/5xx
THEN an error should occur on loading the rulset
AND no manifest should be present 

I would love to provide a PR to fix that issue.

jaedle commented 3 months ago

I have already seen #5554 and hope that @wangzlei could jump in reviewing the PR I would love to raise. :pray:

ayushrakesh commented 3 months ago

@jaedle Is this issue still need to fix or already solved?

jaedle commented 3 months ago

@jaedle Is this issue still need to fix or already solved?

Still open, see #5718.

ayushrakesh commented 3 months ago

@jaedle Thanks , I am starting working on it.

jaedle commented 3 months ago

@ayushrakesh I am sorry if you got this wrong, but did you have a look at the existing PR #5718?

ayushrakesh commented 3 months ago

@jaedle Oh, there is already a PR. Sorry.

jaedle commented 3 months ago

Ok 👌

May I ask why you created a new PR?

Having a look at your changes those look pretty similar to the existing PR which was already reviewed.

ayushrakesh commented 3 months ago

Yeah, sorry for it, I thought PR is not made yet.

jaedle commented 3 months ago

I am very sorry if I was unclear 🙁

ayushrakesh commented 3 months ago

No no, that's my fault.🥲