open-telemetry / opentelemetry-go

OpenTelemetry Go API and SDK
https://opentelemetry.io/docs/languages/go
Apache License 2.0
5.22k stars 1.05k forks source link

BAD API DESIGN: cannot merge resource due to conflicting Schema URL #4476

Closed dustinmoris closed 11 months ago

dustinmoris commented 1 year ago

I am new to OpenTelemetry but one thing I know for sure is that the reason why I use Go is because I don't like weird magic behaviour that look extremely confusing and pose traps to developers making it easy to do the wrong thing and I feel like the current API design is exactly doing that.

I followed the official documentation to initialise an OpenTelemetry resource for GCP here: https://github.com/open-telemetry/opentelemetry-go-contrib/tree/main/detectors/gcp

Code:

ctx := context.Background()
// Detect your resources
res, err := resource.New(ctx,
    // Use the GCP resource detector!
    resource.WithDetectors(gcp.NewDetector()),
    // Keep the default detectors
    resource.WithTelemetrySDK(),
    // Add your own custom attributes to identify your application
    resource.WithAttributes(
        semconv.ServiceNameKey.String("my-application"),
        semconv.ServiceNamespaceKey.String("my-company-frontend-team"),
    ),
)
if err != nil {
    // Handle err
}
// Use the resource in your tracerprovider (or meterprovider)
tp := trace.NewTracerProvider(
    // ... other options
    trace.WithResource(res),
)

This blows up for me in production with:

panic: error initialising tracer: error creating tracing resource: 1 errors occurred detecting resource:
    * cannot merge resource due to conflicting Schema URL

Now first this error is extremely misleading because I am not trying to merge resources (resource.Merge). This code is creating a new resource using resource.New. Secondly nowhere is the schema version specified by myself, meaning that things blow up because of downstream dependencies being incompatible yet the code making it all look like it's fine. That is a horrible programming API.

Now this is where it gets even worse. I am an Otel layman, so I just look for obvious clues where I can fix something and this is where I think it gets even more confusing:

When I inspect the implementation of the gcp detector I can see that it's using semconv/v1.17.0:

image

Then when I look up the implementation of the go.opentelemetry.io/otel/sdk/resource package at version 1.17.0 it looks like it's using semconv/v1.21.0:

image image

WAT?

So how am I supposed to fix this issue? Is the expected developer journey for me to literally go through the source code of every published otel package version to try and find where semconv is set to 1.17.0 so it matches my GCP contrib package? I don't know, this is extremely unintuitive. Either this is a horrible developer experience or the API is so confusing that it's not obvious what to do to a regular develper without going deep into this topic.

dustinmoris commented 1 year ago

Probably related to https://github.com/open-telemetry/opentelemetry-go/issues/3769

dustinmoris commented 1 year ago

Related: https://github.com/open-telemetry/opentelemetry-go/issues/2341

pellared commented 11 months ago

Is there anything you want to add or can I close it as a duplicate of https://github.com/open-telemetry/opentelemetry-go/issues/2341?

PS. You can currently use https://github.com/MrAlias/otel-schema-utils to mitigate the issue.

pellared commented 11 months ago

Closing as duplicate of https://github.com/open-telemetry/opentelemetry-go/issues/2341

mdelapenya commented 10 months ago

PS. You can currently use https://github.com/MrAlias/otel-schema-utils to mitigate the issue.

After bumping the indirect otel deps, our compose module in our project (https://github.com/testcontainers/testcontainers-go) started to fail the tests with the error described in this issue:

FAIL TestDockerComposeAPIWithBuild (0.02s)
      compose_api_test.go:472: 
            Error Trace:    /home/runner/work/testcontainers-go/testcontainers-go/modules/compose/compose_api_test.go:472
            Error:          Received unexpected error:
                            cannot merge resource due to conflicting Schema URL
            Test:           TestDockerComposeAPIWithBuild
            Messages:       compose.Up()

As a downstream consumer of docker compose Go libs, how could we proceed then, as we do not interact with otel libs at all in the module, but just consume what the Docker compose module offers?

pellared commented 10 months ago

As a downstream consumer of docker compose Go libs, how could we proceed then, as we do not interact with otel libs at all in the module, but just consume what the Docker compose module offers?

@mdelapenya, currently you cannot do anything other than not bumping the dependencies. You would need to first bump (or ask for it) in the modules that interact with the OTel modules.

We would need to add schema transformation functionality into the SDK. Some work was done here https://github.com/open-telemetry/opentelemetry-go/pull/4503, but it requires a lot of work. It is also blocked by (dependent on) https://github.com/orgs/open-telemetry/projects/64.

Take notice that if you are bumping OTel which is a transitive dependency then the emitted telemetry by its users may be different (which could be unwelcome). This is the main reason why you get an error when the library is using a different semantic convention version than the resource detectors provided the SDK.

Please next time comment on the open issue instead of the closed one πŸ˜‰

mdelapenya commented 10 months ago

Please next time comment on the open issue instead of the closed one πŸ˜‰

My bad, I though this was more related, but you're totally right πŸ™