open-telemetry / opentelemetry-go

OpenTelemetry Go API and SDK
https://opentelemetry.io/docs/languages/go
Apache License 2.0
5.21k stars 1.05k forks source link

The global instance holder should only be set once #5390

Open Cirilla-zmh opened 4 months ago

Cirilla-zmh commented 4 months ago

Problem Statement

We are committed to developing a compile-time instrumentation based on the opentelemetry-go project.

However, we've encountered difficulties when trying to provide automatic instrumentation for applications who already depended on the go-sdk — various GlobalHolders in the global package (like tracerProviderHolder) allow users to modified them repeatedly.

This behavior could lead to the global TracerProvider and other components we've prepared in advance for the users being overwritten. This means we must require users to adhere to a specific usage pattern for the opentelemetry-go and must pay attention to the order in which the set methods are called.

For example, if the application being instrumented is structured as follows:

func main() {
    var tracerProvider *trace.TracerProvider
    tracerProvider = trace.NewTracerProvider(
        trace.WithResource(resource.NewSchemaless(attribute.String("hello11", "11"))),
    )
    otel.SetTracerProvider(tracerProvider)

    // do something
}

The TracerProvider instance created by user will be registered into global holder, and therefore, the TracerProvider we previously registered would be overwritten.

Another question worth considering is whether modifying the global instance at runtime is necessary. I believe this feature is seldom used and could lead to unnecessary perplexity, as any process has the authority to modify the global instance.

Proposed Solution

Perhaps we should implement the global holder in a way that only allows for once registration.

Alternatives

Taking into account that this changes could be harmful, adjusting the modification capability of the global holder to be configurable is also a solution worth evaluating.

Prior Art

In opentelemetry-java projects, the GlobalOpenTelemetry utilizes this strategy: https://github.com/open-telemetry/opentelemetry-java/blob/c71c4d983cf2bf99f3d130ec50d131c07c47fd8a/api/all/src/main/java/io/opentelemetry/api/GlobalOpenTelemetry.java#L94-L117

dmathieu commented 4 months ago

Doing this would be a breaking change. We can't do that without a v2 of the API (which also probably shouldn't happen without a v2 of the specification).

For the record, the specification does not say anything about setting the tracer provider multiple times. So both the java and our approach are valid. https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md#tracerprovider

Cirilla-zmh commented 4 months ago

Thank you for your response!

For the record, the specification does not say anything about setting the tracer provider multiple times. So both the java and our approach are valid.

Yes, you are right. However, I believe it is necessary—especially for developers of go-auto-instrumentation.

Doing this would be a breaking change. We can't do that without a v2 of the API (which also probably shouldn't happen without a v2 of the specification).

We might consider introducing an configurable mode to control that. By default, we continue to allow users to register the global instance multiple times. While in go-auto-instrumentation, we switch to the agent mode that disallows users from modifying the instance kept by global holder.

MrAlias commented 4 months ago

In your proposed solution, how do you plan to ensure you are the fist to write to the global state and not another? What do you do if you are not the first and the global state cannot be changed anymore?

Relying on global values is inherently going to have conflict. We recommend you design your systems so an explicit provider is received and used.

Cirilla-zmh commented 4 months ago

In your proposed solution, how do you plan to ensure you are the fist to write to the global state and not another? What do you do if you are not the first and the global state cannot be changed anymore?

In our implementation, we define all the initialization logic for the instrumentation within the init() method, ensuring that the main file of the application being instrumented imports this module at compile time. Golang guarantees that our init method is executed first.

Relying on global values is inherently going to have conflict. We recommend you design your systems so an explicit provider is received and used.

@MrAlias Thank you for your suggestion. It is indeed a feasible option. However, this means that spans created by users with the go-sdk would be completely separate from those created by us, which would result in a loss of extensibility.

I believe that instrumentation should be able to extend based on the SDK without the users needing to be overly aware of the interaction and existence of the two during development.

dmathieu commented 4 months ago

They would be completely separate if folks redefine the tracer exporter. The alternative here seems to be to over-document that folks shouldn't redefine the tracer exporter if they use auto-instrumentation.

Cirilla-zmh commented 4 months ago

The alternative here seems to be to over-document that folks shouldn't redefine the tracer exporter if they use auto-instrumentation.

Yes, defining this behavior as a usage standard and explaining its impact in specification represents the minimal change. We will document this in the project's documentation.

I still believe that adding an instrumentation mode would be a more reasonable implementation. If you think this should not be pursued at the moment, feel free to close this issue.

Thank you once again!

MrAlias commented 4 months ago

In our implementation, we define all the initialization logic for the instrumentation within the init() method, ensuring that the main file of the application being instrumented imports this module at compile time. Golang guarantees that our init method is executed first.

I think you have that backwards. init is run for all imported packages before your main file init is run: https://david-yappeter.medium.com/init-in-go-programming-31e2c2bc2371

MrAlias commented 4 months ago

@MrAlias Thank you for your suggestion. It is indeed a feasible option. However, this means that spans created by users with the go-sdk would be completely separate from those created by us, which would result in a loss of extensibility.

I believe that instrumentation should be able to extend based on the SDK without the users needing to be overly aware of the interaction and existence of the two during development.

If you wan to override a user defined SDK you need to do something like https://github.com/open-telemetry/opentelemetry-go-instrumentation does for the global implementation, or you need to have the user buy into the process and import and use a wrapper you provide.

Cirilla-zmh commented 4 months ago

I think you have that backwards. init is run for all imported packages before your main file init is run: https://david-yappeter.medium.com/init-in-go-programming-31e2c2bc2371

Apologies for any confusion my previous explanation might have caused.

To clarify, we have designed the initialization process for the instrumentation to reside within the init() method of a designated package. Then we take measures to ensure that during compilation, this package is inserted into the customer's main file in such a manner that it is positioned at the very top of the import list. This placement guarantees the correct initialization order as per the Go language's execution model.