Global MeterProvider Get/Set API is under-defined

MrAlias commented 1 year ago

Currently the specification defines this API as follows:

Normally, the MeterProvider is expected to be accessed from a central place. Thus, the API SHOULD provide a way to set/register and access a global default MeterProvider.

This does not specify the API behavior when an "access" is made prior to a global being set. Nor does it specify the behavior of multiple set operations. This seems like a source of user confusion if implementations deviate in behavior.

If one implementation errors, another returns a no-op, and another returns a no-op implementation that will be updated to the first global MeterProvider set by a user, users that work across these implementations will not know what to expect.

Ideally implementations will behave the same. Unfortunately, that doesn't seem possible given the stable releases of certain implementations that have different behaviors.

Language	Get before set	Multi-sets
Go	Delegating No-op returned	after first set, latest set instance returned
Java	Configures from env or No-op (no delegation)	Log error
.NET	? (not sure)	?
Python	Delegating No-op returned	Log warning
JS	No-op (no delegation)	Log error
Ruby	No-op (no delegation)	after first set, latest set instance returned

Based on this survey, I'm not entirely sure how to proceed.

Should we recommend all return from the environment if configured otherwise a delegating No-op implementation when a get is made before a set?

Should all multi-sets log some warning/error?

cc @open-telemetry/go-approvers @open-telemetry/java-approvers @open-telemetry/dotnet-approvers @open-telemetry/python-approvers @open-telemetry/javascript-approvers @open-telemetry/erlang-approvers @open-telemetry/ruby-approvers

Oberon00 commented 1 year ago

Should we recommend all return from the environment if configured

This is not generally possible, as you also need to consider the API-only case.

E.g. a common use case (although discouraged in some languages) might be to call the global getter somewhere deep in a library that has built-in instrumentation. That library would depend on the API package but not any SDK. Thus, there is no code that could interpret environment variables.

.NET, BTW, can probably be left out of this discussion, it is not really "OpenTelemetry" in this regard.

dyladan commented 1 year ago

FWIW I think adding delegation is something that can be done in a non-breaking way. The JS TracerProvider does delegate and it is a feature that I am hoping to add to metrics as well. I would argue the "correct" behavior is a delegating no-op. As far as multi-set, I think that gets much trickier. I prefer the logged warning (no strong opinion on warn/error level).

MrAlias commented 1 year ago

Should we recommend all return from the environment if configured

This is not generally possible, as you also need to consider the API-only case.

E.g. a common use case (although discouraged in some languages) might be to call the global getter somewhere deep in a library that has built-in instrumentation. That library would depend on the API package but not any SDK. Thus, there is no code that could interpret environment variables.

.NET, BTW, can probably be left out of this discussion, it is not really "OpenTelemetry" in this regard.

This makes sense. :+1:

I think to your's and @dyladan point, we should recommend all return a delegating No-op implementation when a get is made before a set, but add a provisional option for an environment based provider for those able to do so. I think the option here is necessary to not make the Java implementation doesn't become incompatible, right?

Oberon00 commented 1 year ago

Please definitely discuss that with the Java maintainers @open-telemetry/java-maintainers, I think it is a bit more intricate there, autoconfiguration is a third package that is separate from the SDK and not automatically used unless you use the auto-instrumenting agent (which I believe is probably the most common way of "using" the Java language client)

jack-berg commented 1 year ago

Hi - let me shed some light on the java behavior:

There are 3 different components/artifacts in play: opentelemetry-api, opentelemetry-sdk, and opentelemetry-sdk-extension-autoconfigure.
In general, java's GloablOpenTelemetry.get has an invariant that every call to get is guaranteed to return the same response. Calling get before set causes set to throw.
The behavior of calling get is dependent on whether opentelemetry-sdk-extension-autoconfigure is on the classpath. And the behavior recently changed in a key way, which will be released for the first time on 1/6.
If you call get and opentelemetry-sdk-extension-autoconfigure is not on the classpath, we'll return a noop. All future calls to get also return a noop.
If you call get and opentelemetry-sdk-extension-autoconfigure is on the classpath, we MAY return an instance initialized by the environment. Let's call this the "autoconfigured" instance. Until recently, we always initialized and returned the autoconfigured instance. Recently, we made it so initializing and returning the autoconfigured instance is opt in via an environment variable. See the PR for a lengthy discussion on why we did this. If we return an autoconfigured instance, the same instance will be returned to future calls to get. If the user has not opted into the autoconfigured instance, the behavior is the same as when opentelemetry-sdk-extension-autoconfigure is not on the classpath. See above.
This all occurs when not using the opentelemetry java agent (i.e. the auto-instrumenting agent). If using the opentelemetry java agent, the same opentelemetry-sdk-extension-autoconfigure initialization code is called during agent initialization and the result is used to call GlobalOpenTelemetry.set. Calls to GlobalOpenTelemetry.get receive this autoconfigured instance.

I think we've arrived at a good place for java: The invariant that all calls to GlobalOpenTelemetry.get receive the same result simplifies initialization issues which might otherwise be head scratchers since either everything is going to work or nothing is going to work. If desired, you can opt in to having GlobalOpentelemetry.get trigger initialization of an autoconfigured instance, which helps in corner cases where its not possible to call GlobalOpenTelemetry.set before instrumentation code initializes.

jmacd commented 1 year ago

Here is a historical reference on this topic, which indicates that Go was influenced by Ruby: https://github.com/open-telemetry/oteps/pull/11

My recollection of the discussion and debate from then is that it matters whether the language in question has a "proper" and well-adopted mechanism for dependency injection and (possibly equivalently) dynamic loading of code. Golang stands out as a language without either of these features, and this is why (IMO) the "delegating no-op" should be supported in Golang.

A language with rules and accepted DI mechanism makes the delegating no-op mechanism unnecessary (IMO) because the DI mechanism is a manner of delegation.

I don't have a strong opinion about the "multi-set" behavior that is best. IIRC it was Ruby that led the way here, so I'd like to ask @mwear what he thinks?

tsloughter commented 1 year ago

So no changes to the spec make us no longer compliant I can describe the Erlang Provider get/set setup:

Providers are registered in a central place by utilizing Erlang's process registry. Each Provider is a process that implements a particular (TracerProvider or MeterProvider) API. On startup of the SDK it will configure the global provider which is registered in the process registry.

A noop tracer/meter is returned if the global provider is called before it has initialized.

Attempts to start another provider registered to the same name will fail, returning already_started and the process ID of the existing Provider.

We don't actually have functions for set/register because it relies on the Erlang process registry and is best left to the process itself.

Long story short, we'd want/need the spec to not require that multiple attempts to set/register the global Provider override the existing Provider.

pellared commented 1 year ago

If I understand correctly then the same "issue" is with TraceProvider and LoggerProvider.
https://github.com/open-telemetry/opentelemetry-specification/blob/025f8fc1a12dda35e42a25c96a0ca4e2c52e34c8/specification/metrics/api.md?plain=1#L62-L64 I like that the Metrics API spec defines the MeterProvider as a component instead of a class as it makes it more open to interpretation (and more abstract) so that the languages can better adopt it to their ecosystem. Therefore in my eyes Erlang and .NET are complaint. Personally, I propose to use the same wording in Trace API and Logs API spec.
How about just leaving a suggestion on how the the Get/Set could be implemented? E.g. Get returns delegating noop provider? If no Set was called and set logs warning if there current provider was NOT a noop provider. Take notice that Get and Set is a abstract thing e.g. it can be done by Erlang process registry or .NET Diagnostics package.

MrAlias commented 1 year ago

Get returns noop if no Set was called

Why not a delegating provider?

pellared commented 1 year ago

Why not a delegating provider?

That's what I meant 😄 Sorry for lack of precision.

(edited my comment)

open-telemetry / opentelemetry-specification

Global MeterProvider Get/Set API is under-defined #3068