open-telemetry / opentelemetry-specification

Specifications for OpenTelemetry
https://opentelemetry.io
Apache License 2.0
3.76k stars, 890 forks

Stabilize Logger.Enabled #4208

Open pellared opened 2 months ago

pellared commented 2 months ago

Stabilize Logger.Enabled API

Blockers:

pellared commented 2 months ago

Question to @open-telemetry/technical-committee: Do we want to stabilize the Logger.Enabled API sooner than we stabilize the spec defining how the SDK implements it? Or do we want to stabilize Enabled for the API and the SDK at the same time?

cijothomas commented 2 months ago

Question to @open-telemetry/technical-committee: Do we want to stabilize the Logger.Enabled API sooner than we stabilize the spec defining how the SDK implements it? Or do we want to stabilize Enabled for the API and the SDK at the same time?

Same for Metrics too: https://github.com/open-telemetry/opentelemetry-specification/pull/4219/files#r1767789558

pellared commented 1 month ago

The lack of stabilization of the Logger.Enabled API blocks stabilization of OTel Go Logs. The Logger.Enabled API is required for bridging the most popular Go logging libraries (including slog from the Go standard library).

From the OTel Go perspective, the SDK support can be experimental. See: https://pkg.go.dev/go.opentelemetry.io/otel/sdk/log/internal/x.

This is currently the only known blocker for stabilizing the OTel Go Logs.

pellared commented 1 month ago

@open-telemetry/technical-committee, are you able to revalidate if the issues listed as blockers are still seen as blockers or if they can be addressed after stabilization of Logger.Enabled in Logs Bridge API?

Personally, I think the main blocker is to have at least 3 prototypes of the API in different languages.

tigrannajaryan commented 1 month ago

To clarify the process: we expect 3 prototypes in 3 different languages that can be used by the end users, so that they can try the feature, provide feedback, submit bugs and issues about it. This is a necessary process before the spec section is marked "Stable".

From this perspective, a PR does not count as a prototype since it is not easily usable by end users. A PR is fine for proposing new experimental features and demonstrating how they would work, but it is not enough for stabilizing the spec.

The lack of stabilization of Logger.Enabled API blocks stabilization of OTel Go Logs. Logger.Enabled API is required for bridging most popular Go logging libraries (including slog from the Go standard library).

@pellared you either need to find a way to have unstable APIs in Go or wait until other languages implement the prototypes. Either way, the ability to have unstable APIs is very valuable, and this is likely to come up again as OTel evolves and we keep adding new experimental APIs to existing signals.

--

As a side note: I encourage using maturity levels between "Development" and "Stable" to signal an increasing level of confidence in the capability (both in the spec and the SDK). For example, if we have 1-2 prototypes, then we can move the maturity level of the feature from "Development" to "Alpha" or "Beta" to signal that it is moving closer to the "Stable" state.

pellared commented 1 month ago

OK. I see we need 3 different languages to have it released as an experimental API.

Here is how the experimental Logger.Enabled API is currently defined in 3 languages:

I will do my best to work on this with others to move this forward (as we have inconsistencies).

@pellared you either need to find a way to have unstable APIs in Go

All major log bridges need it, so it does not even make sense to stabilize the rest, as the Logs API would not be usable in the Go ecosystem. From https://github.com/open-telemetry/opentelemetry-specification/issues/3917:

Multiple logging libraries in Go provide this optimization^1^3. If the Go SIG is going to be able to support these critical logging systems we need this functionality in the Logs Bridge API.

or wait until other languages implement the prototypes.

We then need to wait for other languages to add it.

As a side note: I encourage using maturity levels between "Development" and "Stable" to signal increasing level of confidence in the capability (both in the spec and the SDK)

I am not sure if we can use Alpha and Beta in the spec documents as these are not defined in https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/document-status.md#lifecycle-status.

tigrannajaryan commented 1 month ago

I am not sure if we can use Alpha and Beta in the spec documents as these are not defined in https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/document-status.md#lifecycle-status.

We can bring as many levels from OTEP 0232 to the spec as we believe are useful. I started with 3, but we can bring more if we feel there is value. I personally think it can be valuable to have more granularity between Development (the most immature) and Stable (the most mature). It is an important signal, and I think a binary value is not nuanced enough for it. Stabilization is a process, often a long one at OTel. As you move along that process, it is important to indicate the progress by updating the level labels.

MrAlias commented 1 month ago

From this perspective, a PR does not count as a prototype since it is not easily usable by end users. A PR is fine for proposing new experimental features and demonstrating how they would work, but it is not enough for stabilizing the spec.

FWIW this is a change in policy. Many features have been stabilized that relied on "Go implementations" which were just PRs.

I'm not sure it is fair to make this change in policy in such an ad hoc manner.

tigrannajaryan commented 1 month ago

@MrAlias my post is a result of a discussion by a few TCs while triaging this issue, so it is not an official policy change yet.

I tried finding the current policy but couldn't. This document does not seem to have an opinion about what criteria must be met before spec stabilization.

If anyone is aware of where we state how many prototypes are needed, please post the link.

If the policy does not exist in written form, or we need to modify it, I will create an issue so that we can discuss and formalize it.

Let's keep this issue open for now so that we can apply consistent rules after we clarify what the rules are.

tigrannajaryan commented 1 month ago

I am gonna move this back to TC inbox.

jack-berg commented 1 month ago

Removed from TC inbox. The prototype requirement is being separately tracked, and there are other blockers preventing stability.

jpkrohling commented 3 weeks ago

Labeling with follow-up, as I understand we need the 3 implementations before this can be declared stable.

trask commented 3 weeks ago

removing triage:followup since the automation will automatically add it back 2 weeks after additional comments / ref links