open-telemetry / opentelemetry-collector

OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0
4.52k stars 1.48k forks source link

Experimental API in stable modules #11586

Open dmitryax opened 4 weeks ago

dmitryax commented 4 weeks ago

Given that we have a resolution of https://github.com/open-telemetry/opentelemetry-specification/issues/4257 for OTel libraries, this issue is intended to make a decision on how to approach the same problem in OTel Collector.

We have a few 1.x modules in the collector that we need to keep working on and adding features. It isn't always possible to add those features as stable from day 1. Even if it potentially can be possible to move those features to different 0.x modules, there are several problems associated with this approach:

  1. There has to be complexity overhead to maintain keep them in a separate module;
  2. Once stabilized, they must be moved back, which will be an unnecessary breaking change for users who opt-in to try an experimental feature.

A particular example is the new entity data type. Potentially, it could be possible to start introducing it the same way it was done for profiling. But there is one important difference: the new entity signal also introduces a new field to stable resource message, which is reflected in the stable pdata/pcommon module, see new method in https://github.com/open-telemetry/opentelemetry-collector/pull/11282/files#diff-20c7903a84235f237b5170d75351abbd6c2bb7069aff946e369a01c6e36022cf. That field will reference an experimental field in the stable resource protobuf message. If we use pdatagen as is, this will be added as an extra method to the Resource struct, but that method has to be marked as experimental in a 1.x module. It may be possible to move that method to another experimental module as a function, but it's significantly more complicated and not intuitive API compared to other auto-generated pdata API. As mentioned before, this will also have to be moved back to the Resource struct once it's stabilized, which is another breaking change.

The suggestion is to introduce the same rule for the Collector as decided for OTel libraries in https://github.com/open-telemetry/opentelemetry-specification/pull/4270 and allow having experimental APIs in stable modules.

Looking for feedback from @open-telemetry/collector-approvers.

cc @tigrannajaryan

mx-psi commented 3 weeks ago

The suggestion is to introduce the same rule for the Collector as decided for OTel libraries in https://github.com/open-telemetry/opentelemetry-specification/pull/4270 and allow having experimental APIs in stable modules.

I don't think this spec PR allows having experimental API in stable modules (or at least, that was not my interpretation when I approved it :smile:). It does so with some important restrictions, that I think are just not possible in Go.

Even if it potentially can be possible to move those features to different 0.x modules, there are several problems associated with this approach [...]

I agree that separating APIs has problems, I however think the downsides, while unlikely, are bigger with not following semver on a stable module (see https://github.com/open-telemetry/opentelemetry-specification/pull/4270#discussion_r1812101352 for the most recent iteration of this point). If we are trying (or want to eventually try) to build an ecosystem around the Collector Go APIs (e.g. helpers for building components) I think it's important to follow the expectations of Go tooling, which are to follow semver.

For adding a single field, I would be okay with an experimental field with the understanding that it will be there forever. If we want to change the name, we would deprecate the old name and keep support for it forever. If we want to change its type, we would have to change it's name as well.

This may be a good candidate for an RFC (and possibly for a vote if there is no agreement). We already wanted to have an RFC/docs about experimental modules (see #11436), so we can solve both things.

tigrannajaryan commented 3 weeks ago

I don't think this spec PR allows having experimental API in stable modules (or at least, that was not my interpretation when I approved it 😄).

The spec does not talk about modules. It says that languages must be able to add unstable methods to stable signals. For example Trace signal is stable. It must be possible to a new method in Trace API, such that the method is marked in "Development". "Development" methods of course can be changed in breaking ways in the spec and can be removed.

How exactly languages do it and whether signal=module is a language concern.

If this is not clear in the spec then we should clarify further.

TylerHelmuth commented 3 weeks ago

If we want to change the name, we would deprecate the old name and keep support for it forever. If we want to change its type, we would have to change it's name as well.

We wouldn't have to keep it around forever right, only until the next breaking change?

mx-psi commented 3 weeks ago

How exactly languages do it and whether signal=module is a language concern.

Thanks for clarifying, agreed, my wording was not the best

If we want to change the name, we would deprecate the old name and keep support for it forever. If we want to change its type, we would have to change it's name as well.

We wouldn't have to keep it around forever right, only until the next breaking change?

Until the next major version, yup

tigrannajaryan commented 3 weeks ago

Please keep in mind the spec says the additional unstable methods to stable APIs should be opt-in. Anybody who opts in should expect it to break. This makes things simpler and allows to maintain semver on stable portion while having an unstable portion that can change and is not bound by stable semver guarantees.

For example in Go if you use interfaces you probably can do the following.

This is in stable module github.com/tigrannajaryan/goapi/stable, 1.0.0:

package stable

type MyStableAPI interface {
    StableMethod()
}

func GetMyStableAPI() MyStableAPI {
    return &myAPIImpl{}
}

type myAPIImpl struct {}

func (*myAPIImpl) StableMethod() {}

// This is not visible unless you opt-in.
func (*myAPIImpl) UnstableMethod() {}

This is in unstable module github.com/tigrannajaryan/goapi/unstable, 0.0.1:

package unstable

type MyApiUnstableExtended interface {
    UnstableMethod()
}

Here is how you use it, with opt-in:

import (
    "github.com/tigrannajaryan/goapi/stable"
    "github.com/tigrannajaryan/goapi/unstable"
)

func main() {
    api := stable.GetMyStableAPI()

    // Opt-in into unstable method.
    eapi := api.(unstable.MyApiUnstableExtended)
    eapi.UnstableMethod()
}

This allows you to easily remove or rename UnstableMethod and remove the entire MyApiUnstableExtended if you want without breaking the semver promise on github.com/tigrannajaryan/goapi/stable.

mx-psi commented 3 weeks ago

@tigrannajaryan This is an approach we have taken and one that I even have a PR to document it on #11482 :) I think this is the kind of approach we should follow whenever possible, my assumption was that we couldn't on this particular case.

dmitryax commented 3 weeks ago

We discussed the issue on Monday during the Collector stability SIG:

In the meantime, I can try to update https://github.com/open-telemetry/opentelemetry-collector/pull/11282 in a way that doesn't introduce any new methods to stable pdata/pcommon module to see how much extra effort it requires to maintain the API and how much change will be required from the API user to apply once entities stabilize.

mx-psi commented 1 week ago

I think we can adopt the same stance open-telemetry/opentelemetry-proto/pull/597: we will deprecate the field if it turns out it is not needed