open-telemetry / opentelemetry-collector

OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0
4.33k stars 1.44k forks source link

Proposal: split signal-aware packages into per-signal subpackages #10207

Closed dmathieu closed 2 months ago

dmathieu commented 4 months ago

Is your feature request related to a problem? Please describe.

As we're planning the addition of profiles into the collector, we will need to add new interfaces into several packages (consumer, receiver, processor, connector). Because the profiles signals is experimental, we would like to avoid making changes to those APIs that would be making them unstable.

Describe the solution you'd like

Split each of the packages that have interfaces specific to signals into their own module/package. We would start with consumer, as that API is the closest to 1.0. But each package that has a signal-specific implementation would be impacted.

We could do this two ways:

With this change, any new signal added in the future could much more easily implement its own interfaces even though the main packages are stable.

Describe alternatives you've considered

The alternatives could be to:

dmathieu commented 4 months ago

cc @mx-psi

dmathieu commented 4 months ago

If we do this, there will be the question of deprecating/removing the current interfaces. Since those packages haven't reached 1.0, it should be doable.

djaglowski commented 4 months ago

How would this work for connectors, which depend on pairs of signal types?

dmathieu commented 4 months ago

It looks like connectors have a specific stability per signal/signal (https://github.com/open-telemetry/opentelemetry-collector/blob/main/connector/connector.go#L108-L118). So maybe that module could remain as it is. I don't know how it would work once it reaches 1.0 though.

mx-psi commented 4 months ago

For consumer, I think it should be easy to do the split and that we should go ahead with it. That way we avoid changing import paths once we move profiles to stable. I think it is also more pressing to decide this for consumer, since it's the one that is closest to 1.0.

For the component types, I think we have more time to think about those, so it may be good if the profiling folks can do an RFC for this part or something similar. There also seem to be several independent pieces that we want to answer here:

mx-psi commented 4 months ago

On the 2024-05-22 meeting, we discussed this but reached no consensus. A summary of the discussion:

How to move from here:

bogdandrutu commented 4 months ago

@mx-psi some things to discuss/understand:

  1. Since any experimental signal will eventually end in the service package, does it mean service will never be stable API? Does it mean that our whole collector will not be stable? I think we have to adopt something like gRPC, even though I understand that is not ideal, is the only option we have for things like service, unless we are willing to copy code and have also an "experimentalservice", and go deeper.
  2. What is so bad of asking users to do a import rename for something experimental, the idea of experimental is to not be in this state for a long time, and ideally not too many external dependencies are taking a dependency on this.
  3. Profile data are extremely experimental in my opinion, we haven't yet decided that this is a "good" to try version in the proto repo (see https://github.com/open-telemetry/opentelemetry-proto/pull/559). Because of this I am not sure we should rush to add it yet.
dmathieu commented 4 months ago

Because of this I am not sure we should rush to add it yet.

The reason I am working on this is that one of the requirements from the profiling SIG to accept the Elastic profiling agent donation is to have it running as a collector receiver. We can't do that unless the collector has the proper interfaces. Also, while I agree the proto is still extremely experimental, the pdata is already within the collector (which is the part that could have breaking changes), under its own module so it has its own versioning model until it becomes stable.

Also, being able to receiver/forward/store profiles in OTLP is going to be a big part in validating that protocol.

The goal of this refactoring here is to make it easier to have experimental/beta packages within the collector, which would be very hard if we have to modify stable package APIs.

mx-psi commented 4 months ago

Since any experimental signal will eventually end in the service package, does it mean service will never be stable API? Does it mean that our whole collector will not be stable? I think we have to adopt something like gRPC, even though I understand that is not ideal, is the only option we have for things like service, unless we are willing to copy code and have also an "experimentalservice", and go deeper.

@bogdandrutu

I think the two goals that we should have here are:

  1. A stable module does not have references to experimental symbols in its public API (other than pdata?)
  2. You can only use experimental behavior if you have to have enabled it explicitly (e.g. by using an unstable module, a feature gate, a build tag...)

Assuming we solve for these two at the lower levels, for Service I can think of a few alternatives that are not the grpc-go approach:

  1. Having two Service constructors (click to expand) `service` package (stable, no profiles support) ```go type Service = internal.Service type Config = internal.Config func New(...) (*Service, error) { return internal.NewService(...) } ``` `profileservice` package (drop-in replacement in terms of symbol names, experimental) ```go type Service = internal.Service // interchangeable with service.Service type Config struct { internal.Config // Extra profiles stuff goes here } func New(...) (*Service,error) { return internal.NewService(..., WithProfiles()) } ```
  2. Have a single package, but make service.New return an error if you use any profiles related functionality without enabling a dedicated feature gate. We would have to adda Profiles field to the pipelines struct and commit to supporting this (I feel like that's fine)
  3. Have an unstable Setting for enabling profiles (click to expand) This works better with the `Option`s pattern. For example, on `exporte` we would do something like: In package `exporter` (stable): ```go func NewFactory(..., options ...FactoryOption) Factory ``` In another experimental package: ```go func WithProfiles(...) exporter.FactoryOption ``` That way you have to explicitly ask for the functionality by importing an experimental module. In `Service`, this would mean either a refactor, or some field in `service.Settings` that you can only fill in it using an experimental package.
  4. Have all functionality related to profiles under a build tag

Do any of these work for you? Ultimately, I am more concerned about modules below service, since those are the ones that I would expect to cause more trouble with the builder in terms of build errors, so I am not entirely against just labeling the fields as experimental at the service level.

What is so bad of asking users to do a import rename for something experimental, the idea of experimental is to not be in this state for a long time, and ideally not too many external dependencies are taking a dependency on this.

Not a big deal, I think the two points above are more important. The avoiding renames part is a nice-to-have.

mx-psi commented 4 months ago

Sorry for the delay, I discussed this with @dmathieu already in private, but so that it is more 'official' and everybody else can participate, this is what the Collector Stability Working Group discussed on 2024-05-29:

  1. We will not do refactors that split signal-aware packages into per-signal subpackages. A separate package or packages for experimental signals are okay. Making a module signal-agnostic may be okay (specially if it's not close to 1.0).
  2. We want the profiling work to be independent from 1.0 work and not slow it down. We should be able to stabilize 1.0 modules if they are ready for all other signals. We want the Profiling SIG to explore how exactly to do this, specially for service (see above discussion). The two most popular alternatives (other may be suggested) are:
    • Using separate modules for symbols related to experimental signals (e.g. see #10253)
    • Marking as experimental using comments in the style of grpc-go
      1. We want to have a full picture of any proposal before embarking on it.

As next steps, (speaking personally for this part), I would like to see a design document that lays out the proposal from the Profiling SIG on how to do this, possibly supported by a PoC like the one @dmathieu is working on over at #10307. We don't have a formal structure for Collector-specific design documents but you can check this as a recent example. We don't need to figure out every detail on the RFC; the aim should be to have a high level vision of the proposal in a way that allows us to consider pros and cons.

mx-psi commented 2 months ago

Closed by #10375, we decided not to split by signal but instead do separate modules for experimental signals only