Proposal: split signal-aware packages into per-signal subpackages

dmathieu commented 4 months ago

Is your feature request related to a problem? Please describe.

As we're planning the addition of profiles into the collector, we will need to add new interfaces into several packages (consumer, receiver, processor, connector). Because the profiles signals is experimental, we would like to avoid making changes to those APIs that would be making them unstable.

Describe the solution you'd like

Split each of the packages that have interfaces specific to signals into their own module/package. We would start with consumer, as that API is the closest to 1.0. But each package that has a signal-specific implementation would be impacted.

We could do this two ways:

A top-level package per signal, with subpackages per feature:traces/consumer, traces/receiver.
A top-level package per feature, with subpackages per signal: consumer/traces, receiver/traces.

With this change, any new signal added in the future could much more easily implement its own interfaces even though the main packages are stable.

Describe alternatives you've considered

The alternatives could be to:

Put the new profiles interfaces directly in the existing packages, which may cause API stability issues.
Create new packages for the experimental profiles packages, and then move them into the main one when they become stable, which would force folks to update their import paths.

dmathieu commented 4 months ago

cc @mx-psi

dmathieu commented 4 months ago

If we do this, there will be the question of deprecating/removing the current interfaces. Since those packages haven't reached 1.0, it should be doable.

djaglowski commented 4 months ago

How would this work for connectors, which depend on pairs of signal types?

dmathieu commented 4 months ago

It looks like connectors have a specific stability per signal/signal (https://github.com/open-telemetry/opentelemetry-collector/blob/main/connector/connector.go#L108-L118). So maybe that module could remain as it is. I don't know how it would work once it reaches 1.0 though.

mx-psi commented 4 months ago

For consumer, I think it should be easy to do the split and that we should go ahead with it. That way we avoid changing import paths once we move profiles to stable. I think it is also more pressing to decide this for consumer, since it's the one that is closest to 1.0.

For the component types, I think we have more time to think about those, so it may be good if the profiling folks can do an RFC for this part or something similar. There also seem to be several independent pieces that we want to answer here:

Should we split the <componentkind>.Factory? (leaning towards no here since it's too much work for only a small benefit, but we need to figure out the way of adding profiles support, maybe with an optional interface?)
Should we split the <componentkind>.Builder struct? To be honest here, maybe we should not have Builder at all as part of the public API, this seems mostly only useful for otelcol/service.
Should we split the signal-specific structs into signal-specific packages? This one seems easier to do, and it would lead to a smoother transition from pre-1.0 profiles to post-1.0 profiles

mx-psi commented 4 months ago

On the 2024-05-22 meeting, we discussed this but reached no consensus. A summary of the discussion:

There is no consensus on whether we should split this or not. The main point of contention is whether this would actually matter if the only experimental thing here is the pprofile struct and we promise not to break anything else.
- There is no good way to label experimental symbols in Go, but there are libraries that do this in the wild and there is at least one case where this broke the ecosystem quite hard (see https://github.com/open-telemetry/opentelemetry-collector/issues/4832#issuecomment-1039394229 for details)
- We can always do 2.0 if we need to (but this is costly in terms of maintenance)
Consistency is an aspect to take into consideration here, if we split things we should find a holistic solution before and have some idea of what to do before making changes
- We are already inconsistent in that we split by signal in pdata, but not in other parts of the Collector codebase. It's unclear whether this inconsistency is justified or not.
- We need to figure out what to do with connectors specifically, since it seems to be a contentious case for consistency
Deciding on this blocks 1.0, and specifically blocks consumer 1.0

How to move from here:

Figure out what kinds of changes could cause breakage
Make a concrete proposal (draft PR? RFC?) of the split if we want to split

bogdandrutu commented 4 months ago

@mx-psi some things to discuss/understand:

Since any experimental signal will eventually end in the service package, does it mean service will never be stable API? Does it mean that our whole collector will not be stable? I think we have to adopt something like gRPC, even though I understand that is not ideal, is the only option we have for things like service, unless we are willing to copy code and have also an "experimentalservice", and go deeper.
What is so bad of asking users to do a import rename for something experimental, the idea of experimental is to not be in this state for a long time, and ideally not too many external dependencies are taking a dependency on this.
Profile data are extremely experimental in my opinion, we haven't yet decided that this is a "good" to try version in the proto repo (see https://github.com/open-telemetry/opentelemetry-proto/pull/559). Because of this I am not sure we should rush to add it yet.

dmathieu commented 4 months ago

Because of this I am not sure we should rush to add it yet.

The reason I am working on this is that one of the requirements from the profiling SIG to accept the Elastic profiling agent donation is to have it running as a collector receiver. We can't do that unless the collector has the proper interfaces. Also, while I agree the proto is still extremely experimental, the pdata is already within the collector (which is the part that could have breaking changes), under its own module so it has its own versioning model until it becomes stable.

Also, being able to receiver/forward/store profiles in OTLP is going to be a big part in validating that protocol.

The goal of this refactoring here is to make it easier to have experimental/beta packages within the collector, which would be very hard if we have to modify stable package APIs.

mx-psi commented 4 months ago

Since any experimental signal will eventually end in the service package, does it mean service will never be stable API? Does it mean that our whole collector will not be stable? I think we have to adopt something like gRPC, even though I understand that is not ideal, is the only option we have for things like service, unless we are willing to copy code and have also an "experimentalservice", and go deeper.

@bogdandrutu

I think the two goals that we should have here are:

A stable module does not have references to experimental symbols in its public API (other than pdata?)
You can only use experimental behavior if you have to have enabled it explicitly (e.g. by using an unstable module, a feature gate, a build tag...)

Assuming we solve for these two at the lower levels, for Service I can think of a few alternatives that are not the grpc-go approach:

Having two Service constructors (click to expand)
`service` package (stable, no profiles support) ```go type Service = internal.Service type Config = internal.Config func New(...) (*Service, error) { return internal.NewService(...) } ``` `profileservice` package (drop-in replacement in terms of symbol names, experimental) ```go type Service = internal.Service // interchangeable with service.Service type Config struct { internal.Config // Extra profiles stuff goes here } func New(...) (*Service,error) { return internal.NewService(..., WithProfiles()) } ```
Have a single package, but make service.New return an error if you use any profiles related functionality without enabling a dedicated feature gate. We would have to adda Profiles field to the pipelines struct and commit to supporting this (I feel like that's fine)
Have an unstable Setting for enabling profiles (click to expand)
This works better with the `Option`s pattern. For example, on `exporte` we would do something like: In package `exporter` (stable): ```go func NewFactory(..., options ...FactoryOption) Factory ``` In another experimental package: ```go func WithProfiles(...) exporter.FactoryOption ``` That way you have to explicitly ask for the functionality by importing an experimental module. In `Service`, this would mean either a refactor, or some field in `service.Settings` that you can only fill in it using an experimental package.
Have all functionality related to profiles under a build tag

Do any of these work for you? Ultimately, I am more concerned about modules below service, since those are the ones that I would expect to cause more trouble with the builder in terms of build errors, so I am not entirely against just labeling the fields as experimental at the service level.

What is so bad of asking users to do a import rename for something experimental, the idea of experimental is to not be in this state for a long time, and ideally not too many external dependencies are taking a dependency on this.

Not a big deal, I think the two points above are more important. The avoiding renames part is a nice-to-have.

mx-psi commented 4 months ago

Sorry for the delay, I discussed this with @dmathieu already in private, but so that it is more 'official' and everybody else can participate, this is what the Collector Stability Working Group discussed on 2024-05-29:

We will not do refactors that split signal-aware packages into per-signal subpackages. A separate package or packages for experimental signals are okay. Making a module signal-agnostic may be okay (specially if it's not close to 1.0).
We want the profiling work to be independent from 1.0 work and not slow it down. We should be able to stabilize 1.0 modules if they are ready for all other signals. We want the Profiling SIG to explore how exactly to do this, specially for service (see above discussion). The two most popular alternatives (other may be suggested) are:
- Using separate modules for symbols related to experimental signals (e.g. see #10253)
- Marking as experimental using comments in the style of grpc-go
  1. We want to have a full picture of any proposal before embarking on it.

As next steps, (speaking personally for this part), I would like to see a design document that lays out the proposal from the Profiling SIG on how to do this, possibly supported by a PoC like the one @dmathieu is working on over at #10307. We don't have a formal structure for Collector-specific design documents but you can check this as a recent example. We don't need to figure out every detail on the RFC; the aim should be to have a high level vision of the proposal in a way that allows us to consider pros and cons.

mx-psi commented 2 months ago

Closed by #10375, we decided not to split by signal but instead do separate modules for experimental signals only

open-telemetry / opentelemetry-collector

Proposal: split signal-aware packages into per-signal subpackages #10207