open-telemetry / opentelemetry-collector

OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0
4.43k stars 1.46k forks source link

Append components flag #8754

Open djaglowski opened 1 year ago

djaglowski commented 1 year ago

Is your feature request related to a problem? Please describe. When merging configurations, it would be helpful to be able to do the following:

For example, consider the following configs.

# main.yaml
receivers:
  otlp/in:
processors:
  batch:
exporters:
  otlp/out:
extensions:
  file_storage:

service:
  pipelines:
    traces:
      receivers: [ otlp/in ]
      processors: [ batch ]
      exporters: [ otlp/out ]
  extensions: [ file_storage ]
# extra_receiver.yaml
receivers:
  foo:

service:
  pipelines:
    traces:
      receivers: [ foo ]
# extra_extension.yaml
extension:
  bar:

service:
  extensions: [ bar ]

If a user provides the above configs in the same order, it will clobber service::pipelines::traces::receivers and service::extensions, effectively removing the otlp/in receiver and file_storage extension from the configuration.

Describe the solution you'd like I propose that these lists are components are a special case and that we should provide the option to append to them, rather than clobber them.

Specifically I propose that we add a flag to the collector, e.g. --append-components or --merge-components-mode=append, which will enable this behavior when merging configs.

Describe alternatives you've considered The same idea could be applied at a more granular level. A simple option would be to have a similar series of flags, e.g. --append-receivers, ...

Additional context Several recent SIG meetings have included some discussion of idea along these lines. Additionally, some issues may be related.

pavolloffay commented 8 months ago

+1 on the proposal.

Are there any updates on this?

tigrannajaryan commented 6 months ago

Specifically I propose that we add a flag to the collector, e.g. --append-components or --merge-components-mode=append, which will enable this behavior when merging configs.

Why does this need to be an option? Is there a case where appending the pipeline lists is undesirable?

Also, the list-appending logic should be name-aware so that if the same-named component is in both input lists it only appears once in the destination list.

djaglowski commented 6 months ago

Why does this need to be an option? Is there a case where appending the pipeline lists is undesirable?

I'm not sure if there are cases where overwriting the list is actually preferred but it is the current behavior. Therefore, I proposed this as opt-in behavior in order to respect backwards compatibility.

If there is consensus, we could just implement this with a feature gate. This would be equivalent to the initial proposal while the gate is alpha. Then when we move it to beta we'd presumably hear if anyone is relying on the current behavior, while still leaving them a way to opt out.

Also, the list-appending logic should be name-aware so that if the same-named component is in both input lists it only appears once in the destination list.

Agreed

VihasMakwana commented 2 months ago

@djaglowski are we actively seeking contributions for this enhancement? I have encountered this limitations many times while using multiple configs.

djaglowski commented 2 months ago

@VihasMakwana, IMO it would be a nice contribution but I would suggest that the @open-telemetry/collector-maintainers should weigh in on this.

andrzej-stencel commented 2 months ago

In my opinion the possibility of replacing lists (as it works today) is still valuable and I don't want to remove it by only providing the possibility to append (add) components to lists.

Replacing is valuable in customization of pipelines that may have some default behavior for users to modify. Examples:

I agree the possibility to append (add) components to lists is valuable. Ideally we should come up with a solution that makes it possible to do both/either, even in the same configuration file. User might want to append to the list of extensions, but replace the list of receivers/processors/exporters in a pipeline. Or append to receivers in one pipeline, but replace the list of receivers in another pipeline.

One simplistic solution that would allow this is to add new list properties like add_extensions, add_receivers, add_processors, add_exporters beside extensions, receivers, processors, exporters. If the user specifies add_extensions, the extensions listed in it would be added to the existing list. If the user specifies extensions, existing behavior is used (replacing the list). User can only specify either one or the other in a single configuration file. I'm not sure if this would be easy to implement or if it wouldn't make the configuration too complicated for users. Example:

base.yaml:

service:
  pipelines:
    logs/1:
      receivers:
        - otlp
      processors: [...]
      exporters:
        - otlp
    metrics/1:
      receivers:
        - otlp
      processors: [...]
      exporters:
        - otlp

user.yaml:

service:
  pipelines:
    logs/1:
      receivers:
        - otelarrow
      add_exporters:
        - otelarrow
    metrics/1:
      add_receivers:
        - prometheus
      exporters:
        - debug

The effective configuration from running otelcol-contrib --config=base.yaml --config=user.yaml would be:

service:
  pipelines:
    logs/1:
      receivers:
        - otelarrow
      processors: [...]
      exporters:
        - otlp
        - otelarrow
    metrics/1:
      receivers:
        - otlp
        - prometheus
      processors: [...]
      exporters:
        - debug
VihasMakwana commented 2 months ago

@andrzej-stencel I agree with your perspective. Both merging strategies have their merits.

Ultimately, the choice should be left to the user, either through a configuration setting or a feature gate. Additionally, the existing merging strategy should continue to be the default option.


It seems we can leverage the existing merging logic from this snippet used in contrib repo, which currently handles the merging of extensions. However, it should be modified to be name-aware.

We can extend this approach to also handle the merging of processors, receivers, and exporters in a similar manner.

I request @open-telemetry/collector-maintainers to provide their feedback on this issue.