open-telemetry / opentelemetry-collector

OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

Attributes for component instancing #11179

Closed by djaglowski 1 month ago

djaglowski commented 1 month ago

Is your feature request related to a problem? Please describe.

Component instancing is complicated. Since it seems unlikely that we will change this prior to 1.0, I want to raise the broader discussion of how the collector's own telemetry should be attributed to component instances.

First, why should we ascribe telemetry to component instances rather than to component configurations? As an observability tool, the collector should be a good exemplar of an observable system. If we cannot accurately represent the internal state of the collector, I believe we will have failed to accomplish this goal.

In practice, what is most important is that instances derived from the same configuration may have very different states.

Describe the solution you'd like

Many components use global state to override the default instancing behavior (e.g. sharedcomponent or similar), but to keep things simple, I will first propose an attribution schema which ignores this problem.

Ignoring shared instances

The following schema provides sufficient information to ascribe telemetry to specific instances. Since each kind of pipeline component (receivers, processors, exporters, connectors) has its own instancing rules, there are separate sets of attributes proposed for each kind.
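To make "each kind has its own instancing rules" concrete, here is a minimal sketch that enumerates the instances a configuration would produce. It assumes the default rules, which are not spelled out in this issue: receivers and exporters are instanced per signal type, and processors per pipeline. Connectors and shared instances are left out. The `PIPELINES` layout and the `instances` helper are hypothetical, not collector APIs.

```python
# Hypothetical pipeline layout: pipeline ID -> component IDs per kind.
# Pipeline IDs are assumed to follow the "<signal>[/<name>]" convention.
PIPELINES = {
    "logs": {"receivers": ["otlp"], "processors": ["batch"], "exporters": ["debug"]},
    "logs/2": {"receivers": ["otlp"], "processors": ["batch"], "exporters": ["debug"]},
    "metrics": {"receivers": ["otlp"], "processors": ["batch"], "exporters": ["debug"]},
}


def signal_of(pipeline_id):
    """Return the signal type encoded in a pipeline ID."""
    return pipeline_id.split("/", 1)[0]


def instances(pipelines):
    """Enumerate instance keys per component kind (assumed default rules)."""
    found = {"receiver": set(), "processor": set(), "exporter": set()}
    for pipeline_id, components in pipelines.items():
        signal = signal_of(pipeline_id)
        for receiver_id in components["receivers"]:
            # Assumed rule: one receiver instance per (ID, signal type).
            found["receiver"].add((receiver_id, signal))
        for processor_id in components["processors"]:
            # Assumed rule: one processor instance per (ID, pipeline).
            found["processor"].add((processor_id, pipeline_id))
        for exporter_id in components["exporters"]:
            # Assumed rule: one exporter instance per (ID, signal type).
            found["exporter"].add((exporter_id, signal))
    return found


result = instances(PIPELINES)
for kind, keys in result.items():
    print(kind, sorted(keys))
```

Under these assumptions, a single `batch` configuration yields three processor instances while a single `otlp` configuration yields two receiver instances, which is why one set of attributes cannot serve every kind.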

Receivers

Processors

Note: otel.signal is redundant because pipeline contains the same info. Is it worth keeping for consistency with receivers and exporters, even if it is not consistent with connectors?

Exporters

Connectors

Note: otel.signal as used in the other component kinds is insufficient even for metrics which describe incoming or outgoing telemetry. This is because multiple instances of a connector may have the same type of incoming or outgoing telemetry. Moreover, the set of attributes should be sufficient for all telemetry describing the instance. This leaves two options:

  1. Separate incoming and outgoing attributes:

    • connector: The component ID
    • otel.incoming_signal: logs, metrics, or traces
    • otel.outgoing_signal: logs, metrics, or traces
  2. otel.signal with special values:

    • connector: The component ID
    • otel.signal: logs->logs, logs->metrics, logs->traces, metrics->logs, metrics->metrics, etc
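The difference between the two options and a plain single-valued otel.signal can be sketched as below. This is only an illustration: the connector ID `example`, the helper names, and the choice to put the incoming signal into the single-valued variant are all assumptions.

```python
def single_signal(connector_id, incoming, outgoing):
    # otel.signal as used for other kinds; here it carries only the incoming
    # signal (an assumption made for illustration).
    return {"connector": connector_id, "otel.signal": incoming}


def option_1(connector_id, incoming, outgoing):
    # Option 1: separate incoming and outgoing attributes.
    return {
        "connector": connector_id,
        "otel.incoming_signal": incoming,
        "otel.outgoing_signal": outgoing,
    }


def option_2(connector_id, incoming, outgoing):
    # Option 2: otel.signal with special "incoming->outgoing" values.
    return {"connector": connector_id, "otel.signal": f"{incoming}->{outgoing}"}


def distinct(attribute_sets):
    """Count how many attribute sets are distinguishable from one another."""
    return len({tuple(sorted(attrs.items())) for attrs in attribute_sets})


# Two instances of the same hypothetical connector, both consuming logs.
pairs = [("logs", "metrics"), ("logs", "logs")]

collisions = distinct([single_signal("example", i, o) for i, o in pairs])
separate = distinct([option_1("example", i, o) for i, o in pairs])
combined = distinct([option_2("example", i, o) for i, o in pairs])
print(collisions, separate, combined)
```

With a single-valued otel.signal both instances collapse into one attribute set, while either option keeps them apart.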

Acknowledging shared instances

As far as I am aware, all cases of shared instances are implemented using a singleton pattern. The following schema assumes that this is universally the case.

Receivers

Processors

Exporters

Connectors

  1. Separate incoming and outgoing attributes:

    • connector: The component ID
    • otel.incoming_signal: logs, metrics, traces, OR ALL
    • otel.outgoing_signal: logs, metrics, traces, OR ALL
  2. otel.signal with special values:

    • connector: The component ID
    • otel.signal: logs->logs, logs->metrics, logs->traces, metrics->logs, metrics->metrics, etc, OR ALL
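As a rough sketch of what the singleton assumption means for option 2: a connector that overrides instancing with a singleton reports one attribute set with the special value ALL, while a connector using default instancing reports one set per signal pair. The helper name and the connector ID `example` are hypothetical.

```python
def connector_attribute_sets(connector_id, pairs, shared_singleton):
    """Return the attribute sets that would describe a connector's instances."""
    if shared_singleton:
        # One shared instance serves every pair, so a single value covers it.
        return [{"connector": connector_id, "otel.signal": "ALL"}]
    # Default instancing: one instance per ordered (incoming, outgoing) pair.
    return [
        {"connector": connector_id, "otel.signal": f"{incoming}->{outgoing}"}
        for incoming, outgoing in sorted(set(pairs))
    ]


pairs = [("logs", "metrics"), ("traces", "metrics")]
default_sets = connector_attribute_sets("example", pairs, shared_singleton=False)
shared_sets = connector_attribute_sets("example", pairs, shared_singleton=True)
print(default_sets)
print(shared_sets)
```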

Acknowledging non-singleton shared instances

It is worth noting that nothing explicitly prevents component authors from overriding instancing with non-singleton patterns. Deriving some examples from existing components, one could imagine an otlp receiver which allows users to assign signals to ports so that e.g. logs and traces are accepted on one port while metrics are accepted on another. Another example is a memory limiter processor which limits memory across only a single signal type (so it would need to be shared across multiple pipelines, but not all pipelines).

I believe the solution to handling this would have to be based on a "list of otel.signal" attribute:
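A minimal sketch of how a list-valued otel.signal could describe the otlp receiver example above, where signals are assigned to ports. This is an illustration only, not the schema proposed in this issue: the `receiver` attribute key (mirroring the `connector` key used for connectors), the port numbers, and the helper are assumptions.

```python
from collections import defaultdict

# Hypothetical assignment of signals to ports for one receiver configuration.
PORT_BY_SIGNAL = {"logs": 4317, "traces": 4317, "metrics": 4318}


def shared_instance_attributes(receiver_id, port_by_signal):
    """Group signals by port; each group is assumed to be one shared instance."""
    signals_by_port = defaultdict(list)
    for signal, port in port_by_signal.items():
        signals_by_port[port].append(signal)
    return [
        # The "receiver" key is an assumption made for this sketch.
        {"receiver": receiver_id, "otel.signal": sorted(signals)}
        for _, signals in sorted(signals_by_port.items())
    ]


attribute_sets = shared_instance_attributes("otlp", PORT_BY_SIGNAL)
for attrs in attribute_sets:
    print(attrs)
```

Each instance is described by exactly the signals it serves, which neither a single signal value nor ALL can express.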

Describe alternatives you've considered

Additional context

Work on processor metrics has progressed recently but when considering which attributes to apply to processors, it raises the question of whether the same attributes may apply to other component kinds. For example, the recently added otel.signal attribute may be useful for receivers and exporters, but insufficient for connectors.

djaglowski commented 1 month ago

My preference is to acknowledge shared components, but assume they are singleton instances. For connectors, I prefer otel.signal with a special set of values. For processors, I prefer keeping both pipeline and otel.signal (so that the latter is universal across component kinds).

Receivers

Processors

Exporters

Connectors

mx-psi commented 1 month ago

After discussing this at the 2024-09-18 Collector stability meeting, I support this schema.

djaglowski commented 1 month ago

Closing in favor of #11343