MrAlias / otel-schema

Playground to prototype and investigate configuration schema proposals for OpenTelemetry
Apache License 2.0

Propose "ideal" configuration #5

Closed codeboten closed 1 year ago

codeboten commented 1 year ago

This issue is an attempt to produce the configuration that would be ideal from a user-ergonomics standpoint.

codeboten commented 1 year ago

An example proposal

disabled: false                      # OTEL_SDK_DISABLED
resource:
  attributes:                        # OTEL_RESOURCE_ATTRIBUTES
    - key1: value1
    - key2: value2
service:
  name: myapp                        # OTEL_SERVICE_NAME
log:
  level: info                        # OTEL_LOG_LEVEL
propagators: [tracecontext, baggage] # OTEL_PROPAGATORS
sampler:
  name: parentbased_always_on        # OTEL_TRACES_SAMPLER
  argument: "0.25"                   # OTEL_TRACES_SAMPLER_ARG

processors:
  batch/span:
    delay: 5000      # OTEL_BSP_SCHEDULE_DELAY
    timeout: 30000   # OTEL_BSP_EXPORT_TIMEOUT
    queue_size: 2048 # OTEL_BSP_MAX_QUEUE_SIZE
    export_size: 512 # OTEL_BSP_MAX_EXPORT_BATCH_SIZE
  batch/log:
    delay: 5000      # OTEL_BLRP_SCHEDULE_DELAY
    timeout: 30000   # OTEL_BLRP_EXPORT_TIMEOUT
    queue_size: 2048 # OTEL_BLRP_MAX_QUEUE_SIZE
    export_size: 512 # OTEL_BLRP_MAX_EXPORT_BATCH_SIZE

limits:
  attributes:
    value_length: 0 # OTEL_ATTRIBUTE_VALUE_LENGTH_LIMIT
    count: 128      # OTEL_ATTRIBUTE_COUNT_LIMIT
  spans:
    attributes:
      value_length: 0 # OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT
      count: 128      # OTEL_SPAN_ATTRIBUTE_COUNT_LIMIT
    event:
      count: 128 # OTEL_SPAN_EVENT_COUNT_LIMIT
      attributes:
        count: 128 # OTEL_EVENT_ATTRIBUTE_COUNT_LIMIT
    link:
      count: 128 # OTEL_SPAN_LINK_COUNT_LIMIT
      attributes:
        count: 128 # OTEL_LINK_ATTRIBUTE_COUNT_LIMIT

exporters:
  otlp:
    endpoint:
  jaeger:
    protocol: grpc # OTEL_EXPORTER_JAEGER_PROTOCOL
    endpoint: http://localhost:14268/api/traces # OTEL_EXPORTER_JAEGER_ENDPOINT
    timeout: 10000 # OTEL_EXPORTER_JAEGER_TIMEOUT
    user: "" # OTEL_EXPORTER_JAEGER_USER
    password: "" # OTEL_EXPORTER_JAEGER_PASSWORD
  zipkin:
    endpoint: http://localhost:9411/api/v2/spans # OTEL_EXPORTER_ZIPKIN_ENDPOINT
    timeout: 10000                               # OTEL_EXPORTER_ZIPKIN_TIMEOUT
  prometheus:
    host: localhost # OTEL_EXPORTER_PROMETHEUS_HOST
    port: 9464      # OTEL_EXPORTER_PROMETHEUS_PORT
  logging:

python: # OTEL_PYTHON_*

pipelines:
  traces:
    processors: [simple]
    exporters: [logging, jaeger] # OTEL_TRACES_EXPORTER
  metrics:
    processors: [batch]
    exporters: [otlp] # OTEL_METRICS_EXPORTER
  logs:
    processors: [batch]
    exporters: [otlp] # OTEL_LOGS_EXPORTER

instrumentations:
  redis:
    package:

tsloughter commented 1 year ago

I like yours, it is a mix of what we in Erlang originally had for file configuration:

[
 {opentelemetry,
  [{processors, [{otel_batch_processor,
                  #{exporter => {opentelemetry_exporter, #{protocol => grpc}}}
                 }]
   }]}
].

And what we have now (while still supporting the original):

%% config/sys.config.src
[
 {opentelemetry,
  [{span_processor, batch},
   {traces_exporter, otlp}]},

 {opentelemetry_exporter,
  [{otlp_protocol, http_protobuf},
   {otlp_endpoint, "http://localhost:4318"}]}
].

It makes me think there may be yet another mixture I could try with the kitchen-sink examples: you would still have to define things like tracer providers, but there would be less nesting, because things like exporters are defined at the top level, each under a user-defined name that can then be used within the tracer provider definition.
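To make the idea concrete, here is a minimal sketch in Python of resolving named top-level exporters inside a tracer provider definition. The config shape, the key names (`exporters`, `tracer_provider`, `processors`) and the `resolve_exporters` helper are all assumptions made up for illustration; none of them come from an actual SDK or from the schema in this repository.

```python
# Hypothetical config shape: exporters are defined once at the top level under
# user-chosen names, and the tracer provider refers to them by name.
config = {
    "exporters": {
        "otlp/primary": {"endpoint": "http://localhost:4318"},
        "console": {},
    },
    "tracer_provider": {
        "processors": [
            {"type": "batch", "exporter": "otlp/primary"},
            {"type": "simple", "exporter": "console"},
        ],
    },
}


def resolve_exporters(config):
    """Return the processor list with each exporter name replaced by its definition."""
    exporters = config["exporters"]
    resolved = []
    for processor in config["tracer_provider"]["processors"]:
        name = processor["exporter"]
        if name not in exporters:
            # A dangling reference is a configuration error, not a silent no-op.
            raise KeyError(f"processor references undefined exporter {name!r}")
        resolved.append({**processor, "exporter": {"name": name, **exporters[name]}})
    return resolved
```

Each processor keeps exactly one exporter, so the nesting is reduced without losing the association between a processor and the exporter it feeds.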

codeboten commented 1 year ago

> supporting top level definitions of stuff like exporters

Right, my example is strongly influenced by the collector's configuration, where components are defined individually and the telemetry pipelines use those definitions via the names used as identifiers. This has the bonus that folks familiar with configuring the collector would find configuring SDKs fairly straightforward.

jack-berg commented 1 year ago

While I'd like to have symmetry with the collector where possible, SDK processors and exporters are conceptually different from collector processors and exporters. Specifically, SDK exporters don't show up in the configuration of tracerprovider / meterprovider / loggerprovider except as arguments for specific built-in processors.

Let me try to illustrate through some examples:

Example 1: What do you do when exporters are configured without a processor?

pipelines:
  traces:
    processors: []
    exporters: [logging, jaeger]

Example 2: What does it mean to have a single processor and multiple exporters? Maybe that a composite exporter should be configured which calls both the logging and jaeger exporters?

pipelines:
  traces:
    processors: [simple]
    exporters: [logging, jaeger]

Example 3: How do I configure two batch processors with different exporters? Two options are shown, and neither is intuitive:

# Create two separate trace pipelines, each with one batch processor and an exporter. This breaks down because it gives the false impression that you can have multiple pipelines in a single tracer provider.
pipelines:
  traces/exp1:
    processors: [batch/exp1]
    exporters: [exporter1]
  traces/exp2:
    processors: [batch/exp2]
    exporters: [exporter2]
---
# Add two different batch processors to a single pipeline. How do I indicate which exporter should be associated with each batch processor?
pipelines:
  traces:
    processors: [batch/exp1, batch/exp2]
    exporters: [exporter1, exporter2]

Additionally, meterprovider is quite different from loggerprovider and tracerprovider - it doesn't have the notion of a processor at all - only metric readers.

codeboten commented 1 year ago

> Example 1: What do you do when exporters are configured without a processor?

In this case I would omit the processors section (this is what the collector does):

pipelines:
  traces:
    exporters: [logging, jaeger]

> Example 2: What does it mean to have a single processor and multiple exporters? Maybe that a composite exporter should be configured which calls both the logging and jaeger exporters?

In practice, the SDK would have to configure one span processor for each configured exporter.
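A minimal sketch of that expansion rule, written in Python over plain dicts. The `expand_pipeline` helper and the output shape are hypothetical and only illustrate the pairing; they are not an existing SDK API.

```python
def expand_pipeline(pipeline):
    """Pair every configured processor kind with every configured exporter.

    Each resulting pair stands for one span processor instance wrapping
    exactly one exporter.
    """
    return [
        {"processor": kind, "exporter": exporter}
        for kind in pipeline.get("processors", [])
        for exporter in pipeline.get("exporters", [])
    ]


# The traces pipeline from Example 2.
traces = {"processors": ["simple"], "exporters": ["logging", "jaeger"]}
pairs = expand_pipeline(traces)
# pairs holds two entries: simple+logging and simple+jaeger.
```

Under the same rule, a pipeline listing two processors and two exporters expands to four pairs, and a pipeline with exporters but no processors expands to nothing.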

> Example 3: How do I configure two batch processors with different exporters? Two options are shown, and neither is intuitive:

Wouldn't configuring different processors with different exporters effectively be configuring multiple pipelines?

I agree that processor configuration doesn't make sense for metrics. I'm not sure whether there is a term other than "processors" that could be used to generalize the configuration.

jack-berg commented 1 year ago

> In this case I would omit the processors section (this is what the collector does)

I think that would effectively have to produce a no-op tracer provider, since exporters are meaningless without an associated processor to feed them data.

> In practice, the SDK would have to configure multiple span processors for each configured exporter

Example 3 illustrates the challenges with this.

> Wouldn't configuring different processors w/ different exporters effectively be configuring multiple pipelines?

It's conceptually different from two pipelines. If there were actually two pipelines, each pipeline would have its own set of processors. In the SDK, there is only one pipeline and all processors are invoked. So if you had two batch processors, each with an exporter, and then added an additional processor that did some enrichment (e.g. enriching with baggage), the changes from the additional processor would be seen by both batch processors. There's no way to isolate the additional processor's changes to only a single batch processor.
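The shared-pipeline behavior can be sketched with a toy model in Python. The classes below (`Provider`, `EnrichProcessor`, `RecordingProcessor`) are simplified stand-ins invented for this illustration, not the real SDK types; the only property they are meant to mirror is that a provider invokes every registered processor for every span.

```python
class EnrichProcessor:
    """Adds an attribute when a span starts (stand-in for baggage enrichment)."""

    def on_start(self, span):
        span["attributes"]["enriched"] = True

    def on_end(self, span):
        pass


class RecordingProcessor:
    """Stand-in for a batch processor: records what it would hand to its exporter."""

    def __init__(self):
        self.exported = []

    def on_start(self, span):
        pass

    def on_end(self, span):
        self.exported.append(dict(span["attributes"]))


class Provider:
    """Toy tracer provider: one processor list shared by every span."""

    def __init__(self, processors):
        self.processors = processors

    def record_span(self, name):
        span = {"name": name, "attributes": {}}
        for processor in self.processors:
            processor.on_start(span)
        for processor in self.processors:
            processor.on_end(span)


batch_a, batch_b = RecordingProcessor(), RecordingProcessor()
provider = Provider([EnrichProcessor(), batch_a, batch_b])
provider.record_span("GET /")
# Both recording processors observe the attribute added by EnrichProcessor.
```

In this model there is no way to register the enrichment so that only `batch_a` sees it, which is the limitation described above.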

tsloughter commented 1 year ago

Pipelines aren't a concept in the API or SDK. I think configuring providers is confusing enough without adding pipelines :)

codeboten commented 1 year ago

I've added a PR to make discussions around the specifics of the config a bit easier.

codeboten commented 1 year ago

Closing this issue; there is a working configuration here.

jack-berg commented 1 year ago

A few things are still missing:

Samplers are the most interesting case because they may delegate to each other.
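To illustrate why delegation matters for the config shape, here is a rough Python sketch in which a nested sampler spec is built recursively. The spec keys (`always_on`, `trace_id_ratio`, `parent_based`, `root`), the `build_sampler` helper and the simplified `should_sample(parent_sampled, trace_id)` signature are assumptions for this example only; the ratio check is a simplification and not the exact algorithm of any SDK.

```python
class AlwaysOn:
    def should_sample(self, parent_sampled, trace_id):
        return True


class TraceIdRatio:
    """Samples when the low 64 bits of the trace id fall below ratio * 2**64."""

    def __init__(self, ratio):
        self.bound = int(ratio * (1 << 64))

    def should_sample(self, parent_sampled, trace_id):
        return (trace_id & ((1 << 64) - 1)) < self.bound


class ParentBased:
    """Follows the parent's decision; delegates to `root` when there is no parent."""

    def __init__(self, root):
        self.root = root

    def should_sample(self, parent_sampled, trace_id):
        if parent_sampled is None:
            return self.root.should_sample(parent_sampled, trace_id)
        return parent_sampled


def build_sampler(spec):
    """Build a sampler from a single-key nested spec (hypothetical shape)."""
    kind, args = next(iter(spec.items()))
    args = args or {}
    if kind == "always_on":
        return AlwaysOn()
    if kind == "trace_id_ratio":
        return TraceIdRatio(args["ratio"])
    if kind == "parent_based":
        # The delegate is itself a sampler spec, so the builder recurses.
        return ParentBased(build_sampler(args["root"]))
    raise ValueError(f"unknown sampler kind: {kind!r}")


sampler = build_sampler({"parent_based": {"root": {"trace_id_ratio": {"ratio": 0.25}}}})
```

A flat `name` plus `argument` pair, as in the first proposal, cannot express this nesting directly, which is why the sampler section needs a recursive structure.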