opensearch-project / data-prepper

OpenSearch Data Prepper is a component of the OpenSearch project that accepts, filters, transforms, enriches, and routes data at scale.
https://opensearch.org/docs/latest/clients/data-prepper/index/
Apache License 2.0

Support OpenTelemetry OTLP/HTTP as addition to OTLP/gRPC #4983

Open KarstenSchnitter opened 1 month ago

KarstenSchnitter commented 1 month ago

Is your feature request related to a problem? Please describe.

The current OpenTelemetry specification describes two transport channels: OTLP/gRPC and OTLP/HTTP.

All formats use the same protobuf schema. Data Prepper currently does not support OTLP/HTTP. The OpenTelemetry specification has changed since the Data Prepper OTel sources were first implemented: it now recommends OTLP/HTTP with protobuf as the default protocol, per opentelemetry-specification#1885. Many OpenTelemetry SDKs now use OTLP/HTTP in this way, and not all OpenTelemetry instrumentations support OTLP/gRPC at all. Connecting such solutions to Data Prepper is not possible without a protocol translation.

Describe the solution you'd like

Data Prepper should add support for OTLP/HTTP to the Otel*Sources. The configuration should allow selecting which protocol is supported. Ideally, both protocols can be used simultaneously, so that different services can connect to the same Data Prepper instance.

Describe alternatives you've considered (Optional)

  1. The OpenTelemetry Collector can be used to translate between OTLP/gRPC and OTLP/HTTP and vice versa. However, this requires an additional component in the signal stream. Direct support by DataPrepper would be a better approach.

  2. The HTTP source could be used for OTLP/HTTP with the JSON format. It would require a complex JSON parsing configuration because of the nested arrays in the OTel data structures. In this configuration, the data would also not pass easily through the OTel processors.
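To illustrate the nesting mentioned in alternative 2, here is a minimal Python sketch that flattens an OTLP/JSON trace export request into one record per span. The field names (`resourceSpans`, `scopeSpans`, `spans`) follow the OTLP JSON encoding; the function names, the sample payload, and the choice of output fields are made up for illustration and are not part of Data Prepper.

```python
from typing import Any, Dict, Iterator, List, Optional


def _string_attr(attributes: List[Dict[str, Any]], key: str) -> Optional[str]:
    """Return the string value of an OTLP key/value attribute, if present."""
    for kv in attributes:
        if kv.get("key") == key:
            return kv.get("value", {}).get("stringValue")
    return None


def iter_spans(payload: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one flat record per span (hypothetical helper, not a Data Prepper API)."""
    # Three levels of nested arrays: resourceSpans -> scopeSpans -> spans.
    for resource_spans in payload.get("resourceSpans", []):
        resource_attrs = resource_spans.get("resource", {}).get("attributes", [])
        service_name = _string_attr(resource_attrs, "service.name")
        for scope_spans in resource_spans.get("scopeSpans", []):
            scope_name = scope_spans.get("scope", {}).get("name")
            for span in scope_spans.get("spans", []):
                yield {
                    "service.name": service_name,
                    "scope": scope_name,
                    "traceId": span.get("traceId"),
                    "spanId": span.get("spanId"),
                    "name": span.get("name"),
                }


# Invented sample payload, only for demonstration.
SAMPLE = {
    "resourceSpans": [
        {
            "resource": {
                "attributes": [
                    {"key": "service.name", "value": {"stringValue": "checkout"}}
                ]
            },
            "scopeSpans": [
                {
                    "scope": {"name": "demo-instrumentation"},
                    "spans": [
                        {"traceId": "0af7651916cd43dd8448eb211c80319c",
                         "spanId": "b7ad6b7169203331", "name": "POST /order"},
                        {"traceId": "0af7651916cd43dd8448eb211c80319c",
                         "spanId": "00f067aa0ba902b7", "name": "GET /cart"},
                    ],
                }
            ],
        }
    ]
}

if __name__ == "__main__":
    for record in iter_spans(SAMPLE):
        print(record)
```

Even this reduced version needs three nested loops plus an attribute lookup, which gives a rough idea of what a generic JSON parsing configuration on the HTTP source would have to reproduce.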

Additional context

There has been a previous issue about OTLP/HTTP support in the OpenDistro project: https://github.com/opendistro-for-elasticsearch/data-prepper/issues/283.

KarstenSchnitter commented 1 month ago

There is a comment on how to use OTLP/HTTP with traces in https://github.com/opensearch-project/data-prepper/issues/1152#issuecomment-1062293463. It introduced the unframed_requests option:

otel_trace_source:
      ssl: false
      unframed_requests: true

This option is only available for the OTelTraceSource, though. It is not supported for logs and metrics. A quick win would be to provide this option for those signals, too.
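As a rough sketch of what this means for a sender: the standard OTLP/HTTP paths are `/v1/traces`, `/v1/metrics` and `/v1/logs`, whereas an unframed request, as far as I understand it, is posted to the gRPC method path of the trace service. The helper name `export_url` and the port 21890 in the usage note are assumptions for illustration, so please check them against your own setup.

```python
# Standard OTLP/HTTP request paths as defined by the OTLP specification.
OTLP_HTTP_PATHS = {
    "traces": "/v1/traces",
    "metrics": "/v1/metrics",
    "logs": "/v1/logs",
}

# gRPC method path of the OTLP trace service. Assumption: an unframed
# request to the trace source is sent to this path instead of /v1/traces.
GRPC_TRACE_EXPORT_PATH = "/opentelemetry.proto.collector.trace.v1.TraceService/Export"


def export_url(host: str, port: int, signal: str, unframed_grpc: bool = False) -> str:
    """Build the URL a sender would POST to (hypothetical helper)."""
    if unframed_grpc:
        # Per this thread, unframed_requests exists only for the trace source.
        if signal != "traces":
            raise ValueError("unframed requests are only available for traces")
        return f"http://{host}:{port}{GRPC_TRACE_EXPORT_PATH}"
    return f"http://{host}:{port}{OTLP_HTTP_PATHS[signal]}"
```

For example, `export_url("localhost", 4318, "traces")` gives the URL a default OTLP/HTTP exporter would use, while `export_url("localhost", 21890, "traces", unframed_grpc=True)` gives the URL for an unframed request, assuming the trace source listens on port 21890.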

dlvenable commented 1 month ago

@KarstenSchnitter, to be sure, is unframed_requests what we need, or is there something else?

Would you be able to add these to OTel Logs and Metrics?

KarstenSchnitter commented 1 month ago

The unframed_requests option seems to be a good starting point. However, this approach still needs special configuration by the sender. I asked @TomasLongo to take a look at this. We will investigate this issue and come up with some recommendations on what needs to change. As far as I understand armeria, the intended way would be to define a specific server for OTLP/HTTP and not reuse the gRPC endpoint.

Within OpenTelemetry, OTLP/gRPC uses default port 4317 and OTLP/HTTP uses default port 4318. However, because the request paths are entirely distinct, it is easily possible to run both protocols on the same endpoint. We need to discuss whether there should be a configuration parameter to select the protocol or whether Data Prepper should always provide both.
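To make the "distinct paths" argument concrete, here is a small Python sketch of how a single endpoint could tell the two protocols apart from the request path and content type alone. The paths and media types come from the OTLP specification; the function `classify_otlp_request` is hypothetical and does not reflect how Data Prepper or armeria actually route requests.

```python
from typing import Optional, Tuple

# OTLP/gRPC method paths (service name plus the Export method).
GRPC_EXPORT_PATHS = {
    "/opentelemetry.proto.collector.trace.v1.TraceService/Export": "traces",
    "/opentelemetry.proto.collector.metrics.v1.MetricsService/Export": "metrics",
    "/opentelemetry.proto.collector.logs.v1.LogsService/Export": "logs",
}

# Default OTLP/HTTP request paths.
HTTP_EXPORT_PATHS = {
    "/v1/traces": "traces",
    "/v1/metrics": "metrics",
    "/v1/logs": "logs",
}

# OTLP/HTTP payloads are binary protobuf or JSON.
HTTP_MEDIA_TYPES = {"application/x-protobuf", "application/json"}


def classify_otlp_request(path: str, content_type: str) -> Optional[Tuple[str, str]]:
    """Return (protocol, signal) for an OTLP request, or None if unrecognized."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    is_grpc = media_type == "application/grpc" or media_type.startswith("application/grpc+")
    if path in GRPC_EXPORT_PATHS and is_grpc:
        return ("grpc", GRPC_EXPORT_PATHS[path])
    if path in HTTP_EXPORT_PATHS and media_type in HTTP_MEDIA_TYPES:
        return ("http", HTTP_EXPORT_PATHS[path])
    return None
```

Since no path appears in both tables, the lookup is unambiguous, which is why serving both protocols on one port looks feasible in principle.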

AdaptiveStep commented 1 month ago

I think the OTel Collector's "opensearch exporter" sends data via HTTP instead of using gRPC. Have you tried it?

https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/opensearchexporter

KarstenSchnitter commented 1 month ago

@AdaptiveStep thanks for mentioning this. My team is using OpenSearch to provide a managed observability service. We want to provide OTLP/HTTP support alongside the OTLP/gRPC support we already offer with Data Prepper. We do not want to run an additional OTel Collector to translate the protocol or ingest data. We have tried the OTel Collector, but we like Data Prepper better.