open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

[exporter/elasticsearch] deprecate/remove dedup config #33773

Open axw opened 3 days ago

axw commented 3 days ago

Component(s)

exporter/elasticsearch

Describe the issue you're reporting

The dedup config was introduced in the initial implementation of the Elasticsearch exporter for de-duplicating colliding attributes. This is particularly relevant when using the "Raw" or "ECS" encoding modes, where attributes are not nested in unique namespaces. For example, if one were to add a trace.id attribute to a log record while also setting the top-level TraceID field, then in ECS mode this would lead to duplicate trace.id attributes being set in the Elasticsearch document; enabling the dedup configuration prevents this.
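
To make the collision concrete, here is a minimal sketch of what de-duplicating a flattened document could look like. It is not the exporter's actual code: the field type, the dedupFields helper, and the "last value wins" policy are all assumptions made for illustration.

```go
package main

import "fmt"

// field is a hypothetical flattened key/value pair destined for an
// Elasticsearch document. It is not a type from the exporter.
type field struct {
	key   string
	value string
}

// dedupFields returns one entry per key, preserving the order in which
// keys first appear. When a key occurs more than once, the last value
// wins; that policy is an assumption of this sketch.
func dedupFields(fields []field) []field {
	index := make(map[string]int, len(fields)) // key -> position in out
	out := make([]field, 0, len(fields))
	for _, f := range fields {
		if i, seen := index[f.key]; seen {
			out[i].value = f.value // overwrite the earlier duplicate
			continue
		}
		index[f.key] = len(out)
		out = append(out, f)
	}
	return out
}

func main() {
	// A log record with a trace.id attribute and a top-level TraceID,
	// both of which map to trace.id in a flat (ECS-style) document.
	doc := []field{
		{"trace.id", "from-attribute"},
		{"message", "hello"},
		{"trace.id", "from-TraceID-field"},
	}
	for _, f := range dedupFields(doc) {
		fmt.Printf("%s=%s\n", f.key, f.value)
	}
}
```

Without a step like this, the serialized document would carry trace.id twice.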

For better or worse, Elasticsearch rejects documents with objects that have duplicate keys: https://github.com/elastic/elasticsearch/issues/19614. This is not configurable.

The only conceivable reason for making deduplication configurable in the exporter is when you know for certain that duplicates cannot occur, so you can save a few CPU cycles. This would be the exception rather than the norm, and I think the value does not justify the complexity the configuration introduces.

I propose we:

I don't think there's any harm in doing these out of order, essentially making the config a no-op.

github-actions[bot] commented 3 days ago

Pinging code owners: