Component(s)
exporter/elasticsearch
Describe the issue you're reporting
The `dedup` config was introduced in the initial implementation of the Elasticsearch exporter for de-duplicating colliding attributes. This is particularly relevant when using the "Raw" or "ECS" encoding modes, where attributes are not nested in unique namespaces. For example, if one were to add a `trace.id` attribute to a log record while also setting the top-level TraceID field, then in ECS mode this would lead to duplicate `trace.id` attributes being set in the Elasticsearch document; enabling the `dedup` configuration prevents this.
For better or worse, Elasticsearch rejects documents with objects that have duplicate keys: https://github.com/elastic/elasticsearch/issues/19614. This is not configurable.
The only conceivable reason for making deduplication configurable in the exporter is when you know for certain that duplicates cannot occur, so you can save a few CPU cycles. This would be the exception rather than the norm, and I think the value does not justify the complexity the configuration introduces.
I propose we:
1. deprecate and then remove the `dedup` config
2. always deduplicate object keys
I don't think there's any harm in doing these out of order, essentially making the config a no-op.