open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0
2.95k stars 2.29k forks source link

[exporter/elasticsearch] record attributes should take precedence over resource attributes for dynamic index mode #33725

Open andrzej-stencel opened 3 months ago

andrzej-stencel commented 3 months ago

Component(s)

exporter/elasticsearch

otelcol-contrib v0.103.0

Is your feature request related to a problem? Please describe.

Current situation

When sending data with the Elasticsearch exporter using the ecs mapping mode, the record level attributes (log record/data point/span) take precedence over resource level attributes. This seems to make sense. Telemetry creators can define a value for a set of telemetry in the resource attribute and then override this value for specific items in the resource.

The problem

When using the dynamic index feature (e.g. logs_dynamic_index::enabled = true), the order of precedence for the attributes elasticsearch.index.prefix and elasticsearch.index.suffix is reversed: resource attributes take precedence over record attributes. I'm not sure what the rationale for this is.

Describe the solution you'd like

I propose to change the order of precedence for the dynamic indexing attributes so that record-level attributes (log record/data point/span) take precedence over the resource attributes. This would be a breaking change, it might need to be introduced via a feature gate.

Describe alternatives you've considered

I think using the Group By Attributes processor before the Elasticsearch exporter is a possible workaround, as it causes the record level attributes to overwrite the resource attributes in case of a conflict (see the example in the processor's README to see how record value host.name=host-A replaces the resource-level host.name=localhost).

The problem with this workaround is that it's resource-intensive and changes the original structure of the telemetry, which might be undesired.

Additional context

No response

github-actions[bot] commented 3 months ago

Pinging code owners:

felixbarny commented 3 months ago

@axw I wonder if you have input on this as you've been advocating for resource attributes to have the highest precedence.

The most important bit for me is that the precedence when doing data stream routing should be consistent with the precedence for which attribute ends up in the ES document when it is defined at multiple levels.

axw commented 3 months ago

Hrm, it's not as clear-cut as I would like. The reason I have advocated for that order of precedence in the past is to maintain some clarity around where things like service.* attributes can be set (i.e. in resource attributes). If we allow signal/record-level attributes to override them, then processors all of a sudden need to start looking at the record level for these attributes too.

I don't think that holds for data_stream.* though, which isn't specifically related to a resource.

felixbarny commented 3 months ago

That makes sense. Maybe we should then generally treat resource attributes as having the highest precedence and make an exception to that rule for data_stream routing. The source of truth for these are always the top-level data_stream.* fields, even when the mapping mode is otel or none. So there's already special-treatment for these fields and it feels less random that there's a special rule for these kind of attributes.

axw commented 3 months ago

@felixbarny :+1: I think that makes sense

github-actions[bot] commented 1 month ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.