elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Fields with enabled time_series_dimension don't support multi values in non tsdb index modes #112232

Closed felixbarny closed 2 months ago

felixbarny commented 2 months ago

For OTel mappings, we're using a common definition of attributes and resource attributes for logs, metrics, and traces to avoid duplication and ensure a consistent mapping:

https://github.com/elastic/elasticsearch/blob/73c5c1e1c587cc7ec7ce1f0d10fea49ecfd39002/x-pack/plugin/otel-data/src/main/resources/component-templates/otel%40mappings.yaml#L20-L24

https://github.com/elastic/elasticsearch/blob/73c5c1e1c587cc7ec7ce1f0d10fea49ecfd39002/x-pack/plugin/otel-data/src/main/resources/component-templates/semconv-resource-to-ecs%40mappings.yaml#L9-L15

That's why we set time_series_dimension to true, which I had expected to take effect only for TSDB data streams and to be ignored for others. However, it seems that a validation kicks in that prevents attributes from having multiple values (due to https://github.com/elastic/elasticsearch/issues/110387), even in standard and logsdb index modes. As a result, this exception is thrown:

https://github.com/elastic/elasticsearch/blob/73c5c1e1c587cc7ec7ce1f0d10fea49ecfd39002/server/src/main/java/org/elasticsearch/index/mapper/DocumentDimensions.java#L103

I think the reason for that is that even for IndexMode.STANDARD, we return an implementation of DocumentDimensions that only allows single values:

https://github.com/elastic/elasticsearch/blob/73c5c1e1c587cc7ec7ce1f0d10fea49ecfd39002/server/src/main/java/org/elasticsearch/index/IndexMode.java#L107-L109

Maybe we should return a no-op implementation for index modes standard and logsdb? Or silently ignore time_series_dimension when the index mode is not tsdb.
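To make the no-op option concrete, here is a minimal, self-contained sketch of the idea. All names in it (`Mode`, `DimensionSink`, `SingleValueDimensionSink`, `NoopDimensionSink`, `DimensionSinks`) and the exception message are hypothetical and chosen for illustration; this is not the actual `DocumentDimensions` or `IndexMode` code, only the shape of the behavior I have in mind.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical names throughout; this is NOT the Elasticsearch API,
// only an illustration of selecting the validation by index mode.
enum Mode { STANDARD, LOGSDB, TIME_SERIES }

/** Receives one call per dimension value found while parsing a document. */
interface DimensionSink {
    void addString(String fieldName, String value);
}

/** Rejects a second value for the same dimension field within one document. */
final class SingleValueDimensionSink implements DimensionSink {
    private final Set<String> seenFields = new HashSet<>();

    @Override
    public void addString(String fieldName, String value) {
        // Set.add returns false when the field name was already recorded.
        if (!seenFields.add(fieldName)) {
            throw new IllegalArgumentException(
                "dimension field [" + fieldName + "] has more than one value");
        }
    }
}

/** Accepts any number of values; dimensions are irrelevant outside time series mode. */
final class NoopDimensionSink implements DimensionSink {
    static final NoopDimensionSink INSTANCE = new NoopDimensionSink();

    private NoopDimensionSink() {}

    @Override
    public void addString(String fieldName, String value) {
        // Intentionally empty: nothing to validate or collect.
    }
}

final class DimensionSinks {
    private DimensionSinks() {}

    /** Only time series mode gets the single-value validation. */
    static DimensionSink forMode(Mode mode) {
        return mode == Mode.TIME_SERIES
            ? new SingleValueDimensionSink()
            : NoopDimensionSink.INSTANCE;
    }
}

final class DimensionSketchDemo {
    public static void main(String[] args) {
        DimensionSink logs = DimensionSinks.forMode(Mode.LOGSDB);
        logs.addString("attributes.tag", "a");
        logs.addString("attributes.tag", "b"); // accepted: no-op sink

        DimensionSink metrics = DimensionSinks.forMode(Mode.TIME_SERIES);
        metrics.addString("attributes.tag", "a");
        try {
            metrics.addString("attributes.tag", "b");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With something along these lines, the shared mappings could keep time_series_dimension: true, and only documents indexed into a time series index would be rejected for multi-valued dimensions.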

We can also discuss other options for composing our mappings so that multi-valued attributes are allowed for logs and traces, while ensuring consistency across the mappings and without duplicating them (in particular the ECS alias mappings, which are quite long).

cc @elastic/obs-ds-intake-services @gregkalapos

elasticsearchmachine commented 2 months ago

Pinging @elastic/es-storage-engine (Team:StorageEngine)