elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

ecs@mappings: support all date fields when date_detection is disabled #112398

Closed zmoog closed 1 month ago

zmoog commented 2 months ago

Situation

The ecs@mappings component template supports all the fields in ECS.

For date fields, it supports the following naming conventions:

https://github.com/elastic/elasticsearch/blob/212fe035ea4939d10d8a55eaf882163d932ad914/x-pack/plugin/core/template-resources/src/main/resources/ecs%40mappings.json#L138-L162

Problem

The mapping works in all circumstances for the date fields that match the above naming convention.

However, other date fields do not match this naming convention:

threat.indicator.first_seen
threat.indicator.last_seen 
threat.indicator.modified_at
threat.enrichments.indicator.modified_at
threat.enrichments.matched.occurred
threat.enrichments.indicator.first_seen 
threat.enrichments.indicator.last_seen

These fields are generally mapped as dates thanks to the date_detection dynamic field mapping option, enabled by default.

If date_detection is disabled, Elasticsearch will not map these fields as date, creating unexpected mapping problems.
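For illustration, here is a minimal sketch of how an integration might end up in this situation. It builds the body of a composable index template that pulls in ecs@mappings and turns off date_detection. The template name and index pattern are hypothetical, chosen only for this example; the script only prints the JSON and does not contact a cluster.

```python
import json

# Hypothetical index template body: the index pattern is made up for this
# example. Only "composed_of" and "date_detection" relate to the issue.
index_template = {
    "index_patterns": ["logs-example-*"],  # assumption, not from the issue
    "composed_of": ["ecs@mappings"],
    "template": {
        "mappings": {
            # Disables automatic detection of date-like strings.
            "date_detection": False,
        }
    },
}

# Body that could be sent with PUT _index_template/<some-name>.
print(json.dumps(index_template, indent=2))
```

With a template like this, a string value such as a threat.indicator.first_seen timestamp is no longer detected as a date, so it depends entirely on the dynamic templates in ecs@mappings.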

Conclusion

ecs@mappings should support all the date fields in ECS by extending the naming convention, even if integration developers or end users disable date_detection for any reason.
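As a rough sketch of what "extending the naming convention" could look like, the snippet below checks whether a set of candidate wildcard patterns covers the seven fields listed above. The patterns are hypothetical and are not taken from ecs@mappings.json or from the eventual fix. Python's fnmatchcase is used as an approximation of path_match wildcard matching, on the assumption that `*` may span dots in both.

```python
from fnmatch import fnmatchcase

# The seven ECS date fields listed in this issue.
UNCOVERED_DATE_FIELDS = [
    "threat.indicator.first_seen",
    "threat.indicator.last_seen",
    "threat.indicator.modified_at",
    "threat.enrichments.indicator.modified_at",
    "threat.enrichments.matched.occurred",
    "threat.enrichments.indicator.first_seen",
    "threat.enrichments.indicator.last_seen",
]

# Hypothetical additions to the date template's path_match list.
CANDIDATE_PATTERNS = [
    "*.indicator.first_seen",
    "*.indicator.last_seen",
    "*.indicator.modified_at",
    "*.enrichments.matched.occurred",
]


def uncovered(fields, patterns):
    """Return the fields that no pattern matches."""
    return [
        field
        for field in fields
        if not any(fnmatchcase(field, pattern) for pattern in patterns)
    ]


# An empty list means every listed field is matched by some pattern.
print(uncovered(UNCOVERED_DATE_FIELDS, CANDIDATE_PATTERNS))
```

A check of this kind only shows that the patterns match the field names; whether the resulting mapping is correct still has to be verified against a real cluster.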


eyalkoren commented 1 month ago

When adding these fields, we should update EcsDynamicTemplatesIT as well to run with date_detection: false, to make sure we capture all existing and future fields that are affected by this setting.

eyalkoren commented 1 month ago

"These fields are generally mapped as dates thanks to the date_detection dynamic field mapping option, enabled by default."

Note that the last dynamic template in ecs@mappings acts as a fallback for all string values, mapping them to keyword, which effectively disables date detection for strings. So what I am not sure about is how we missed that. Our tests (should) cover all ECS fields, generate mock String values for fields that are mapped to date, index documents with these values, and validate that they are mapped correctly.

eyalkoren commented 1 month ago

I can verify that when setting date_detection: false, these fields get the wrong mapping. I think this means that the automatic date detection is somehow applied before all dynamic templates are evaluated against the input fields, which I don't think is intentional. Maybe there is a different explanation for that.

Either way, I proposed a fix to mitigate this issue.
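The behavior described in this thread can be modeled with a small simulation. This is a simplified sketch of one possible reading, not Elasticsearch's implementation: it assumes the value's type is detected first and that the string fallback template only applies to values still typed as string. The template names, the single `*.timestamp` pattern, and the date regex are all hypothetical stand-ins.

```python
import re
from fnmatch import fnmatchcase

# Simplified stand-in for the default dynamic date formats (assumption).
ISO_DATE = re.compile(
    r"^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2})?)?$"
)

# Hypothetical, heavily reduced version of the ecs@mappings templates.
TEMPLATES = [
    {"name": "date_by_name", "path_match": ["*.timestamp"], "type": "date"},
    {"name": "fallback_strings", "match_mapping_type": "string", "type": "keyword"},
]


def detected_type(value, date_detection):
    """Model of type detection that runs before template matching."""
    if date_detection and ISO_DATE.match(value):
        return "date"
    return "string"


def resolve(path, value, date_detection):
    """Return the mapping type chosen for a string value in this model."""
    dtype = detected_type(value, date_detection)
    for template in TEMPLATES:
        patterns = template.get("path_match")
        if patterns and not any(fnmatchcase(path, p) for p in patterns):
            continue
        wanted = template.get("match_mapping_type")
        if wanted and wanted != dtype:
            continue
        return template["type"]  # first matching template wins
    return dtype  # no template matched: keep the detected type


value = "2024-08-30T12:00:00Z"
print(resolve("threat.indicator.first_seen", value, date_detection=True))
print(resolve("threat.indicator.first_seen", value, date_detection=False))
```

In this model, the field is mapped as date only while date_detection is enabled, because the string fallback is skipped for values already detected as dates. Once detection is off, the fallback catches the value and maps it to keyword, which matches the symptom reported above.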

ruflin commented 1 month ago

Here is a related issue: https://github.com/elastic/elasticsearch/issues/109381