elastic / elastic-package

elastic-package - Command line tool for developing Elastic Integrations

Validate mapping ECS compliance in system test #2120

Open flash1293 opened 1 month ago

flash1293 commented 1 month ago

Integrations rely on ecs@mappings to make sure ECS fields are mapped correctly. However, fields can still end up mapped with the wrong type if they are sent with the wrong type in the incoming JSON document (e.g. sending a boolean value as the string "true" will cause the field to be mapped as keyword even though it's specified as boolean in ECS).

To catch these problems early, part of the integration test should be a validation of the generated mappings for all data streams, making sure all fields are actually mapped in compliance with ECS.

While this won't rule out potential mapping issues completely, it should make it much easier to catch these issues during development.
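To illustrate the kind of check proposed here, a minimal sketch in Go follows. It is not existing elastic-package code: the function names `flattenMappings` and `checkECSCompliance` are hypothetical, the ECS definitions are a two-field excerpt, and the mappings are a hand-written stand-in for what a data stream might report after a system test.

```go
package main

import (
	"fmt"
	"sort"
)

// flattenMappings walks the "properties" tree of an index mapping and records
// the mapped type of every field under its dotted name.
// Hypothetical helper, not part of elastic-package.
func flattenMappings(prefix string, properties map[string]any, out map[string]string) {
	for name, raw := range properties {
		def, ok := raw.(map[string]any)
		if !ok {
			continue
		}
		path := name
		if prefix != "" {
			path = prefix + "." + name
		}
		if typ, ok := def["type"].(string); ok {
			out[path] = typ
		}
		if sub, ok := def["properties"].(map[string]any); ok {
			flattenMappings(path, sub, out)
		}
	}
}

// checkECSCompliance reports every field that ECS defines but that ended up
// mapped with a different type. Fields unknown to ECS are ignored here.
func checkECSCompliance(actual, ecs map[string]string) []string {
	var problems []string
	for field, got := range actual {
		want, known := ecs[field]
		if known && got != want {
			problems = append(problems,
				fmt.Sprintf("%s: mapped as %s, ECS expects %s", field, got, want))
		}
	}
	sort.Strings(problems) // map iteration order is random, so sort for stable output
	return problems
}

func main() {
	// Small excerpt of ECS field types, hard-coded for the example.
	ecs := map[string]string{
		"event.duration":  "long",
		"tls.established": "boolean",
	}
	// Assumed shape of the generated mappings after a system test run.
	properties := map[string]any{
		"event": map[string]any{"properties": map[string]any{
			"duration": map[string]any{"type": "long"},
		}},
		"tls": map[string]any{"properties": map[string]any{
			"established": map[string]any{"type": "keyword"},
		}},
	}

	actual := map[string]string{}
	flattenMappings("", properties, actual)
	for _, p := range checkECSCompliance(actual, ecs) {
		fmt.Println(p) // prints: tls.established: mapped as keyword, ECS expects boolean
	}
}
```

A real check would need to read the mappings from the cluster and load the full ECS definitions. This sketch only shows the comparison step.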

This is part of https://github.com/elastic/observability-dev/issues/3967

cc @zmoog @jsoriano

jsoriano commented 1 month ago

Some validations on field types already happen, but not for all types, and it can be disabled for numeric and string fields with numeric_keyword_fields and string_number_fields.

flash1293 commented 1 month ago

> and it can be disabled for numeric and string fields with numeric_keyword_fields and string_number_fields.

Does this validation happen on the source or on the mappings the system ends up with? With ecs@mappings that's not necessarily the same thing.

jsoriano commented 1 month ago

> Does this validation happen on the source or on the mappings the system ends up with? With ecs@mappings that's not necessarily the same thing.

For every field in the stored document (taken from _source or from fields, depending on the case), it checks that the field has a definition and that the type of the value matches the type in the definition. Definitions are looked up among those included in the package, or in ECS for packages that include ECS mappings.
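As a rough illustration of this document-level check, here is a simplified sketch in Go. It is not the actual elastic-package implementation: `valueMatchesType` and `validateDocument` are hypothetical names, only a handful of field types are covered, and the document is assumed to be already flattened to dotted keys with single values.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// valueMatchesType reports whether a decoded JSON value is acceptable for a
// field type. Only a few types are handled; anything else is accepted.
func valueMatchesType(value any, fieldType string) bool {
	switch fieldType {
	case "boolean":
		_, ok := value.(bool)
		return ok
	case "keyword", "text":
		_, ok := value.(string)
		return ok
	case "long", "double", "float":
		// encoding/json decodes every JSON number into float64.
		_, ok := value.(float64)
		return ok
	default:
		return true
	}
}

// validateDocument checks that each field has a definition and that its value
// matches the defined type. definitions maps dotted field names to types.
func validateDocument(doc map[string]any, definitions map[string]string) []string {
	var problems []string
	for field, value := range doc {
		fieldType, found := definitions[field]
		if !found {
			problems = append(problems, fmt.Sprintf("%s: no definition found", field))
			continue
		}
		if !valueMatchesType(value, fieldType) {
			problems = append(problems,
				fmt.Sprintf("%s: value %#v does not match type %s", field, value, fieldType))
		}
	}
	sort.Strings(problems) // stable order regardless of map iteration
	return problems
}

func main() {
	// Example document with a boolean sent as a string and an undefined field.
	raw := []byte(`{"tls.established": "true", "event.duration": 1200, "custom.field": "x"}`)
	var doc map[string]any
	if err := json.Unmarshal(raw, &doc); err != nil {
		panic(err)
	}
	// Definitions as they might come from the package or from ECS (excerpt).
	definitions := map[string]string{
		"tls.established": "boolean",
		"event.duration":  "long",
	}
	for _, p := range validateDocument(doc, definitions) {
		fmt.Println(p)
	}
}
```

Note that this check looks only at the values in the document, so it can flag the string "true" for a boolean field, but it says nothing about the mapping the data stream actually ended up with.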

We are also considering basing validations on the mappings the data streams end up with after running system tests (related internal discussion in https://github.com/elastic/ingest-dev/issues/3935), but we don't do anything with them yet.