elastic / package-spec

EPR package specifications

[Change Proposal] Define reusable fields to use in the build dependency system #746

Open pzl opened 3 months ago

pzl commented 3 months ago

Problem Statement

The endpoint package defines many custom fields. They are not part of ECS, and many are specific to this process and wouldn't be good candidates for becoming ECS fields. Many of those custom fields are used across 5+ data stream definitions.

The endpoint package currently uses the ECS build tooling to generate our data stream definitions. To migrate to the elastic-package tool for building, those field definitions (field type, description, etc.) would need to be repeated in each of our data streams. This would be a prohibitive amount of duplication to maintain.

For pure ECS fields, that duplication is avoided by using the external dependency system to import the field.

Using a similar system to define our fields in one place and "import" them via this dependency system would solve that duplication problem.

Proposal

Extend the dependency system so that it can also import field definitions from a local folder, as an additional source.

chrisberkhout commented 2 months ago

Another example of duplication in field definitions is between a data stream and a related transform.

For example, the OpenCTI integration has the same field definitions in its data stream and in its latest_ioc transform.


In such cases, I've tried to keep the duplicated definitions in files with the same file name and content, so I can automate checks that they haven't diverged:

diff -c9 <(cd data_stream/indicator/fields/              > /dev/null; md5sum *) \
         <(cd elasticsearch/transform/latest_ioc/fields/ > /dev/null; md5sum *)

One idea I've been thinking about is using symbolic links to refer to field and pipeline definitions that would otherwise be duplicated. Symbolic links work well in Git, and on macOS and Linux. They are also supported on Windows, with appropriate settings (see explanations of the current support, and the earlier situation).
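
To make the symlink idea concrete, here is a minimal sketch run in a throwaway directory. It reuses the two fields directories from the check above; the file name fields.yml and its content are assumptions for illustration only, and nothing here reflects how elastic-package currently treats symbolic links.

```shell
#!/bin/sh
set -eu

# Hypothetical package layout in a temporary directory.
pkg=$(mktemp -d)
mkdir -p "$pkg/data_stream/indicator/fields" \
         "$pkg/elasticsearch/transform/latest_ioc/fields"

# The data stream keeps the single real copy (example content only).
printf '%s\n' '- name: example.field' '  type: keyword' \
  > "$pkg/data_stream/indicator/fields/fields.yml"

# A relative target is resolved from the directory that contains the link,
# so the link stays valid wherever the repository is checked out.
ln -s ../../../../data_stream/indicator/fields/fields.yml \
  "$pkg/elasticsearch/transform/latest_ioc/fields/fields.yml"

# Both paths now resolve to the same bytes; cmp exits non-zero otherwise.
cmp "$pkg/data_stream/indicator/fields/fields.yml" \
    "$pkg/elasticsearch/transform/latest_ioc/fields/fields.yml"
```

With this layout there is only one copy to edit, so the md5sum comparison above would no longer be needed for the linked files. On Windows, Git only checks links out as real symbolic links when its core.symlinks setting is enabled and the user is allowed to create them.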