elastic / elastic-integration-corpus-generator-tool

Command line tool used for generating events corpus dynamically given a specific integration

Should we allow overriding packages values, and if yes, how? #49

Open endorama opened 1 year ago

endorama commented 1 year ago

There is no need to override values in the generation configuration at the moment, but once the work on adding this generation to elastic-package is completed, such overrides may be needed.

Once this tool is leveraged through elastic-package, it will consume packages as its configuration source.

There are 4 main scenarios that I foresee:

  1. generation is triggered for a package@version; this is the simple case: the package is downloaded, its files are extracted, and the generator is configured
  2. generation is triggered for a package@main; in this case we want to target an unreleased package in the integrations repo
  3. generation is triggered for a package@branch
  4. generation is triggered for a local package not yet committed nor pushed to integrations repo

Which of these cases (other than no. 1) will we support?
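
To make the four scenarios concrete, here is a minimal Go sketch of how a generation target could be classified. Everything in it is an assumption for illustration: `SourceKind`, `Target`, and `ParseTarget` are hypothetical names, not part of this tool or of elastic-package. It also assumes that released versions look like `X.Y.Z` and that a spec without `@` is a local path.

```go
package main

import (
	"fmt"
	"strings"
)

// SourceKind is a hypothetical label for where a package comes from.
type SourceKind int

const (
	Released   SourceKind = iota // scenario 1: package@version
	MainBranch                   // scenario 2: package@main
	Branch                       // scenario 3: package@branch
	LocalDir                     // scenario 4: local, uncommitted package
)

// Target is a hypothetical parsed form of a generation target.
type Target struct {
	Kind SourceKind
	Name string // package name, or a filesystem path for LocalDir
	Ref  string // version or branch name; empty for LocalDir
}

// isReleaseVersion reports whether ref has the assumed X.Y.Z numeric form.
func isReleaseVersion(ref string) bool {
	parts := strings.Split(ref, ".")
	if len(parts) != 3 {
		return false
	}
	for _, p := range parts {
		if p == "" {
			return false
		}
		for _, c := range p {
			if c < '0' || c > '9' {
				return false
			}
		}
	}
	return true
}

// ParseTarget maps a spec such as "aws@1.2.3" onto one of the four scenarios.
// A spec without "@" is treated as a local directory (an assumption).
func ParseTarget(spec string) (Target, error) {
	if spec == "" {
		return Target{}, fmt.Errorf("empty target")
	}
	name, ref, found := strings.Cut(spec, "@")
	if !found {
		return Target{Kind: LocalDir, Name: spec}, nil
	}
	if name == "" || ref == "" {
		return Target{}, fmt.Errorf("malformed target %q", spec)
	}
	switch {
	case ref == "main":
		return Target{Kind: MainBranch, Name: name, Ref: ref}, nil
	case isReleaseVersion(ref):
		return Target{Kind: Released, Name: name, Ref: ref}, nil
	default:
		return Target{Kind: Branch, Name: name, Ref: ref}, nil
	}
}

func main() {
	for _, spec := range []string{"aws@1.2.3", "aws@main", "aws@my-feature", "./packages/aws"} {
		t, err := ParseTarget(spec)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> kind=%d name=%s ref=%s\n", spec, t.Kind, t.Name, t.Ref)
	}
}
```

Only the `Released` case can be served purely from a published package; the other three need access to a repository checkout or a local directory, which is what the question about support comes down to.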

/cc @aspacca

ruflin commented 1 year ago

I expect the first use case to be 4. It can indirectly be used to do 1-3 if needed for automation.

aspacca commented 1 year ago

> 4. generation is triggered for a local package not yet committed nor pushed to integrations repo

if we rephrase this as "generation is triggered for a package@commit", then maybe we can change the scope to starting the package registry locally and being able to fetch from different package registries

I have to look in the existing elastic-package codebase in order to understand what the tool already offers in terms of fetching an integration locally/from an alternative package registry

this is something that anyway should be taken care of by the elastic-package generate command, not by the generator tool

endorama commented 1 year ago

> generation is triggered for a package@commit

Where would it be committed though? Local branch? Do we want to require a package-registry in all cases? In my experience what elastic-package does with build and stack up is not that reliable and adds a lot of additional complexity (restarting the registry for example). With local I was sort of implying "not from a registry", to allow for running it against unpublished modifications (I foresee this being useful for testing/experimenting/support, not for other use cases).

> this is something that anyway should be taken care of by the elastic-package generate command, not by the generator tool

Would this mean that the tool always reads from the filesystem and how files get there is not this tool's concern? I think that would be a great simplification, but currently this CLI's generate command already reads from a package registry.

aspacca commented 1 year ago

> Where would it be committed though? Local branch? Do we want to require a package-registry in all cases?

I was expecting that elastic-package already manages running some of its commands from a local branch commit

> In my experience what elastic-package does with build and stack up is not that reliable and adds a lot of additional complexity (restarting the registry for example).

I was not aware of that :) I'd be more inclined to fix/improve any problems/complexity present in elastic-package for a "feature" that's already present there, rather than implementing the equivalent feature in the corpus generator in order to overcome the limits in elastic-package

> Would this mean that the tool always reads from the filesystem and how files get there is not this tool's concern? I think that would be a great simplification, but currently this CLI's generate command already reads from a package registry.

either from filesystem or memory: I would not mix the use of the generator as a CLI tool and as a package to be consumed. The CLI generate command does indeed read from a package registry, but this is specific to the CLI command's behaviour: when calling the same method in the genlib package that the CLI generate command calls, you have to provide Config and Fields, and it's up to you how to generate/fetch them (see horde for example)

that's another reason, for me, to keep all these related operations in elastic-package

ruflin commented 1 year ago

> if we rephrase this as "generation is triggered for a package@commit", then maybe we can change the scope to starting the package registry locally and being able to fetch from different package registries

I still struggle to understand why this would be needed. As a dev, I'm building a package and have made some changes to the ingest pipeline and templates. Before I commit anything, I want to generate data and test with it?

aspacca commented 1 year ago

> I still struggle to understand why this would be needed. As a dev, I'm building a package and have made some changes to the ingest pipeline and templates. Before I commit anything, I want to generate data and test with it?

package@commit was indeed an improper definition. I meant exactly the use case you described, with a local change to a package, which to my knowledge is a case that elastic-package already handles.

my opinion is to rely on (and, if needed, improve) what elastic-package already provides to manage such a case, rather than adding anything to do the same directly in the corpus generator