bakdata / kpops

Deploy Kafka pipelines to Kubernetes
https://bakdata.github.io/kpops
MIT License

Defaults should be distributed across multiple files #55

Closed: philipp94831 closed this 8 months ago

philipp94831 commented 1 year ago

We should leverage our nested folder structure and allow placing a defaults.yaml in each folder. These files should be loaded automatically given a pipeline path, so no --defaults parameter should be necessary. Defaults should be merged from top to bottom, i.e., the topmost level is used as the baseline and child defaults are merged on top of it. This means that conflicting keys defined in more specific locations override keys in more general locations.
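To make the proposed precedence concrete, here is a minimal sketch of the merge, assuming each defaults.yaml has already been parsed into a plain dict and the layers are ordered from the most general folder to the most specific one. The names `deep_merge` and `merge_defaults` and the example keys are hypothetical and not part of the KPOps API.

```python
from functools import reduce
from typing import Any


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict in which keys from `override` win over keys from `base`.

    Nested dicts are merged recursively; any other value is replaced outright.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def merge_defaults(layers: list[dict[str, Any]]) -> dict[str, Any]:
    """Merge parsed defaults, ordered from most general to most specific."""
    return reduce(deep_merge, layers, {})


# Hypothetical example: a top-level defaults.yaml and a child defaults.yaml.
root_defaults = {"kafka-app": {"namespace": "default", "replicas": 1}}
child_defaults = {"kafka-app": {"replicas": 3}}

merged = merge_defaults([root_defaults, child_defaults])
# The child overrides `replicas`; `namespace` is inherited from the root.
```

Whether lists should be replaced or concatenated is left open here; this sketch replaces them.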

sujuka99 commented 1 year ago

By pipeline path, do you mean KPOPS_PIPELINE_BASE_DIR or KPOPS_PIPELINE_PATH?

What do you mean by "our nested folder structure"?

If I understand correctly, your idea is to allow defaults to be set at every level between the two paths mentioned above and to resolve conflicts based on location. This should apply to both environment-specific and general defaults, right?

Do you also want to allow defaults in subdirectories of KPOPS_PIPELINE_PATH? If yes, then --defaults would still be needed at least in config.yaml or as a flag in the cli.

To me, it seems like this would be cool to also implement for config.yaml and even maybe pipeline.yaml?

philipp94831 commented 1 year ago

@sujuka99 I mean KPOPS_PIPELINE_PATH.

There should not be any defaults in subfolders of the path, so I think the --defaults param can be removed. I would not change anything about config.yaml and pipeline.yaml for now.

disrupted commented 9 months ago

I wonder how this function should work in the future:

https://github.com/bakdata/kpops/blob/6121514ac255d435627cc418fedf7651bf74f984/kpops/components/base_components/base_defaults_component.py#L131-L135

Should it take a list of defaults.yaml paths? Currently, we have global defaults for each component, which makes enrichment straightforward. We consume this function in other applications.

philipp94831 commented 9 months ago

I guess passing a list of defaults is fine. The list of defaults is determined by the pipeline path. The environment defaults are then also just part of that list.
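As a rough sketch of how that list could be derived from the pipeline path: the function below walks from a base directory down to the folder containing the pipeline file and returns candidate paths, most general first. The function name, the `defaults_{environment}.yaml` naming, starting the walk at the base directory, and placing each environment file directly after the general file of the same folder are all assumptions made for illustration. It does not touch the filesystem, so callers would still need to skip paths that do not exist.

```python
from pathlib import Path
from typing import Optional


def defaults_candidates(
    pipeline_path: Path,
    base_dir: Path,
    environment: Optional[str] = None,
) -> list[Path]:
    """List candidate defaults files from `base_dir` down to the pipeline folder.

    Raises ValueError if the pipeline does not live below `base_dir`.
    """
    pipeline_dir = pipeline_path.parent
    relative = pipeline_dir.relative_to(base_dir)

    # Build the chain of directories: base_dir, base_dir/a, base_dir/a/b, ...
    directories = [base_dir]
    for part in relative.parts:
        directories.append(directories[-1] / part)

    candidates: list[Path] = []
    for directory in directories:
        candidates.append(directory / "defaults.yaml")
        if environment is not None:
            # Assumed naming for environment-specific defaults.
            candidates.append(directory / f"defaults_{environment}.yaml")
    return candidates


# Hypothetical layout: pipelines/team/app/pipeline.yaml
paths = defaults_candidates(
    Path("pipelines/team/app/pipeline.yaml"),
    base_dir=Path("pipelines"),
    environment="production",
)
```

Because later entries are more specific, the resulting list can be loaded and merged in order, so a function that accepts a list of paths would not need to know about folders or environments at all.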