kubernetes-sigs / kustomize

Customization of kubernetes YAML configurations
Apache License 2.0

Apply KRM functions in batches #5673

Open Homulvas opened 2 months ago

Homulvas commented 2 months ago

What would you like to have added?

Following the suggestion in https://github.com/kubernetes-sigs/kustomize/issues/5173, we have implemented our own resource transformers. While this generally works, we have run into a performance issue: for a big enough target, each transformer takes roughly a second or two to process. That alone wouldn't be so bad, but each additional transformer adds to the runtime linearly. This leads to cases where applying the transformers takes up the majority of the whole build.

Why is this needed?

Piping the complete input/output for each KRM function separately is inefficient and makes builds very slow for big enough targets.

Can you accomplish the motivating task without this feature, and if so, how?

One possible workaround is to have all transformations inside a single transformer file. However, this makes the transformers hard to use as you have to treat them with special care.
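To make the workaround concrete, here is a minimal sketch of folding several transformations into one function so that only a single invocation is needed. It is not kustomize code: `Resource`, `Transformer`, `Chain`, `AddLabel` and `PrefixName` are hypothetical names invented for illustration, and a real transformer would work on parsed YAML nodes rather than a flat map.

```go
package main

import "fmt"

// Resource is a hypothetical stand-in for one parsed Kubernetes object.
type Resource map[string]string

// Transformer is a hypothetical in-memory transformation step.
type Transformer func([]Resource) []Resource

// Chain combines several transformers into one, so the caller only has to
// invoke (and pipe data to) a single function.
func Chain(steps ...Transformer) Transformer {
	return func(items []Resource) []Resource {
		for _, step := range steps {
			items = step(items)
		}
		return items
	}
}

// AddLabel returns a transformer that sets one key on every resource.
func AddLabel(key, value string) Transformer {
	return func(items []Resource) []Resource {
		for _, r := range items {
			r[key] = value
		}
		return items
	}
}

// PrefixName returns a transformer that prepends prefix to every name.
func PrefixName(prefix string) Transformer {
	return func(items []Resource) []Resource {
		for _, r := range items {
			r["name"] = prefix + r["name"]
		}
		return items
	}
}

func main() {
	combined := Chain(AddLabel("team", "payments"), PrefixName("prod-"))
	out := combined([]Resource{{"name": "web"}, {"name": "db"}})
	fmt.Println(out)
}
```

The downside mentioned above shows up here as well: the individual steps can no longer be listed and configured one by one; they have to be wired together in code.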

What other solutions have you considered?

A workaround has been described above.

Anything else we should know?

No response

koba1t commented 4 days ago

Hi @Homulvas, thanks for submitting the feature request. I understand what your problem is.

I'm sorry, but I don't understand what you mean by "batches" in your request.

My guess is that your transformer takes a few minutes to start, and you want to process many YAML files all at once. Is that right?

/triage needs-information

Homulvas commented 3 days ago

> I'm sorry, but I don't understand what you mean by "batches" in your request.

I may have worded the request poorly. The crux of the issue is that a lot of unnecessary I/O is done when there are multiple transformations.

> My guess is that your transformer takes a few minutes to start, and you want to process many YAML files all at once. Is that right?

The issue is that it's always virtually the same YAML. You read a big YAML input, apply a small transformation, and write the output. For each separate transformation, only the transform step should be repeated.
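A toy model of the difference, in case it helps. This is only a sketch under assumptions: JSON from the Go standard library stands in for YAML, `roundTrip` stands in for piping the full resource list to a function and reading it back, and all names (`Resource`, `Transform`, `RunSeparately`, `RunBatched`, `SetKey`) are hypothetical rather than kustomize APIs.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Resource is a hypothetical stand-in for one parsed Kubernetes object.
type Resource map[string]string

// Transform is a hypothetical in-memory transformation step.
type Transform func([]Resource) []Resource

// roundTrip serializes and re-parses the items, standing in for piping the
// complete resource list through an external function. It counts each pass.
func roundTrip(items []Resource, passes *int) []Resource {
	data, err := json.Marshal(items)
	if err != nil {
		panic(err)
	}
	var out []Resource
	if err := json.Unmarshal(data, &out); err != nil {
		panic(err)
	}
	*passes++
	return out
}

// RunSeparately pays one full serialize/parse pass per transform.
func RunSeparately(items []Resource, steps []Transform) ([]Resource, int) {
	passes := 0
	for _, step := range steps {
		items = step(roundTrip(items, &passes))
	}
	return items, passes
}

// RunBatched pays a single pass and repeats only the transform step.
func RunBatched(items []Resource, steps []Transform) ([]Resource, int) {
	passes := 0
	items = roundTrip(items, &passes)
	for _, step := range steps {
		items = step(items)
	}
	return items, passes
}

// SetKey returns a transform that sets one key on every resource.
func SetKey(key, value string) Transform {
	return func(items []Resource) []Resource {
		for _, r := range items {
			r[key] = value
		}
		return items
	}
}

func main() {
	steps := []Transform{SetKey("a", "1"), SetKey("b", "2"), SetKey("c", "3")}
	input := []Resource{{"name": "web"}, {"name": "db"}}

	_, separate := RunSeparately(input, steps)
	_, batched := RunBatched(input, steps)
	fmt.Println("passes when run separately:", separate)
	fmt.Println("passes when batched:", batched)
}
```

Both variants produce the same resources; only the number of passes over the full input differs, and for the separate variant it grows with the number of transforms.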

koba1t commented 2 days ago

Sorry, I still don't understand what you want. What do you think the problem is?

Do you care about the call overhead of a custom transformer?

I think the current custom transformer interface already looks like it can do batch processing: https://github.com/kubernetes-sigs/kustomize/blob/e3a7615ccb84506cee74e576d05e636aaa4542ad/kyaml/fn/framework/framework.go#L16-L58