Improve function execution time

kptdev / kpt

Automate Kubernetes Configuration Editing

https://kpt.dev

Apache License 2.0

Improve function execution time #2469

Open howardjohn opened 3 years ago

howardjohn commented 3 years ago

Describe your problem

Pipeline execution times are longer than expected for simple functions. For example, running gcr.io/kpt-fn/set-namespace:v0.1 against a folder containing a cert-manager installation manifest (including large CRDs) takes about 5s on my machine.

This includes roughly:

1.7s doing docker pull
1s in copyCommentsAndSyncOrder, because we fetch the OpenAPI schema for each resource
1.7s to actually run the function; even docker run alpine takes ~0.7s on my machine
0.3s doing misc tasks

Opening this issue to track performance improvements. So far it seems like changing the default image-pull-policy to match Kubernetes behavior would be a big contributor. copyCommentsAndSyncOrder seems like another candidate for optimization.
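One way to picture the copyCommentsAndSyncOrder idea is memoizing the schema lookup so the expensive fetch runs once per resource type rather than once per resource. The sketch below is only an illustration: typeKey, schemaCache, and the load callback are hypothetical names, not kpt or kyaml APIs, and it assumes the schema depends only on apiVersion and kind.

```go
package main

import (
	"fmt"
	"sync"
)

// typeKey identifies a resource type. Hypothetical; not a kpt or kyaml type.
type typeKey struct {
	APIVersion string
	Kind       string
}

// schemaCache memoizes the result of an expensive per-type lookup.
type schemaCache struct {
	mu     sync.Mutex
	byType map[typeKey]string
	load   func(typeKey) string // stand-in for the expensive schema fetch
	loads  int                  // number of times load actually ran
}

func newSchemaCache(load func(typeKey) string) *schemaCache {
	return &schemaCache{byType: map[typeKey]string{}, load: load}
}

// get returns the cached schema for k, calling load only on the first request.
func (c *schemaCache) get(k typeKey) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	if s, ok := c.byType[k]; ok {
		return s
	}
	s := c.load(k)
	c.loads++
	c.byType[k] = s
	return s
}

func main() {
	cache := newSchemaCache(func(k typeKey) string {
		return "schema for " + k.APIVersion + "/" + k.Kind
	})
	crd := typeKey{APIVersion: "apiextensions.k8s.io/v1", Kind: "CustomResourceDefinition"}
	for i := 0; i < 100; i++ {
		cache.get(crd)
	}
	fmt.Println("lookups: 100, loads:", cache.loads)
}
```

With many resources of the same kind in a package, the fetch cost would then scale with the number of distinct types instead of the number of resources.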

howardjohn commented 3 years ago

Seems like 10% of CPU time (i.e. not including docker run time) is spent in reflect.DeepEqual on spec.Schema, always comparing it to spec.Schema{}. Maybe a smarter IsEmpty could improve this.
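A rough sketch of what a "smarter IsEmpty" could look like, using a hypothetical four-field schema struct as a stand-in for spec.Schema (the real type has many more fields, so this is not a drop-in replacement). The field-by-field check avoids reflection, but note it also treats empty non-nil slices and maps as empty, which DeepEqual against the zero value does not.

```go
package main

import (
	"fmt"
	"reflect"
)

// schema is a hypothetical, much smaller stand-in for spec.Schema.
type schema struct {
	ID         string
	Type       []string
	Required   []string
	Properties map[string]schema
}

// isEmptyReflect mirrors the pattern described above: compare against the zero value.
func isEmptyReflect(s schema) bool {
	return reflect.DeepEqual(s, schema{})
}

// isEmptyFields checks each field directly, without reflection.
// Unlike DeepEqual, it treats empty non-nil slices and maps as empty too.
func isEmptyFields(s schema) bool {
	return s.ID == "" &&
		len(s.Type) == 0 &&
		len(s.Required) == 0 &&
		len(s.Properties) == 0
}

func main() {
	zero := schema{}
	typed := schema{Type: []string{"object"}}
	fmt.Println(isEmptyReflect(zero), isEmptyFields(zero))
	fmt.Println(isEmptyReflect(typed), isEmptyFields(typed))
}
```

Whether the difference on empty non-nil slices matters would depend on how callers construct schemas, so a real change would need a test over the full struct.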

howardjohn commented 3 years ago

profile.tar.gz: profile of set-namespace running. 37% (0.5s) in IsCertainlyClusterScoped is suspicious. I don't even think we actually use the result of that function...

howardjohn commented 3 years ago

Dropping IsCertainlyClusterScoped cuts execution time from 2.3s to 0.7s. Big improvement! But we do use it in GetMatchingResourcesByCurrentId for resid.ResId.Equals. Seems like something we could improve though.

If we really need to read the full schema via openapi.parseBuiltinSchema, maybe we can prebuild a Go struct (autogenerated?) instead of parsing JSON.

Or just precompute a list of GVK -> bool (cluster scoped or not). A test can keep them in sync.
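A minimal sketch of the precomputed GVK -> bool idea, assuming a hand-written (or generated) table. The gvk type, the clusterScoped map, and isClusterScoped are hypothetical names for illustration, and the table holds only a few well-known built-in kinds rather than the full list.

```go
package main

import "fmt"

// gvk is a hypothetical key type: group, version, kind.
type gvk struct {
	Group, Version, Kind string
}

// clusterScoped is a precomputed table: true means cluster scoped,
// false means namespaced. Only a handful of built-in kinds are listed here.
var clusterScoped = map[gvk]bool{
	{Group: "", Version: "v1", Kind: "Namespace"}:                            true,
	{Group: "", Version: "v1", Kind: "Node"}:                                 true,
	{Group: "", Version: "v1", Kind: "ConfigMap"}:                            false,
	{Group: "apps", Version: "v1", Kind: "Deployment"}:                       false,
	{Group: "rbac.authorization.k8s.io", Version: "v1", Kind: "ClusterRole"}: true,
}

// isClusterScoped reports the scope of k and whether k is in the table at all.
// Unknown kinds (for example custom resources) return known == false.
func isClusterScoped(k gvk) (scoped, known bool) {
	scoped, known = clusterScoped[k]
	return scoped, known
}

func main() {
	for _, k := range []gvk{
		{"", "v1", "Namespace"},
		{"apps", "v1", "Deployment"},
		{"example.com", "v1", "Widget"},
	} {
		scoped, known := isClusterScoped(k)
		fmt.Printf("%s/%s %s: clusterScoped=%v known=%v\n", k.Group, k.Version, k.Kind, scoped, known)
	}
}
```

The sync test mentioned above would then parse the real schema once in the test binary and compare every entry against this table, so the JSON parsing cost stays out of the normal execution path.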

bgrant0607 commented 3 years ago