lentzi90 closed this issue 1 year ago
@lentzi90: The label(s) /label area/control-plane cannot be applied. These labels are supported: api-review, tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash, team/katacoda, refactor. Is this label configured under labels -> additional_labels or labels -> restricted_labels in plugin.yaml?
/triage accepted
Great finding!
Thank you very much.
Sounds reasonable.
I think we can either extend the ClusterCacheTracker in some way to store the etcd client credentials, or keep a smaller copy of it just for etcd certs in KCP.
I slightly lean towards the first option.
The tricky parts are the things we already solved in ClusterCacheTracker.
It's not super trivial to implement this without missing something, which is why I lean towards extending the ClusterCacheTracker.
@vincepri Opinions?
Now that I'm looking at it again, there's probably also a way to avoid doing this 4x in the same Reconcile call (which alone should already improve it by 4x).
This is a great finding, kudos to @lentzi90!
WRT the solution, I'm OK with storing the etcd client cert in the cluster cache tracker, though I would keep it separate from the clusterAccessor so we are not recreating certificates every time there is a connection issue.
I think that's where it becomes complicated. If we don't store it in the clusterAccessor, we need separate health checking for etcd client certs and additional logic to remove them when the Cluster is deleted. Without health checking, we would never re-create the certificates once they become invalid.
But an alternative could also be to just store them in some map with a TTL and re-create them when we run into a connection error. That might be a lot simpler.
But I think we need some research there to see what exactly our options are.
Ensuring we only generate certs once per reconcile sounds more straightforward and would already give us a 4x improvement.
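The once-per-reconcile change could be as small as a memoized getter shared by the call sites within one Reconcile (a hypothetical helper, not existing KCP code):

```go
package main

import (
	"fmt"
	"sync"
)

// lazyCert memoizes client-cert generation for the lifetime of a single
// Reconcile call: the first caller pays the generation cost, the remaining
// call sites reuse the result.
type lazyCert struct {
	once sync.Once
	cert []byte
	err  error
	gen  func() ([]byte, error)
}

// Get generates the cert on first use and returns the cached result afterwards.
func (l *lazyCert) Get() ([]byte, error) {
	l.once.Do(func() { l.cert, l.err = l.gen() })
	return l.cert, l.err
}

func main() {
	calls := 0
	lc := &lazyCert{gen: func() ([]byte, error) { calls++; return []byte("cert"), nil }}
	// Stand-in for the four places in one Reconcile that need the cert.
	for i := 0; i < 4; i++ {
		lc.Get()
	}
	fmt.Println(calls) // 1 instead of 4
}
```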
Agreed, we need some research (also on the caveats you listed above). I'll be happy to take a stab at it as soon as I manage to trim down my backlog to something reasonable (hopefully by EOW). If anyone gets to this before me, feel free to pick it up.
/help
@fabriziopandini: This request has been marked as needing help from a contributor.
Please ensure that the issue body includes answers to the following questions:
For more details on the requirements of such an issue, please see here and ensure that they are met.
If this request no longer meets these requirements, the label can be removed by commenting with the /remove-help command.
/assign
I have created a minimal fix that uses the ClusterCacheTracker to store the private key used in generatedClientCert, so the most expensive step in the above flame graph will be executed once per cluster instead of 4 times for each reconcile.
@sbueringer PTAL when you have some bandwidth. @lentzi90 it would be great if you could validate this PR (if you are not using main, you can cherry-pick the commit on top of your version).
Thanks for the patch @fabriziopandini ! I have verified it and commented on the PR directly with findings :slightly_smiling_face:
What steps did you take and what happened?
When scaling the number of workload clusters, the KCP controller stands out as extremely CPU hungry. I have been investigating this a bit closer now and found some hints about why this is. This is what I did:
I took profiling samples using this command:
I also created a simple dashboard in grafana to check CPU usage and workqueue metrics.
Already at 10 clusters the KCP controller has a much higher CPU usage than the other controllers, as seen here:
I think the flame graph shows quite clearly what this is about. Here is the profile.pb.gz, in case you want to investigate more.
We generate new client certificates all the time, and this is quite a CPU-intensive operation. For large numbers of KCPs it becomes crazy!
What did you expect to happen?
I'm hoping the KCP controller could be optimized to reduce the CPU requirements for larger numbers of clusters. Probably this will require some kind of cache for re-using client certificates used for connecting to the workload clusters. There is a TODO comment along these lines in the code that I found also: https://github.com/kubernetes-sigs/cluster-api/blob/7edbaf04263c77177ae68f4d64115d207fadb7eb/controlplane/kubeadm/internal/cluster.go#L104
Cluster API version
Main branch (546f46464da13a98b87d469924b407fafe088df8)
Kubernetes version
Anything else you would like to add?
Related issues:
#7308
#8052
Label(s) to be applied
/kind bug
/label area/control-plane