kubernetes-sigs / kubespray

Deploy a Production Ready Kubernetes Cluster
Apache License 2.0

Optimize logic of k8s-certs-renew.sh script #11164

Open cqmmm opened 4 months ago

cqmmm commented 4 months ago

What would you like to be added

When we deploy the cluster, we set auto_renew_certificates to true, and the renewal timer can be seen through systemctl list-timers. According to the timer's OnCalendar setting, the k8s-certs-renew.sh script runs once a month.

We found that after running kubeadm certs check-expiration, the script immediately runs the renewal and then rebuilds the control plane pods (kube-apiserver, etc.). Is this the best practice? Would it be more appropriate to add a check: after running kubeadm certs check-expiration, determine from the output how much time remains before the certificates expire, and renew only when they are close to expiration?
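
To make the proposed check concrete, here is a minimal sketch of the decision step only, written in Python for illustration (the real script is a shell script). The function name `should_renew`, the 30-day threshold, and the sample dates are all hypothetical; how the expiry dates would be obtained from kubeadm is deliberately left out.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold: renew once less than 30 days of validity remain.
RENEW_THRESHOLD = timedelta(days=30)


def should_renew(expiry_dates, now, threshold=RENEW_THRESHOLD):
    """Return True if any certificate expires within `threshold` of `now`."""
    return any(expiry - now < threshold for expiry in expiry_dates)


# Illustrative dates only, not taken from a real cluster.
now = datetime(2024, 5, 1, tzinfo=timezone.utc)
fresh = [datetime(2025, 4, 30, tzinfo=timezone.utc)]           # 364 days left
nearly_expired = [datetime(2024, 5, 20, tzinfo=timezone.utc)]  # 19 days left

print(should_renew(fresh, now))           # False: skip renewal and restarts
print(should_renew(nearly_expired, now))  # True: renew now
```

With a gate like this, the monthly timer could keep running while the control plane pods would only be rebuilt in the month where the threshold is crossed.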

Why is this needed

Frequent renewals imply multiple restarts of control plane components, which we believe carries some risk.

ErikJiang commented 4 months ago

You can adjust the frequency of certificate renewal using the auto_renew_certificates_systemd_calendar parameter.

Payback159 commented 4 months ago

Hi @cqmmm ,

I've looked at this too. The problem is that kubeadm certs check-expiration currently does not produce output from which the remaining lifetime of the certificates can be read reliably. You would have to parse a few values with awk, for example, and after every change to the output format you would have to verify again that the check still works.

I also checked, and apparently kubeadm is already working on supporting other output formats such as YAML or JSON (which would make parsing the information easier), but as @ErikJiang already mentioned, your problem would probably be solved by configuring auto_renew_certificates_systemd_calendar.

As an example, you could use * *-01,07-01 00:00:00; the script would then run only every 6 months. In my test case, the kubeadm certificates are valid for 364d. The certificates would thus be renewed about twice as often as strictly necessary, and you are 100% sure that they will not expire before the script renews them.
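
The margin behind that claim can be worked out with a few lines of Python. This is only a sketch: it assumes the timer fires on 1 January and 1 July, that every run succeeds, and that each renewal issues certificates valid for the 364 days mentioned above. The year range is arbitrary and `run_dates` is a hypothetical helper.

```python
from datetime import date, timedelta

CERT_VALIDITY = timedelta(days=364)  # lifetime reported in the comment above


def run_dates(first_year, last_year):
    """Dates on which a timer firing on 1 January and 1 July would run."""
    return [date(year, month, 1)
            for year in range(first_year, last_year + 1)
            for month in (1, 7)]


runs = run_dates(2024, 2026)
# Days between each pair of consecutive runs.
gaps = [later - earlier for earlier, later in zip(runs, runs[1:])]
longest_gap = max(gaps)
# Validity still left on the certificates when the next run renews them.
margin = CERT_VALIDITY - longest_gap

print(longest_gap.days, margin.days)  # 184 180
```

Even in the longest interval (July to January), the certificates would still have about 180 days left when they are renewed.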

Payback159 commented 4 months ago

A logic extension of the script might make sense if kubespray runs on kubeadm v1.30 (which I think also implies Kubernetes v1.30), since kubeadm apparently supports structured output for the kubeadm certs commands as of v1.30: https://github.com/kubernetes/kubernetes/pull/123372
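
As a rough illustration of why structured output would help, the sketch below filters certificates by remaining lifetime from a JSON document. Everything about the input is an assumption: the sample is hand-written, and the field names (`certificates`, `name`, `residualTime` in seconds) are guesses that would need to be checked against what kubeadm v1.30 actually emits. `expiring_soon` is a hypothetical helper.

```python
import json

# Hand-written sample, NOT real kubeadm output. Field names are assumptions.
SAMPLE = """
{
  "certificates": [
    {"name": "apiserver", "residualTime": 31449600},
    {"name": "admin.conf", "residualTime": 1728000}
  ]
}
"""

SECONDS_PER_DAY = 86400


def expiring_soon(raw_json, min_days):
    """Return names of certificates with fewer than `min_days` days left."""
    info = json.loads(raw_json)
    limit = min_days * SECONDS_PER_DAY
    return [cert["name"]
            for cert in info.get("certificates", [])
            if cert["residualTime"] < limit]


# 31449600 s is 364 days, 1728000 s is 20 days.
print(expiring_soon(SAMPLE, 30))  # ['admin.conf']
```

Compared with parsing the table output with awk, a check like this would depend on named fields rather than on column positions.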

champtar commented 4 months ago

If you think this is risky, you should run it more often, not less, but on your schedule, so you find out what fails and fix it. Given the lack of negative feedback over the years I personally think it's safe.

MrFreezeex commented 4 months ago

Also note that before this option even existed, the Kubespray stance was that people should upgrade at least once a year to stay on a supported version, and since an upgrade also renews the certificates, this option was largely unnecessary. So if your use case fits that (i.e., you didn't reduce the cert lifetime), you could still fall back to not using this option at all.

Although if you find a way to improve the existing logic, feel free!

k8s-triage-robot commented 1 month ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

Cassandrahan commented 2 weeks ago

You can adjust the frequency of certificate renewal using the auto_renew_certificates_systemd_calendar parameter.

Are there any best practices?