kubernetes-sigs / cluster-api

Home for Cluster API, a subproject of sig-cluster-lifecycle
https://cluster-api.sigs.k8s.io
Apache License 2.0
3.52k stars 1.3k forks

Automatic CA rotation in CAPI #7721

Open furkatgofurov7 opened 1 year ago

furkatgofurov7 commented 1 year ago

User Story

As a [developer/user/operator], I would like to rotate a k8s cluster CA. Today this is a manual process that involves many steps, restarts (rolling upgrades) of pods, and updates to other resources (config maps, secrets, service accounts): https://kubernetes.io/docs/tasks/tls/manual-rotation-of-ca-certificates/ With CAPI and its ability to deploy many target clusters from a management cluster, I am looking for available options to do the CA rotation at scale (a manual operation on each cluster would be very costly). It would be interesting to know how the community is addressing this issue. Are there any external open-source tools that could be used to tackle this challenge?

Detailed Description

There are also some cases in which the CA of the target clusters might be different from that of the management cluster.

Some use cases:

Ideally, CAPI would offer built-in support for doing CA rotation at scale.

Anything else you would like to add: I checked the automatic certificate rotation introduced in https://github.com/kubernetes-sigs/cluster-api/pull/6983, which covers control plane machines only and essentially tackles part of the original issue on certificate management in https://github.com/kubernetes-sigs/cluster-api/issues/5490


/kind feature

furkatgofurov7 commented 1 year ago

Tagging folks who were involved in the referenced issues/PRs to see how we can move forward with this issue.

/cc @fabriziopandini @ykakarap @sbueringer

fabriziopandini commented 1 year ago

/triage accepted

I think this will require a proposal...

Just as a historical note, this was one of the use cases for which we discussed the idea of a kubeadm operator, which never caught traction.

furkatgofurov7 commented 1 year ago

> I think this will require a proposal...
>
> Just as a historical note, this was one of the use cases for which we discussed the idea of a kubeadm operator, which never caught traction.

@fabriziopandini thanks, I had not heard about the kubeadm operator before. I will look around for it to grasp the initial idea (if you could share any references, that would also be great).

furkatgofurov7 commented 1 year ago

Oh found it, maybe this one: https://hackmd.io/@QlB2bmbhS-aeuDlwOCH9Yw/HkidAVXlS

furkatgofurov7 commented 1 year ago

I found out that there is wider interest in this, and https://github.com/kubernetes-sigs/cluster-api/issues/7044 is also gathering use cases related to this problem.

furkatgofurov7 commented 1 year ago

cc @smoshiur1237

Zhupku commented 1 year ago

Hi @furkatgofurov7 and @fabriziopandini,

I'm developing a product based on CAPI, and I would like to leverage CAPI's ability to rotate the CA.

Based on the above discussion, I understand the feature is not available right now? As far as I can see, CA rotation is an important feature. May I know why this feature is blocked? I have investigated the native k8s support for CA rotation. Is there any technical blocker to implementing CA rotation in CAPI?

As I'm new to this project, could you please give me some help?

Thanks

fabriziopandini commented 1 year ago

You are correct, this is an important feature, and unfortunately, it is not available yet. However, nothing blocks you or someone else from working on this topic, which IMO requires a proposal where we describe how to do something similar to https://kubernetes.io/docs/tasks/tls/manual-rotation-of-ca-certificates/ while respecting CAPI constraints (e.g. immutability / no direct access to the machines).

This requires some research...
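
As a rough illustration of one step in the linked manual procedure, the transition period relies on a trust bundle that carries both the old and the new CA certificate, so that components trust either one while machines are rolled. The following is a minimal sketch of assembling such a bundle, not CAPI functionality: the helper names (`split_pem`, `to_pem`, `build_ca_bundle`) are hypothetical, and how the bundle would be delivered to machines under CAPI's immutability constraints is left open.

```python
import base64
import hashlib
import re

# Matches one PEM certificate block and captures its base64 body.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----\s*(.+?)\s*-----END CERTIFICATE-----",
    re.DOTALL,
)


def split_pem(pem_text):
    """Return the decoded (DER) bytes of every certificate block in pem_text."""
    return [
        base64.b64decode("".join(body.split()))
        for body in PEM_BLOCK.findall(pem_text)
    ]


def to_pem(der):
    """Encode DER bytes as a PEM certificate block with 64-character lines."""
    b64 = base64.b64encode(der).decode("ascii")
    lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    return (
        "-----BEGIN CERTIFICATE-----\n"
        + "\n".join(lines)
        + "\n-----END CERTIFICATE-----\n"
    )


def build_ca_bundle(old_pem, new_pem):
    """Concatenate old and new CA certificates, dropping exact duplicates.

    Duplicates are detected by the SHA-256 digest of the decoded bytes,
    so re-running the step with an already merged bundle is harmless.
    """
    seen = set()
    blocks = []
    for der in split_pem(old_pem) + split_pem(new_pem):
        fingerprint = hashlib.sha256(der).hexdigest()
        if fingerprint not in seen:
            seen.add(fingerprint)
            blocks.append(to_pem(der))
    return "".join(blocks)
```

The sketch only manipulates PEM text; it does not validate that the blocks are real X.509 certificates or check their expiry.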

BarthV commented 1 year ago

Copy-pasted from my last Kubernetes Slack message:

So after spending days on this topic, I've finally found the least awful way to rotate the CA in a CAPI-managed cluster. This method relies on a 3-phase machine rollout, including a "big-bang" live certificate renewal on the control planes.

Phase 1 :

Phase 2 "bigbang" (on all CP nodes at the same time) :

Phase 3 :

And tadaaam ... it works.

This is still very "manual" (I really hate SSH & remote actions), but we're facing multiple CAPI & kubeadm limitations here. They are preventing us from automating this CA rollout properly, 1 node at a time.

fabriziopandini commented 1 year ago

It would be great to document this in the book...

fabriziopandini commented 5 months ago

/priority important-longterm