nicolas-goudry opened this issue 9 months ago
Hi @nicolas-goudry, maybe I didn't fully understand your context, but I am a bit confused about your motivation: why would you want to generate certificates outside of kubeadm/kubespray, yet still have Kubespray do the heavy lifting of generating certs with a new role?
I would understand if you were doing something along the lines of "here is an existing intermediate CA that integrates with my PKI, please Kubespray use that". Although I still think this might be a disputed idea, since Kubernetes doesn't handle cert revocation, which means an isolated PKI for each Kubernetes cluster might still be the best choice...
About your issue with trusting the Kubernetes PKI: AFAIK you shouldn't need this. Everywhere you have some kind of token/client cert to contact the Kubernetes API server, the associated CA is available as well, so that clients like kubectl can verify the chain of trust. So for instance, the failing curl command you posted is pretty common in the Kubernetes world, even more so when you have a distinct PKI per Kubernetes cluster, which is also pretty common and even somewhat recommended IMO. So in theory this should work like the regular certs that kubespray/kubeadm generate; however, you need to place them in the correct places on the filesystem and make sure that the tools you might shortcut (because you would be handling the certificates yourself) don't prevent Kubernetes from being functional. NGL this sounds kinda hard, and you would probably need to make small changes/conditions all over the place in Kubespray.
So in a nutshell, my opinion is that this needs to be properly motivated to be integrated, as I fear it might add too much complexity to Kubespray for little benefit, and might even encourage bad practices for users who see this and start using it... I hope you won't find this too harsh a comment, as it already looks like a lot of work considering the amount of detail you put into your issue.
Hi @MrFreezeex, first of all thanks for your thorough answer. Don't worry, no hard feelings about your opinion; I believe we are here to discuss and debate in a sane way. Everyone has their own point of view and I respect that.
Regarding the motivation: in a nutshell, I spin up clusters for my customers using Kubespray. Some of them have expressed the desire to use their own CA to sign their clusters' certificates.
BTW I use the term “root CA” to describe the CA used to sign the intermediate cluster CAs, but it will likely be an intermediate CA generated by the customer. It would end up being a trust chain looking something like this:
root CA                         # customer-owned, not shared with me
└── intermediate CA             # customer-owned, shared with me and provided to custom role
    ├── kubernetes CA           # generated by custom role
    │   └── kubernetes certs    # generated by kubeadm (right?)
    ├── etcd-ca CA              # generated by custom role
    │   └── etcd certs          # generated by Kubespray (Bash script)
    └── front-proxy-ca CA       # generated by custom role
        └── front-proxy certs   # generated by kubeadm (right?)
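(For reference, a chain like this can be sanity-checked with openssl. A minimal sketch, assuming the customer files are named root-ca.crt and intermediate-ca.crt (hypothetical names) and that the role writes the cluster CA to /etc/kubernetes/ssl/ca.crt:)
# Check that the generated kubernetes CA chains up to the customer root: the root is
# the trust anchor, the customer intermediate is supplied as an untrusted intermediate,
# and the cluster CA is the certificate being verified.
openssl verify -CAfile root-ca.crt -untrusted intermediate-ca.crt /etc/kubernetes/ssl/ca.crt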
TBH, I don't understand myself why someone would need to do such a thing, for all the reasons you expressed, but the need is here so I must comply… If you strongly believe this doesn't have its place in Kubespray, I would totally understand that. In the end it's only one user's needs, and I would definitely be ok with my need being fulfilled by the custom role I crafted. I was just wondering if this would be a useful addition to Kubespray.
Now, about the SSL verification issue: I used curl to demonstrate the issue in a simple way, but I know the behavior is the same with a plain Kubespray cluster installation. The real issue is with the kubernetes Ansible module, as it behaves differently with Kubespray/kubeadm certificate handling than with my custom role. For a reason that I can't explain, the following doesn't work when CA generation is handled by my role:
ansible -u <remote-user> -b --become-user=root -i <inventory> -m pip -a "name=kubernetes" <node>
ansible -u <remote-user> -b --become-user=root -i <inventory> -m kubernetes.core.k8s_info -a "kind=Pod namespace=kube-system" <node>
AFAICT, the module is using the default kubeconfig path, which is /root/.kube/config for the root user (since I'm using become). What doesn't make sense is that if I use this kubeconfig with kubectl, it works seamlessly!
I compared the certificates from both methods and found out that with plain Kubespray (i.e. kubeadm), the kubernetes CA is self-signed. With my custom role, the kubernetes CA is signed by the « root CA » (which is expected, since that's the whole point of the role). I believe the issue lies there, but even if I add the « root CA » (which is self-signed) to the trusted CAs of the control plane hosts, the error persists.
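One way to compare what clients actually see is to decode the CA embedded in the kubeconfig; a quick check along these lines (assuming the admin kubeconfig lives at /root/.kube/config):
# Print subject and issuer of the CA advertised in certificate-authority-data.
# With plain kubeadm the subject equals the issuer (self-signed); with the custom
# role the issuer should be the customer intermediate CA.
sudo kubectl config view --raw --kubeconfig /root/.kube/config \
  -o jsonpath='{.clusters[0].cluster.certificate-authority-data}' \
  | base64 -d | openssl x509 -noout -subject -issuer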
If I understand PKI correctly, this shouldn't happen: since the root CA of the chain is trusted, all subsequent CAs/certs should be trusted as well… Or am I wrong about this?
I checked that the « root CA » is indeed trusted:
$ sudo trust list --filter=ca-anchors | grep -i external -A2 -B2
pkcs11:id=%AD%BD%98%7A%34%B4%26%F7%FA%C4%26%54%EF%03%BD%E0%24%CB%54%1A;type=cert
    type: certificate
    label: AddTrust External CA Root
    trust: anchor
    category: authority
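Another quick check is to ask openssl whether the system trust store alone is enough to validate the cluster CA (a sketch; using /etc/ssl/certs as the CApath is an assumption that holds on Debian-family hosts):
# This only succeeds if every link between the cluster CA and a trusted anchor is known
# to openssl; with only the self-signed root in the store, the missing customer
# intermediate is enough to make it fail.
openssl verify -CApath /etc/ssl/certs /etc/kubernetes/ssl/ca.crt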
In the end, I think this issue is beyond the scope of Kubespray. Feel free to close it if you want to. I’ll try to get some help on SO, linking back to here. I’d still appreciate it if you have any insights on this matter though :slightly_smiling_face:
> BTW I use the term “root CA” to describe the CA used to sign the intermediate cluster CAs, but it will likely be an intermediate CA generated by the customer. TBH, I don't understand myself why someone would need to do such a thing, for all the reasons you expressed, but the need is here so I must comply… If you strongly believe this doesn't have its place in Kubespray, I would totally understand that. In the end it's only one user's needs, and I would definitely be ok with my need being fulfilled by the custom role I crafted. I was just wondering if this would be a useful addition to Kubespray.
Ah ok, this makes (a bit) more sense, thanks for the additional explanation. So I think it kind of depends on the impact on Kubespray: if it is really only the changes that you sent, plus possibly a few more things, I would tend to think it should be acceptable. I would put some disclaimers in a few places to suggest that people should avoid using this, though...
About your issue: that is weird indeed, I would expect the kubernetes Ansible module to use the config you are pointing at. Maybe you could try overriding some of its parameters, like validate_certs, as a means of checking what's happening. FYI, we mostly do not use kubernetes.core in Kubespray, as we have our own equivalent as of right now, so I am not certain what it does by default tbh. Maybe you could also try to see what kubectl command is really executed under the hood.
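For what it's worth, the kubernetes.core.k8s_info module does expose validate_certs and ca_cert parameters, so a couple of quick checks along those lines could be (placeholders as in the earlier commands):
# Temporarily skip TLS verification to confirm the failure really is about the chain.
ansible -u <remote-user> -b --become-user=root -i <inventory> -m kubernetes.core.k8s_info -a "kind=Pod namespace=kube-system validate_certs=false" <node>
# Run kubectl with high verbosity to see the API requests it actually makes.
sudo kubectl --kubeconfig /root/.kube/config get pods -n kube-system -v=9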
Apart from that, from what I remember the PKI setup seems right to me, but don't bet on this: my knowledge of the various Kubernetes PKIs is not very fresh!
> If I understand PKI correctly, this shouldn't happen: since the root CA of the chain is trusted, all subsequent CAs/certs should be trusted as well… Or am I wrong about this?
Not sure about your specific case, but AFAIK, to verify that a cert is trusted, a client needs to verify the full chain up to a trusted certificate (and thus needs to know all the certs along the way), so some of your problems might be related to that.
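That matches openssl's behaviour, for instance; a small illustration reusing the hypothetical file names from earlier (and assuming the apiserver cert sits at /etc/kubernetes/ssl/apiserver.crt):
# Fails: the verifier only knows the customer root, not the intermediate links.
openssl verify -CAfile root-ca.crt /etc/kubernetes/ssl/apiserver.crt
# Succeeds: the full chain of intermediates is supplied alongside the trust anchor.
openssl verify -CAfile root-ca.crt -untrusted intermediate-ca.crt -untrusted /etc/kubernetes/ssl/ca.crt /etc/kubernetes/ssl/apiserver.crt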
Ok, I’ll find some time to try and include this into Kubespray with as few changes as possible and massive warnings.
I did some tests today and I think I understand why the kubernetes.core module is not working in this case.
First of all, as a reminder: the cluster CA certificate ends up (no matter what) in the kubeconfig file, under clusters[].cluster.certificate-authority-data.
When we let kubeadm generate its PKI, the kubernetes CA is self-signed. But since it is advertised to clients through the kubeconfig's certificate-authority-data, it is implicitly trusted even though it's self-signed.
Now, when the kubernetes CA is signed by another CA, nothing changes in the kubeconfig file: only the (now intermediate) kubernetes CA is advertised through the certificate-authority-data field. This seems to be the issue: clients now only get part of the trust chain and therefore may not trust the connection, since they cannot verify the whole chain.
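This is easy to confirm by counting how many PEM blocks the field actually carries (again assuming the admin kubeconfig at /root/.kube/config):
# kubeadm's default setup embeds exactly one certificate here (the cluster CA);
# nothing adds the customer intermediate or root on top of it.
sudo kubectl config view --raw --kubeconfig /root/.kube/config \
  -o jsonpath='{.clusters[0].cluster.certificate-authority-data}' \
  | base64 -d | grep -c 'BEGIN CERTIFICATE'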
I believe (well, it's more of a guess here) that kubectl is “lazy” and doesn't check the whole chain but only looks at the end of it (kubernetes CA -> apiserver cert) to validate the connection. Other tools, like the kubernetes.core modules, seem to expect the whole chain, as specified in the documentation of the ca_cert parameter:
> Path to a CA certificate used to authenticate with the API. The full certificate chain must be provided to avoid certificate validation errors.
Therefore, in order to fix this, I either have to:
- add the kubernetes CA to the host's trusted CAs, or
- provide the kubernetes.core modules with the whole certificate chain bundle:
# On some host
cat /etc/kubernetes/ssl/ca.crt /wherever/is/located/customer/provided/ca/cert | sudo tee /etc/ssl/certs/kubernetes-ca-bundle.crt
# On controller
ansible -u <remote-user> -b --become-user=root -i <inventory> -m kubernetes.core.k8s_info -a "kind=Pod namespace=kube-system ca_cert=/etc/ssl/certs/kubernetes-ca-bundle.crt" <node>
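A third option along the same lines (just a sketch, not something the role does today) would be to embed the full bundle directly into the kubeconfig, so that anything reading certificate-authority-data gets the complete chain. The cluster name below is hypothetical; kubectl config get-clusters shows the real one:
# Replace the single-certificate certificate-authority-data with the full bundle.
sudo kubectl config set-cluster cluster.local --kubeconfig /root/.kube/config --certificate-authority=/etc/ssl/certs/kubernetes-ca-bundle.crt --embed-certs=true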
I noticed that Kubespray uses its own module to interact with the cluster; I'll give this a try later on. Just to satisfy my curiosity: why did you go down this path instead of using the “official” Ansible module for Kubernetes? Is it because the module didn't exist yet when Kubespray needed it?
> I noticed that Kubespray uses its own module to interact with the cluster; I'll give this a try later on. Just to satisfy my curiosity: why did you go down this path instead of using the “official” Ansible module for Kubernetes? Is it because the module didn't exist yet when Kubespray needed it?
I was not around at that time, but I suspect yes, as it seems it was first introduced in 2015... There are more details on a potential shift to the "official" modules here, FYI: https://github.com/kubernetes-sigs/kubespray/issues/10696
> I noticed that Kubespray uses its own module to interact with the cluster; I'll give this a try later on. Just to satisfy my curiosity: why did you go down this path instead of using the “official” Ansible module for Kubernetes? Is it because the module didn't exist yet when Kubespray needed it?
Probably that. Also, kubernetes.core.k8s requires installing Python packages on the managed hosts, not only on the Ansible control node, and Kubespray does not have the infrastructure to do that in a fine-grained way (yet, I'm working on it for #10701).
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
/lifecycle frozen
I need to set up a cluster with Kubespray while using an external CA.
Since this process is not documented, I searched around and read this issue.
I managed to make this work by using the following custom role associated with a playbook:
defaults/main.yaml
tasks/main.yaml
To better explain, the role:
- requires use_external_ca to be set to true, and fails if root_ca_cert and root_ca_key are missing (a quick sanity check for that pair is sketched right after this list)
- installs the cryptography module, which is needed by some community.crypto modules
- takes etcd_deployment_type into account
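On the root_ca_cert / root_ca_key point, one thing worth doing before running the play is checking that the provided certificate and key actually belong together; a small sketch with root-ca.crt / root-ca.key as hypothetical file names:
# The two digests must match, otherwise the provided cert/key pair is inconsistent.
openssl x509 -in root-ca.crt -noout -pubkey | openssl sha256
openssl pkey -in root-ca.key -pubout | openssl sha256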
Notes:
- The cert_management value must still be set to script in order to generate all the required certificates. Those certificates will be signed by the intermediate CAs generated by the role, since the certificate management scripts don't try to regenerate the CAs if they already exist.
- Setting cert_management to none would require generating not only the CAs, but also all the certificates.
While this is working fine, I later found that communicating with the API server with curl would fail from any of the control plane hosts (I didn't try from worker nodes):
However, if I provide the --cacert flag pointing to /etc/kubernetes/ssl/ca.crt, it works. So I added these tasks to the role:
This way, the kubernetes CA generated and signed by the root CA is trusted by the host, and the above curl command works.
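For reference, the difference can be reproduced with something along these lines (the local endpoint and port are assumptions based on a default control plane setup; an authentication error in the response body is fine here, only the TLS verification matters):
# Fails TLS verification against the system trust store alone...
curl https://127.0.0.1:6443/version
# ...while pointing curl at the cluster CA (or a full bundle) lets the handshake succeed.
curl --cacert /etc/kubernetes/ssl/ca.crt https://127.0.0.1:6443/version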
The main issue is that I'm using Ansible's kubernetes.core.k8s_info module to interact with the cluster, and without adding the kubernetes CA to the host's trusted CAs it fails with:
However, if I let Kubespray generate all CAs and certificates, I do not have this issue. How is that possible? I checked without running my play and I don't see the Kubernetes CA added to the host's trusted CAs, so I don't understand how it could work…
As a side note, I think it would be great to have some documentation about the process of using an external CA. If the maintainers think this would be a worthwhile addition, I would like to work on it, but I don't know where to start, so some guidance would be greatly appreciated.
Finally, I do think it would be great to include something like the role I wrote directly in Kubespray, to allow users to set up their clusters with an external CA like I did. Again, I'm ok to work on this but don't really know where to start.
I guess my two last points may need to live in other issues, you tell me.