jirenugo opened this issue 2 weeks ago
> The KMS provider itself runs as a pod on the cluster.
I'm not familiar with this deployment pattern for KMS providers - why are you trying to do this? It suffers from the obvious chicken-and-egg problem you're running into here, where the cluster can't start because it needs access to something that won't be available until after it's up.
You're trying to figure out how to lock your keys in the car but still open the door. I don't think there's a good way to make this work.
> The KMS provider itself runs as a pod on the cluster.

This is not an uncommon pattern for KMS deployment; arguably it is k3s that has the circular dependency on Kubernetes Secrets. It is unfortunate that this is not covered by the conformance tests, at least as far as I can tell.
- https://github.com/kubernetes-sigs/aws-encryption-provider
- https://github.com/Azure/kubernetes-kms?tab=readme-ov-file
- https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/barbican-kms-plugin/using-barbican-kms-plugin.md
- https://github.com/Tencent/tke-kms-plugin/blob/90b71a5c7d78a564567040ebe1ce7135afe99ce5/deployment/tke-kms-plugin.yaml#L4
K3s uses secrets for a couple of things internally:

- node password secrets, used to validate agents joining the cluster
- the serving certificate for the listener, stored as the kube-system/k3s-serving secret so that it can be shared across server nodes

Both of these should soft-fail and retry until secrets can be read. Where exactly does k3s startup stall?
I see that https://github.com/kubernetes-sigs/aws-encryption-provider for example suggests running the KMS as a static pod - are you doing that by placing the pod spec in a file in /var/lib/rancher/k3s/agent/pod-manifests/, or are you trying to deploy it via kubectl apply?
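(For reference, the static-pod route means dropping a manifest into that directory. A minimal sketch; the image, args, and socket path below are placeholders rather than any specific plugin's:)

```yaml
# /var/lib/rancher/k3s/agent/pod-manifests/kms-plugin.yaml (illustrative)
apiVersion: v1
kind: Pod
metadata:
  name: kms-plugin
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
    - name: kms-plugin
      image: example.org/kms-plugin:v1            # placeholder image
      args:
        - --listen=unix:///var/run/kms/kms.sock   # plugin-specific flag; placeholder
      volumeMounts:
        - name: kms-socket
          mountPath: /var/run/kms
  volumes:
    - name: kms-socket
      hostPath:
        path: /var/run/kms
        type: DirectoryOrCreate
```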
> suggests running the KMS as a static pod
Yes. Static pods have the same issue.
> Both of these should soft-fail and retry until secrets can be read. Where exactly does k3s startup stall?
I don't know. I attached the logs from the systemd service in the issue, where it's trying to access /registry/secrets/kube-system/k3s-serving. Does that answer your question? Why does it hard-fail on this secret? I can get more logs if you share instructions.
Environmental Info:
K3s Version:
Node(s) CPU architecture, OS, and Version:
Cluster Configuration:
Describe the bug:
Steps To Reproduce:
1. Install k3s:
   curl -sfL https://get.k3s.io | sh -s - server --cluster-init --write-kubeconfig-mode 644
2. Create a .yaml file under /etc/rancher/k3s/config.yaml.d/ with contents along the lines of the sketch after these steps. The KMS provider itself runs as a pod on the cluster.
3. Restart k3s: sudo systemctl restart k3s
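A drop-in along these lines points the apiserver at an encryption provider config (file names and paths are illustrative, not the exact contents from this report):

```yaml
# /etc/rancher/k3s/config.yaml.d/encryption.yaml (illustrative name)
kube-apiserver-arg:
  - encryption-provider-config=/etc/rancher/k3s/encryption-config.yaml
```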
Expected behavior:
k3s starts up successfully and starts the KMS pod
Actual behavior:
k3s hangs waiting for the KMS pod to come up before it can start the KMS pod, because it attempts to decrypt a secret (/registry/secrets/kube-system/k3s-serving) that is now encrypted by the KMS provider.

Are there any workarounds for this issue? Is it possible to configure k3s to store the bootstrap secrets as a different resource type so that they may be exempted from KMS encryption?
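For context on that question: the upstream EncryptionConfiguration matches whole resource kinds, not individual objects, so a config like the sketch below (plugin name and socket path are placeholders) encrypts every Secret, kube-system/k3s-serving included.

```yaml
# /etc/rancher/k3s/encryption-config.yaml (illustrative)
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets            # matches every Secret, including kube-system/k3s-serving
    providers:
      - kms:
          apiVersion: v2
          name: my-kms-plugin                      # placeholder name
          endpoint: unix:///var/run/kms/kms.sock   # placeholder socket path
          timeout: 3s
      - identity: {}       # read fallback for objects stored unencrypted
```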
Additional context / logs:
Logs from the systemd service attempting to decrypt the secret protected by KMS: