Closed — dcardellino closed this issue 2 years ago
The log you've shown is completely normal, since control-plane (and static worker) nodes may have no corresponding Machine objects.
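If you want to cross-check, you can compare the nodes against the Machine objects; the namespace below assumes the KubeOne default, where machine-controller keeps its Machines in kube-system:
# control-plane / static worker nodes usually show up here but have no Machine
kubectl get nodes
kubectl get machines -n kube-system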
But what do you mean by "flooded with CSRs"?
@kron4eg
Just a bunch of them:
k get csr -n kube-system
NAME        AGE     SIGNERNAME                      REQUESTOR                                            REQUESTEDDURATION   CONDITION
csr-244dx   23h     kubernetes.io/kubelet-serving   system:node:hcloud-k8s-core-stage-control-plane-2   <none>              Pending
csr-245mg   8h      kubernetes.io/kubelet-serving   system:node:hcloud-k8s-core-stage-control-plane-2   <none>              Pending
csr-28pkd   22h     kubernetes.io/kubelet-serving   system:node:hcloud-k8s-core-stage-control-plane-3   <none>              Pending
csr-2ccdv   23h     kubernetes.io/kubelet-serving   system:node:hcloud-k8s-core-stage-control-plane-3   <none>              Pending
csr-2dxsc   10h     kubernetes.io/kubelet-serving   system:node:hcloud-k8s-core-stage-control-plane-2   <none>              Pending
csr-2klcw   7h33m   kubernetes.io/kubelet-serving   system:node:hcloud-k8s-core-stage-control-plane-2   <none>              Pending
csr-2qwwc   21h     kubernetes.io/kubelet-serving   system:node:hcloud-k8s-core-stage-control-plane-1   <none>              Pending
csr-47wth   4h58m   kubernetes.io/kubelet-serving   system:node:hcloud-k8s-core-stage-control-plane-2   <none>              Pending
csr-48vcl   6h16m   kubernetes.io/kubelet-serving   system:node:hcloud-k8s-core-stage-control-plane-2   <none>              Pending
csr-49jdp   15h     kubernetes.io/kubelet-serving   system:node:hcloud-k8s-core-stage-control-plane-1   <none>              Pending
csr-4bkxt   178m    kubernetes.io/kubelet-serving   system:node:hcloud-k8s-core-stage-control-plane-1   <none>              Pending
csr-4cbcz   20h     kubernetes.io/kubelet-serving   system:node:hcloud-k8s-core-stage-control-plane-2   <none>              Pending
csr-4dt8b   4h32m   kubernetes.io/kubelet-serving   system:node:hcloud-k8s-core-stage-control-plane-3   <none>              Pending
csr-4h4wl   7h21m   kubernetes.io/kubelet-serving   system:node:hcloud-k8s-core-stage-control-plane-1   <none>              Pending
csr-4pnmt   15h     kubernetes.io/kubelet-serving   system:node:hcloud-k8s-core-stage-control-plane-2   <none>              Pending
csr-4r7tc   6h35m   kubernetes.io/kubelet-serving   system:node:hcloud-k8s-core-stage-control-plane-1   <none>              Pending
csr-4rbgx   6h19m   kubernetes.io/kubelet-serving   system:node:hcloud-k8s-core-stage-control-plane-1   <none>              Pending
csr-4snm6   11h     kubernetes.io/kubelet-serving   system:node:hcloud-k8s-core-stage-control-plane-2   <none>              Pending
csr-4zn6b   19h     kubernetes.io/kubelet-serving   system:node:hcloud-k8s-core-stage-control-plane-3   <none>              Pending
@dcardellino I'll move this issue to kubeone
@dcardellino what kubeone version are you running?
@kron4eg We are running the following version:
kubeone version
{
  "kubeone": {
    "major": "1",
    "minor": "4",
    "gitVersion": "1.4.3",
    "gitCommit": "717787f2287964e5793d80ec8ca2c2169936b0ac",
    "gitTreeState": "",
    "buildDate": "2022-05-11T14:18:03Z",
    "goVersion": "go1.18.1",
    "compiler": "gc",
    "platform": "linux/amd64"
  },
  "machine_controller": {
    "major": "1",
    "minor": "43",
    "gitVersion": "v1.43.2",
    "gitCommit": "",
    "gitTreeState": "",
    "buildDate": "",
    "goVersion": "",
    "compiler": "",
    "platform": "linux/amd64"
  }
}
@kron4eg Any updates here?
@dcardellino is it possible for you to test-drive a dev build from master?
@kron4eg
Is there a Make target to build from the master branch?
make build
will build dist/kubeone
I'm not really sure why the kubelet keeps generating new certificate requests; it's probably some kind of bug in the kubelet. But I've observed that once I approve all pending certificates, no new ones are generated.
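If you want to approve them in bulk, a rough sketch (review the list first; this assumes every pending request really is one of these kubelet-serving requests from your own nodes):
kubectl get csr
# approve everything whose CONDITION column is still "Pending"
kubectl get csr | awk '$NF == "Pending" {print $1}' | xargs kubectl certificate approve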
@kron4eg Sorry for the delay.
But I still get this issue although I'm using the latest kubeone release and applied the suggested workaround from #2199:
This issue is fixed by restarting the kubelet on all control plane nodes after the CCM initializes the nodes. The kubelet automatically generates new CSRs when starting, which we approve after a minute or so (we give it some time to be sure that all CSRs have come in).
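A rough sketch of that workaround (the host name is just an example taken from the CSR list above):
# restart the kubelet on every control-plane node so it re-issues its serving CSRs
ssh hcloud-k8s-core-stage-control-plane-1 'sudo systemctl restart kubelet'
# repeat for the remaining control-plane nodes, wait a minute or so,
# then approve the pending CSRs as in the snippet above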
@dcardellino Since #2199, we haven't been able to reproduce the issue (neither manually nor in the E2E tests). Can you give KubeOne 1.5.0 a try?
@xmudrii Sorry, I forgot to mention that I recently approved the pending CSRs manually with kubectl certificate approve <csr-name>, and now everything works fine. Thank you anyway!
Hey, I just wanted to let you know that I had the same issue today with KubeOne 1.5.4 and k8s 1.25.11. Sadly, I also manually approved the CSRs before finding this issue.
I still have about 280 pending requests, but they seem to get deleted once they are older than 24h. I also see these log entries:
2024-01-31 15:15:21.715 E0131 14:15:21.715009 1 node_csr_approver.go:89] Reconciliation of request /csr-5bjq6 failed: failed to get machine for node 'staging-control-plane-3': failed to get machine for given node name 'staging-control-plane-3'
2024-01-31 15:15:21.715 I0131 14:15:21.714902 1 node_csr_approver.go:103] Reconciling CSR csr-5bjq6
2024-01-31 15:15:21.614 E0131 14:15:21.614419 1 node_csr_approver.go:89] Reconciliation of request /csr-zsg56 failed: failed to get machine for node 'staging-control-plane-2': failed to get machine for given node name 'staging-control-plane-2'
2024-01-31 15:15:21.614 I0131 14:15:21.614329 1 node_csr_approver.go:103] Reconciling CSR csr-zsg56
2024-01-31 15:15:21.515 E0131 14:15:21.515034 1 node_csr_approver.go:89] Reconciliation of request /csr-s99s4 failed: failed to get machine for node 'staging-control-plane-1': failed to get machine for given node name 'staging-control-plane-1'
2024-01-31 15:15:21.515 I0131 14:15:21.514952 1 node_csr_approver.go:103] Reconciling CSR csr-s99s4
even after running KubeOne again (exact command: kubeone apply --manifest kubeone.yaml --tfjson output.json --upgrade-machine-deployments --auto-approve) and deleting the machine-controller pod manually so it gets recreated/restarted.
Either the fix didn't work or my machine-controller is somehow running an old version (its image is currently quay.io/kubermatic/machine-controller:v1.56.2). :thinking:
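For reference, something like the following should show which machine-controller build is actually running and restart it (the deployment name and namespace are the KubeOne defaults and may differ in other setups):
# print the image of the running deployment
kubectl -n kube-system get deployment machine-controller -o jsonpath='{.spec.template.spec.containers[0].image}'
# restart the deployment so it picks up the current image
kubectl -n kube-system rollout restart deployment machine-controller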
Hello everyone,
We deploy Kubernetes clusters on Hetzner with kubermatic/kubeone, which uses the kubermatic/machine-controller addon by default. I don't know whether this is actually a problem, because (I think) everything works fine.
But when I view the logs of the machine-controller, I see the following entries occurring every second:
And my kube-system namespace is flooded with CSRs. I don't think this is good for my clusters.
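For reference, the flood can be seen with something like this (the deployment name and namespace assume the KubeOne defaults):
# count the pending CSRs
kubectl get csr --no-headers | grep -c Pending
# follow the machine-controller logs
kubectl -n kube-system logs deployment/machine-controller -f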
Regards,
Dome