Closed (grig0701 closed this issue 1 year ago)
cloud-provider-name: external
nodes "worker-002" not found
What cloud provider have you deployed? The log messages suggest that something - most likely your cloud provider - is deleting the node from the cluster when it goes down for a reboot.
Hi. Thank you for your reply. I am working on this project together with the author of the question. We are using Hetzner dedicated servers as workers and Hetzner Cloud instances as masters. One of the dedicated servers cannot be restored to the cluster as a worker. Reinstalling rke2-agent on the problem node and removing the worker secret via kubectl does not bring the dedicated node back into the cluster. Also, the node does not appear in the output of
kubectl get nodes -A
I also want to add that we use hcloud-cloud-controller-manager.
Check the cloud controller pod log; I suspect that it is deleting the node for some reason. There is nothing in RKE2 itself that will delete the node resource.
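A minimal sketch of that check, assuming the controller runs as a Deployment named hcloud-cloud-controller-manager in the kube-system namespace. Both names are assumptions, as are the helper function names, so adjust them to match your install:

```shell
#!/bin/sh
# Assumption: the cloud controller is a Deployment with this name and
# namespace. Override the variables if your install differs.
CCM_NAMESPACE="${CCM_NAMESPACE:-kube-system}"
CCM_DEPLOYMENT="${CCM_DEPLOYMENT:-hcloud-cloud-controller-manager}"

# Hypothetical helper: keep only the lines on stdin that mention the
# given node name (fixed-string, case-insensitive match).
filter_node_lines() {
  grep -i -F -- "$1"
}

# Hypothetical helper: print controller log lines from the last 24 hours
# that mention the given node.
ccm_logs_for_node() {
  kubectl -n "$CCM_NAMESPACE" logs "deployment/$CCM_DEPLOYMENT" --since=24h \
    | filter_node_lines "$1"
}

# Example: ccm_logs_for_node worker-002
```

If the controller is removing the node, lines mentioning worker-002 around the time of the shutdown should make that visible.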
Hello, thank you for your feedback. The problem was in hcloud-cloud-controller-manager. More precisely, this controller does not currently support the dedicated servers that we used for our workers. We decided not to use it, and now everything works as it should.
Thanks for helping me understand what went wrong.
We set up a highly available rke2 cluster with 3 masters and 3 workers. After we shut down the second worker for a day, we noticed that it was no longer showing up in the cluster. When we turned it back on, it never rejoined.
We tried reinstalling the agent after first uninstalling it via
/usr/local/bin/rke2-uninstall.sh
and also removed the worker secret from the cluster:
kubectl delete secrets worker-002.node-password.rke2 -n kube-system
After that we executed:
curl -sfL https://get.rke2.io | INSTALL_RKE2_TYPE="agent" sh -
systemctl enable rke2-agent.service
systemctl start rke2-agent.service
But the worker did not return to the cluster, and we saw the following logs:
Master logs:
My configurations: Example master:
Example worker:
Please help to solve this
Version: