What happened:
We are experiencing issues deploying Kubernetes clusters version 1.29 and above on Huawei Cloud. Without passing the --node-ip flag to the kubelet service, INTERNAL-IP addresses for nodes are shown as <none>:
root@k8s-master-001:# kubectl get nodes -o wide
NAME             STATUS     ROLES    AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION       CONTAINER-RUNTIME
k8s-master-001   NotReady   <none>   25s   v1.29.7   <none>        <none>        Ubuntu 22.04.4 LTS   5.15.0-117-generic   containerd://1.7.20
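For anyone trying to confirm the same symptom on a larger cluster, here is a minimal sketch that lists nodes without an InternalIP entry in status.addresses, which is what the INTERNAL-IP column reflects. It assumes the input is the parsed output of `kubectl get nodes -o json`; the function name `nodes_missing_internal_ip` is ours and purely illustrative.

```python
def nodes_missing_internal_ip(node_list: dict) -> list[str]:
    """Return the names of nodes that report no InternalIP address.

    `node_list` is assumed to be the parsed JSON from
    `kubectl get nodes -o json` (a NodeList object).
    """
    missing = []
    for node in node_list.get("items", []):
        # status.addresses may be absent or empty on a freshly registered node.
        addresses = node.get("status", {}).get("addresses") or []
        if not any(addr.get("type") == "InternalIP" for addr in addresses):
            missing.append(node["metadata"]["name"])
    return missing
```

One way to use it is to save the output of `kubectl get nodes -o json` to a file, load it with `json.load`, and pass the result to the function.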
We believe this is due to this change in Kubernetes 1.29: https://github.com/kubernetes/kubernetes/pull/121028
Specifically, if the kubelet is started with the --cloud-provider=external flag and --node-ip is not specified, the external cloud-controller-manager is expected to provide the node's IP address.
To work around the resulting problems, a comment on that PR suggests either deploying the external cloud-controller-manager as a static pod or setting the --node-ip flag: https://github.com/kubernetes/kubernetes/pull/121028#issuecomment-2256834163
We tried the following steps: initialized the cluster using kubeadm; passed --cloud-provider=external (for all control-plane components) and --node-ip=<node_address> to the kubelet on master nodes; deployed a CNI plugin (otherwise the CCM fails while trying to get the extension-apiserver-authentication ConfigMap); and deployed huaweicloud-controller-manager version v0.26.8 according to the documentation. The logs show:
I0807 07:22:11.239701 1 leaderelection.go:253] failed to acquire lease kube-system/cloud-controller-manager
I0807 07:22:15.289291 1 request.go:1370] body was not decodable (unable to check for Status): provided data does not appear to be a protobuf message, expected prefix [107 56 115 0]
E0807 07:22:15.289309 1 leaderelection.go:330] error retrieving resource lock kube-system/cloud-controller-manager: the server rejected our request for an unknown reason (get leases.coordination.k8s.io cloud-controller-manager)
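As a side note on the second log line: the "expected prefix [107 56 115 0]" is a list of byte values. The sketch below only decodes those bytes to show what the client was looking for at the start of the response body; it does not identify why the server rejected the request. The helper name `has_expected_prefix` is ours.

```python
# Byte values copied from the "expected prefix" in the log line above.
EXPECTED_PREFIX = bytes([107, 56, 115, 0])


def has_expected_prefix(body: bytes) -> bool:
    """Return True if a response body starts with the expected 4-byte prefix."""
    return body.startswith(EXPECTED_PREFIX)


print(EXPECTED_PREFIX)  # b'k8s\x00'
```

So the client expected a body starting with the ASCII characters "k8s" followed by a zero byte and received something else, for example a JSON or plain-text error body.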
Question: Is huaweicloud-controller-manager version v0.26.8 incompatible with Kubernetes versions 1.29 and above, or are we missing something in our setup?
What you expected to happen:
The huaweicloud-controller-manager returns no errors and internal IPs become visible on worker nodes.
How to reproduce it (as minimally and precisely as possible):
1. Initialize a Kubernetes cluster version 1.29+ using kubeadm (we have tried versions 1.30.3 and 1.29.7)
2. Deploy a CNI plugin (we used Cilium 1.16)
3. Deploy huaweicloud-controller-manager version v0.26.8
Anything else we need to know?:
The same setup works fine with Kubernetes versions below 1.29 (there are no problems with node IPs or with huaweicloud-controller-manager).
Environment:
Kubernetes version (use kubectl version): 1.29.7
Cloud provider or hardware configuration: HuaweiCloud
OS (e.g: cat /etc/os-release): Ubuntu 22.04.4 LTS
Kernel (e.g. uname -a): 5.15.0-117-generic
Install tools: Kubeadm
Network plugin and version (if this is a network-related bug): cilium 1.16.0
The architecture that the CCM currently relies on is not compatible with Kubernetes 1.29 clusters. The CCM needs to be upgraded to be compatible with version 1.29 and above.