Open xyzzyz opened 2 months ago
Yeah, the problem is that it's not set explicitly by RKE2, so it uses the default for the Kubernetes version the CCM is built against - and the K3s CCM is built against 1.29 (as you noted).
The best current work-around is to set this in your config.yaml:
kube-cloud-controller-manager-arg:
- 'feature-gates=PodHostIPs=false'
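For anyone wondering why this works, here is a rough sketch of the precedence involved. It is not the actual CCM code: the default table and the function names are made up for illustration, and the only facts assumed are that PodHostIPs is off by default in Kubernetes 1.28 and on by default in 1.29, and that feature-gates takes a comma-separated list of Name=bool pairs.

```python
from typing import Dict, Optional

# Assumed defaults for illustration only: PodHostIPs is off by default in
# Kubernetes 1.28 (alpha) and on by default in 1.29 (beta).
COMPILED_DEFAULTS: Dict[str, Dict[str, bool]] = {
    "1.28": {"PodHostIPs": False},
    "1.29": {"PodHostIPs": True},
}


def parse_feature_gates(value: str) -> Dict[str, bool]:
    """Parse a comma-separated list of Name=bool pairs."""
    gates: Dict[str, bool] = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, raw = item.partition("=")
        raw = raw.strip().lower()
        if not sep or raw not in ("true", "false"):
            raise ValueError(f"invalid feature gate entry: {item!r}")
        gates[name.strip()] = raw == "true"
    return gates


def effective_gates(built_against: str, arg: Optional[str] = None) -> Dict[str, bool]:
    """Start from the defaults of the version the binary was built against,
    then let an explicit 'feature-gates=...' argument override them."""
    gates = dict(COMPILED_DEFAULTS[built_against])
    if arg is not None:
        flag, sep, value = arg.partition("=")
        if flag != "feature-gates" or not sep:
            raise ValueError(f"unsupported argument: {arg!r}")
        gates.update(parse_feature_gates(value))
    return gates


# A binary built against 1.29 with no explicit setting assumes the gate is on;
# the workaround argument turns it off regardless of the compiled-in default.
print(effective_gates("1.29"))
print(effective_gates("1.29", "feature-gates=PodHostIPs=false"))
```

With the argument set, the compiled-in 1.29 default no longer matters, which is the whole point of the workaround.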
I'm not sure this has changed at all in these releases. Is this meant to be in Working status still, not To Test? @brandond
I checked the release-1.28 branch at commit 2c90f3baa0dbd555d5972db542421f3d2cded7b5 and still see the following:
$ kubectl get pods -n kube-system -A -o yaml | grep image: | grep 1.29
image: index.docker.io/rancher/rke2-cloud-provider:v1.29.3-build20240515
image: docker.io/rancher/rke2-cloud-provider:v1.29.3-build20240515
image: index.docker.io/rancher/rke2-cloud-provider:v1.29.3-build20240515
image: docker.io/rancher/rke2-cloud-provider:v1.29.3-build20240515
image: index.docker.io/rancher/rke2-cloud-provider:v1.29.3-build20240515
image: docker.io/rancher/rke2-cloud-provider:v1.29.3-build20240515
I will also note that, other than this, I'm not able to reproduce the issue exactly. For my steps, I need to include enable-servicelb: true in the config.yaml and NOT disable the cloud-controller; otherwise the cluster either doesn't come up correctly or no svclb pod is created when creating a service of type LoadBalancer.
No, sorry, I think I moved this over by accident. This can go back to next up until July.
Environmental Info: RKE2 Version: v1.28.9+rke2r1
Node(s) CPU architecture, OS, and Version: Linux dev 4.18.0-552.el8.x86_64 #1 SMP Sun Apr 7 19:39:51 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux, CentOS 8
Cluster Configuration: 1 server, 0 agents
Describe the bug: ServiceLB fails to come up
Steps To Reproduce:
Expected behavior: ServiceLB Load Balancer comes up
Actual behavior: LoadBalancer Service is stuck pending, with the following in events:
Additional context / logs: This is caused by a version mismatch between rke2-cloud-provider and the Kubernetes apiserver.
rke2-cloud-provider decides whether to use the HostIPs ref based on what's enabled by default in the k8s version the cloud-provider is compiled against. If that version is 1.29, rke2-cloud-provider believes that PodHostIPs is available, but if the cluster's k8s version is actually 1.28, the gate is not enabled by default, so it breaks.
3 weeks ago, cloud-provider was bumped to 1.29 on the 1.28 and 1.27 release branches, see e.g. this. Indeed:
To fix, rke2-cloud-provider should obtain the actual feature gate state, instead of whatever's default for the given k8s version.
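One possible way to get at the actual state, sketched below and not tested against a real cluster: read the gate from the apiserver's metrics rather than from compiled-in defaults. This assumes the apiserver exposes a kubernetes_feature_enabled gauge on /metrics in the form shown in SAMPLE; the sample text, the ExampleGate entry and both function names are made up for illustration. If the state cannot be determined, the sketch falls back to the single-IP status.hostIP field.

```python
import re
from typing import Optional

# Assumed metric line shape:
#   kubernetes_feature_enabled{name="PodHostIPs",stage="ALPHA"} 0
_LINE = re.compile(r'^kubernetes_feature_enabled\{([^}]*)\}\s+([0-9.eE+-]+)\s*$')
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def gate_enabled(metrics_text: str, gate: str) -> Optional[bool]:
    """Return the gate state reported by the server, or None if not reported."""
    for line in metrics_text.splitlines():
        match = _LINE.match(line.strip())
        if not match:
            continue  # comments and unrelated metrics
        labels = dict(_LABEL.findall(match.group(1)))
        if labels.get("name") == gate:
            return float(match.group(2)) == 1.0
    return None


def host_ip_field_path(enabled: Optional[bool]) -> str:
    """Pick the downward API field: only use status.hostIPs when the server
    is known to have PodHostIPs enabled; otherwise use status.hostIP."""
    return "status.hostIPs" if enabled else "status.hostIP"


# Made-up sample of what a 1.28 apiserver might report.
SAMPLE = """\
# TYPE kubernetes_feature_enabled gauge
kubernetes_feature_enabled{name="ExampleGate",stage="BETA"} 1
kubernetes_feature_enabled{name="PodHostIPs",stage="ALPHA"} 0
"""

state = gate_enabled(SAMPLE, "PodHostIPs")
print(state, host_ip_field_path(state))
```

Defaulting to status.hostIP when the state is unknown keeps the ServiceLB pods valid on older servers, at the cost of not using the multi-IP field where it might be available.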