Closed: Starttoaster closed this issue 10 months ago.
In case it is useful, here is my k0sctl config (note that I do not believe this is an error with k0sctl though):
apiVersion: k0sctl.k0sproject.io/v1beta1
kind: Cluster
metadata:
  name: cluster1
spec:
  hosts:
  - openSSH:
      address: 100.85.216.39
      user: bb
      port: 22
    role: controller+worker
    installFlags:
    - --no-taints
  - openSSH:
      address: 100.72.143.21
      user: bb
      port: 22
    role: controller+worker
    installFlags:
    - --no-taints
  - openSSH:
      address: 100.81.77.109
      user: bb
      port: 22
    role: controller+worker
    installFlags:
    - --no-taints
  k0s:
    config:
      spec:
        network:
          nodeLocalLoadBalancing:
            enabled: true
            type: EnvoyProxy
    version: 1.26.7+k0s.0
    dynamicConfig: false
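For anyone reproducing this, the config above is applied in the usual way (the file name here is just an example):

$ k0sctl apply --config k0sctl.yaml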
Ah, never mind, it looks like this is actually solved in a newer version of k0s. Closing.
As a quick side note: I wasn't able to reproduce this:
$ kubectl get node,po -owide -A
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
node/k0s-3957-controller-0 Ready control-plane 14m v1.26.7+k0s 172.28.170.48 <none> Alpine Linux v3.18 6.1.62-0-virt containerd://1.6.21
node/k0s-3957-controller-1 Ready control-plane 13m v1.26.7+k0s 172.28.170.205 <none> Alpine Linux v3.18 6.1.62-0-virt containerd://1.6.21
node/k0s-3957-controller-2 Ready control-plane 13m v1.26.7+k0s 172.28.170.236 <none> Alpine Linux v3.18 6.1.62-0-virt containerd://1.6.21
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system pod/coredns-7bf57bcbd8-5bl2j 0/1 Completed 0 4m24s 10.244.0.5 k0s-3957-controller-0 <none> <none>
kube-system pod/coredns-7bf57bcbd8-6g5rf 1/1 Running 0 13m 10.244.1.3 k0s-3957-controller-1 <none> <none>
kube-system pod/coredns-7bf57bcbd8-qrkvz 1/1 Running 0 4m17s 10.244.2.2 k0s-3957-controller-2 <none> <none>
kube-system pod/coredns-7bf57bcbd8-wctjk 0/1 Completed 0 14m 10.244.0.4 k0s-3957-controller-0 <none> <none>
kube-system pod/konnectivity-agent-5mz6v 1/1 Running 0 13m 172.28.170.236 k0s-3957-controller-2 <none> <none>
kube-system pod/konnectivity-agent-8nxm2 1/1 Running 0 13m 172.28.170.48 k0s-3957-controller-0 <none> <none>
kube-system pod/konnectivity-agent-kll7t 1/1 Running 0 13m 172.28.170.205 k0s-3957-controller-1 <none> <none>
kube-system pod/kube-proxy-8f2dn 1/1 Running 0 13m 172.28.170.205 k0s-3957-controller-1 <none> <none>
kube-system pod/kube-proxy-nwhrm 1/1 Running 0 14m 172.28.170.48 k0s-3957-controller-0 <none> <none>
kube-system pod/kube-proxy-phj5c 1/1 Running 0 13m 172.28.170.236 k0s-3957-controller-2 <none> <none>
kube-system pod/kube-router-gdndt 1/1 Running 0 13m 172.28.170.236 k0s-3957-controller-2 <none> <none>
kube-system pod/kube-router-j4b9w 1/1 Running 0 13m 172.28.170.205 k0s-3957-controller-1 <none> <none>
kube-system pod/kube-router-tfmql 1/1 Running 0 14m 172.28.170.48 k0s-3957-controller-0 <none> <none>
kube-system pod/metrics-server-7446cc488c-qfswk 1/1 Running 1 (12m ago) 14m 10.244.0.3 k0s-3957-controller-0 <none> <none>
kube-system pod/nllb-k0s-3957-controller-0 1/1 Running 0 12m 172.28.170.48 k0s-3957-controller-0 <none> <none>
kube-system pod/nllb-k0s-3957-controller-1 1/1 Running 0 12m 172.28.170.205 k0s-3957-controller-1 <none> <none>
kube-system pod/nllb-k0s-3957-controller-2 1/1 Running 0 12m 172.28.170.236 k0s-3957-controller-2 <none> <none>
Platform
Version
v1.26.7+k0s.0
Sysinfo
`k0s sysinfo`
What happened?
I have a 3-node cluster in which every node has the controller+worker role, and I just tried out the relatively new node-local load balancing feature. Looking at the leases in the kube-node-lease namespace, I see that my first node is the current "leader". I also see that a single Envoy Pod was deployed in kube-system, for the third node in my cluster.
So I shut down the 3rd node to test this out, which of course made Envoy completely unreachable. After about 15 minutes of waiting, the Envoy load balancer Pod is still stuck in Terminating according to kubectl, and no replacement Pod has been deployed. Based on my reading of the documentation, I believe the intent is for all 3 of my controller+worker nodes to run their own Envoy Pod, since the documentation states that every worker node deploys one.
I'm not sure whether this is relevant, but I was originally using an externalAddress load balancer for this cluster and transitioned to Envoy; that may be the small detail that led to what I believe is an unintended state.
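For reference, one way to check which nodes actually have a node-local load balancer Pod (this assumes the Pods follow the nllb-<nodename> naming in kube-system; adjust the pattern if yours differ):

$ # list the nllb Pods and the node each one is scheduled on
$ kubectl -n kube-system get pods -o wide | grep '^nllb-'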
Steps to reproduce
Expected behavior
I'd have expected all 3 of my nodes, which act as workers even though they also serve the controller role, to each deploy an Envoy Pod.
Actual behavior
Only the 3rd node in my cluster has the Envoy Pod even though the other two nodes are also workers.
Screenshots and logs
No response
Additional context
I checked `/etc/k0s/k0s.yaml` on all 3 nodes and found the `nodeLocalLoadBalancing` setting configured in them.
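For completeness, this is roughly what that section looks like in a node's /etc/k0s/k0s.yaml, sketched from the k0sctl config above rather than copied verbatim from my nodes:

apiVersion: k0s.k0sproject.io/v1beta1
kind: ClusterConfig
spec:
  network:
    nodeLocalLoadBalancing:
      enabled: true
      type: EnvoyProxy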