Expected Behavior
All the endpoints should be reachable from all the nodes.
Current Behavior
After draining multiple nodes within a short time window, some endpoints suddenly become unreachable from some of the worker nodes, even though those workers were not drained and were left untouched. The unreachable endpoints are not necessarily the ones belonging to the Pods that were on the drained nodes.
Possible Solution
So far the only workarounds I am aware of are restarting one of the two nodes involved, or explicitly deleting the affected Pod so that a new endpoint gets created (see the sketch below).
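For completeness, this is roughly what the two workarounds look like in practice (a sketch only; the Pod, namespace, and node names are the ones from the example in the Context section):

# Workaround 1: delete the affected Pod so a new endpoint gets created
kubectl delete pod myapp-stg-c8ccd55b6-2jld6 -n app --cluster=test

# Workaround 2: drain, reboot, and uncordon one of the two nodes involved
kubectl drain mynodezonea1 --ignore-daemonsets --delete-emptydir-data --cluster=test
# reboot the node out of band, then:
kubectl uncordon mynodezonea1 --cluster=test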
Steps to Reproduce (for bugs)
This is difficult to reproduce, as I could not find a deterministic pattern. I have seen it on a cluster with 1000+ endpoints.
1. Make a list of all the endpoints in the cluster.
2. Have at least 2-3 nodes with ~20 endpoints each.
3. Drain these nodes at around the same time.
4. Curl all the endpoints from step 1, except those on the nodes from step 2, from ALL the nodes (a sketch of how this can be scripted follows below).
You will notice that curl fails with "Failed to connect to <endpoint> port 80 after X ms: Couldn't connect to server" on some of the nodes for some of the endpoints.
You will also notice that, from a node where curl fails for a given endpoint, other endpoints belonging to Pods running on the same remote node are still reachable; only some endpoint(s) on that node are unreachable.
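A rough sketch of how step 4 can be scripted, assuming kubectl access and that the Pods serve HTTP on port 80 as in the example below (the endpoints.txt file and the 5-second timeout are arbitrary choices for illustration):

# Step 1: collect all Pod IPs in the cluster
kubectl get pods -A --cluster=test -o jsonpath='{range .items[*]}{.status.podIP}{"\n"}{end}' > endpoints.txt

# Step 4: run this on each node; curl every endpoint with a short timeout and log failures
while read -r ip; do
  curl -s -o /dev/null --connect-timeout 5 "http://${ip}:80" || echo "UNREACHABLE: ${ip}"
done < endpoints.txt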
Context
Take the following example, where the myapp-stg-c8ccd55b6-2jld6 Pod (192.168.230.138) on mynodezonea1 is not reachable from mylinuxnodezoneb9.
avinesh@mylinuxnodezoneb9:/$ curl 192.168.230.138
curl: (28) Failed to connect to 192.168.230.138 port 80 after 132207 ms: Couldn't connect to server
kubectl describe output of mynodezonea1:
PS C:\> k describe node mynodezonea1 --cluster=test
Name: mynodezonea1
Roles: worker
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=windows
kubernetes.io/arch=amd64
kubernetes.io/hostname=mynodezonea1
kubernetes.io/os=windows
node-role.kubernetes.io/worker=worker
node.kubernetes.io/windows-build=10.0.20348
topology.kubernetes.io/region=myregion
topology.kubernetes.io/zone=zonea
Annotations: node.alpha.kubernetes.io/ttl: 0
projectcalico.org/IPv4Address: 10.228.88.200/22
projectcalico.org/IPv4VXLANTunnelAddr: 192.168.230.129
projectcalico.org/VXLANTunnelMACAddr: <trimmed>
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: <trimmed>
Taints: os=windows:NoSchedule
Unschedulable: false
Lease: Failed to get lease: leases.coordination.k8s.io "mynodezonea1" is forbidden: User "username" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "kube-node-lease"
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Sat, 19 Oct 2024 02:17:50 -0400 Sat, 19 Oct 2024 02:17:50 -0400 CalicoIsUp Calico is running on this node
MemoryPressure False Mon, 18 Nov 2024 14:01:24 -0500 Sat, 19 Oct 2024 02:17:49 -0400 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Mon, 18 Nov 2024 14:01:24 -0500 Sat, 19 Oct 2024 02:17:49 -0400 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Mon, 18 Nov 2024 14:01:24 -0500 Sat, 19 Oct 2024 02:17:49 -0400 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Mon, 18 Nov 2024 14:01:24 -0500 Sat, 19 Oct 2024 02:17:49 -0400 KubeletReady kubelet is posting ready status
Addresses:
InternalIP: 10.228.88.200
Hostname: mynodezonea1
Capacity:
<trimmed>
Allocatable:
<trimmed>
System Info:
Machine ID: mynodezonea1
System UUID: <trimmed>
Boot ID: <trimmed>
Kernel Version: <trimmed>
OS Image: Windows Server 2022 Standard
Operating System: windows
Architecture: amd64
Container Runtime Version: containerd://1.6.26
Kubelet Version: v1.27.12
Kube-Proxy Version: v1.27.12
PodCIDR: 192.168.236.0/24
PodCIDRs: 192.168.236.0/24
Non-terminated Pods: (18 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
<trimmed>
app myapp-prod-6c94f965f8-6pvhg 250m (1%) 1 (7%) 512M (0%) 1G (1%) 30d
app myapp-stg-c8ccd55b6-2jld6 250m (1%) 1 (7%) 512M (0%) 1G (1%) 30d
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
<trimmed>
As you can see, myapp-prod-6c94f965f8-6pvhg and the other endpoints that I trimmed are also on this node and are reachable from mylinuxnodezoneb9. Only the endpoint belonging to myapp-stg-c8ccd55b6-2jld6 is not.
Here is kubectl describe node for mylinuxnodezoneb9:
k describe node mylinuxnodezoneb9 --cluster=test
Name: mylinuxnodezoneb9
Roles: worker
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=mylinuxnodezoneb9
kubernetes.io/os=linux
node-role.kubernetes.io/worker=worker
topology.kubernetes.io/region=myregion
topology.kubernetes.io/zone=zoneb
Annotations:
node.alpha.kubernetes.io/ttl: 0
projectcalico.org/IPv4Address: 10.219.61.43/24
projectcalico.org/IPv4VXLANTunnelAddr: 192.168.235.64
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: <trimmed>
Taints: <none>
Unschedulable: false
Lease: Failed to get lease: leases.coordination.k8s.io "mylinuxnodezoneb9" is forbidden: User "username" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "kube-node-lease"
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Sun, 03 Nov 2024 21:45:53 -0500 Sun, 03 Nov 2024 21:45:53 -0500 CalicoIsUp Calico is running on this node
MemoryPressure False Mon, 18 Nov 2024 14:11:26 -0500 Sun, 03 Nov 2024 21:45:50 -0500 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Mon, 18 Nov 2024 14:11:26 -0500 Sun, 03 Nov 2024 21:45:50 -0500 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Mon, 18 Nov 2024 14:11:26 -0500 Sun, 03 Nov 2024 21:45:50 -0500 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Mon, 18 Nov 2024 14:11:26 -0500 Sun, 03 Nov 2024 21:45:50 -0500 KubeletReady kubelet is posting ready status
Addresses:
InternalIP: 10.219.61.43
Hostname: mylinuxnodezoneb9
Capacity:
<trimmed>
Allocatable:
<trimmed>
System Info:
Machine ID: <trimmed>
System UUID: <trimmed>
Boot ID: <trimmed>
Kernel Version: <trimmed>
OS Image: Red Hat Enterprise Linux 8.10 (Ootpa)
Operating System: linux
Architecture: amd64
Container Runtime Version: cri-o://1.27.4
Kubelet Version: v1.27.12
Kube-Proxy Version: v1.27.12
PodCIDR: 192.168.224.0/24
PodCIDRs: 192.168.224.0/24
Non-terminated Pods: (37 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
<trimmed>
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
<trimmed>
And the route for the PodCIDR range is correctly registered in the routing table on mylinuxnodezoneb9:
avinesh@mylinuxnodezoneb9:/$ /usr/sbin/ip route | grep "192.168.236.0"
192.168.236.0/26 via 192.168.236.8 dev vxlan.calico onlink
And, as mentioned earlier, the myapp-prod-6c94f965f8-6pvhg endpoint, which is running on that very same node, is reachable from mylinuxnodezoneb9:
k get pods -A --cluster=test -o wide | grep myapp-prod-6c94f965f8-6pvhg
app myapp-prod-6c94f965f8-6pvhg 1/1 Running 0 30d 192.168.230.189 mynodezonea1 <none> <none>
avinesh@mylinuxnodezoneb9:/$ curl 192.168.230.189
avinesh@mylinuxnodezoneb9:/$
So node-to-node networking is not completely broken. I do not use any network policies, and there are no firewall restrictions at the host level.
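For reference, these are the kinds of dataplane checks that can be compared on mylinuxnodezoneb9 between the unreachable and the reachable endpoint (a sketch using standard iproute2 commands; the interface name vxlan.calico and the IPs are taken from the output above):

# Which route each destination resolves to (both should point at vxlan.calico)
/usr/sbin/ip route get 192.168.230.138   # unreachable Pod
/usr/sbin/ip route get 192.168.230.189   # reachable Pod on the same node

# Neighbour and VXLAN forwarding entries for the Calico VXLAN interface
/usr/sbin/ip neigh show dev vxlan.calico
/usr/sbin/bridge fdb show dev vxlan.calico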
Your Environment
Calico version: 3.28.0
Calico dataplane: iptables & windows
Orchestrator version (e.g. kubernetes, mesos, rkt): Kubernetes 1.27
Operating System and version: Linux & Windows Server 2022