Closed taha-adel closed 1 month ago
Calico version: v3.22.2
This is a very old version of Calico that is no longer in support. I recommend upgrading to a modern version.
12:52:27.781606 IP 10.233.0.1.45208 > 10.0.55.68.6443: Flags [S], seq 3350995251, win 65495, options [mss 65495,sackOK,TS val 1474248146 ecr 0,nop,wscale 7], length 0
Could you provide a bit more of the tcpdump output as well as some more information about what each IP address belongs to? e.g., what is 10.0.55.68?
Thank you @caseydavenport for your reply.
Let me further explain this. We have here three master nodes running kube-apiserver with the following IPs:
NAME    STATUS   ROLES                  AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION       CONTAINER-RUNTIME
node1   Ready    control-plane,master   2y67d   v1.23.6   10.0.55.68    <none>        Ubuntu 18.04.6 LTS   4.15.0-213-generic   containerd://1.6.4
node2   Ready    control-plane,master   2y67d   v1.23.6   10.0.55.69    <none>        Ubuntu 18.04.6 LTS   4.15.0-213-generic   containerd://1.6.4
node3   Ready    control-plane,master   2y67d   v1.23.6   10.0.55.70    <none>        Ubuntu 18.04.6 LTS   4.15.0-213-generic   containerd://1.6.4
and here is all the information related to the kube-apiserver service:
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.233.0.1   <none>        443/TCP   2y67d

NAME         ENDPOINTS                                          AGE
kubernetes   10.0.55.68:6443,10.0.55.69:6443,10.0.55.70:6443    2y67d
The issue only occurs when I try to reach the kube-apiserver via its service IP from one of the nodes. When I run telnet 10.233.0.1 443 from node1, I get a successful connection once every three tries (the first two tries time out and the third connects). The request that succeeds is the one routed to node1 itself.
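The one-in-three pattern is easy to quantify with a quick connect loop run from node1. This is a sketch using bash's built-in /dev/tcp; the default host and port (10.233.0.1:443) are the service IP from this thread, and both can be overridden as positional arguments:

```shell
#!/usr/bin/env bash
# Attempt several TCP connects to the kube-apiserver service IP and count
# how many connect vs. fail. Defaults match the cluster in this thread.
host="${1:-10.233.0.1}"
port="${2:-443}"
ok=0
fail=0
for i in 1 2 3 4 5 6; do
  if timeout 1 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
    ok=$((ok + 1))
    echo "try ${i}: connected"
  else
    fail=$((fail + 1))
    echo "try ${i}: timed out / refused"
  fi
done
echo "connected=${ok} failed=${fail}"
```

On the cluster described here you would expect roughly one "connected" per three tries, matching the three round-robin endpoints behind the service.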
Actually, I'm not sure what more information I can get from tcpdump. The only meaningful information I can extract is the source IP/port and the destination IP/port. Anyway, here are more packet captures:
06:16:50.972450 IP 10.233.0.1.35700 > 10.0.55.69.6443: Flags [S], seq 4034289792, win 65495, options [mss 65495,sackOK,TS val 1049327718 ecr 0,nop,wscale 7], length 0
E..<..@.?.9.
...
.7E.t.+.v`.........L].........
>.xf........
06:16:51.432410 IP 10.233.0.1.49798 > 10.0.55.68.6443: Flags [S], seq 1006463059, win 65495, options [mss 65495,sackOK,TS val 1536912646 ecr 0,nop,wscale 7], length 0
E..<..@.?...
...
.7D...+;.hS.........2.........
[.m.........
06:16:51.976770 IP 10.233.0.1.35700 > 10.0.55.69.6443: Flags [S], seq 4034289792, win 65495, options [mss 65495,sackOK,TS val 1049328722 ecr 0,nop,wscale 7], length 0
E..<..@.?.9.
...
.7E.t.+.v`.........L].........
>.|R........
06:16:53.311681 IP 10.0.55.68.27432 > 10.0.55.69.6443: Flags [P.], seq 218:264, ack 37214, win 1513, options [nop,nop,TS val 2479777417 ecr 3120908656], length 46
E..bR.@.?.f}
.7D
.7Ek(.+.>.?;V.............
..f...Ip....).............r..p|..F...L.......i,..Y....
06:16:53.311965 IP 10.0.55.69.6443 > 10.0.55.68.27432: Flags [.], ack 264, win 501, options [nop,nop,TS val 3120912778 ecr 2479777417], length 0
E..4..@.@...
.7E
.7D.+k(;V...>.m.....k.....
..Y...f.
06:16:53.314658 IP 10.0.55.69.6443 > 10.0.55.68.27432: Flags [P.], seq 37214:37310, ack 264, win 501, options [nop,nop,TS val 3120912781 ecr 2479777417], length 96
E.....@.@..W
.7E
.7D.+k(;V...>.m...........
..Y...f.....[.....!...A...y.J..........R.H.....o.........B............I{.....2./..q.;M...u...F..L......3
06:16:53.314680 IP 10.0.55.69.6443 > 10.0.55.68.27432: Flags [P.], seq 37310:37674, ack 264, win 501, options [nop,nop,TS val 3120912781 ecr 2479777417], length 364
E.....@.@..J
.7E
.7D.+k(;V...>.m...........
..Y...f.....g.....!....`^....-@....Ja}<.+...[...@......7[..>..}.B1.. D.u.6...&PK...{....}.].q.....?2".......tg.1..[....5...!?.(..*L...L..N..0Z.PM.=....@..%.n..Y..F..`O.",.).O(...).l.~...h.Ho%...&.+M.o.Zp?..S.........H..z.....L..]...v....a..j.......~.j.....$.
I get a successful connection once every three tries (the first two tries time out and the third connects).
It feels relevant that there are three hosts running the apiserver and you get a response every three tries. That sounds like round-robin load balancing, where only the requests forwarded to the local API server succeed.
Are you by chance using the kube-proxy in IPVS mode?
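A quick way to check which mode kube-proxy is running in without reading its manifests (a sketch; it assumes kube-proxy's default metrics address on localhost:10249, which serves the active mode at /proxyMode):

```shell
# Query kube-proxy's /proxyMode endpoint; prints "ipvs" or "iptables".
# Falls back to "unknown" if the metrics port is not reachable here.
mode="$(curl -s --max-time 2 http://localhost:10249/proxyMode || true)"
[ -n "${mode}" ] || mode="unknown"
echo "kube-proxy mode: ${mode}"
```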
To figure out where the NAT might be occurring, I'd look at the output of iptables-save -c
on node1
while sending traffic and check for incrementing counters on rules with SNAT
or MASQUERADE
actions. Note that if you're using IPVS kube-proxy, this will be less effective since the load balancing will be performed in IPVS instead.
@caseydavenport Yes I'm using Kube-proxy in IPVS mode.
$ ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.233.0.1:443 rr
-> 10.0.55.68:6443 Masq 1 1 0
-> 10.0.55.69:6443 Masq 1 6 0
-> 10.0.55.70:6443 Masq 1 1 0
I'm afraid I'm not much of an expert on the IPVS kube-proxy, but it looks like the NAT you're seeing is happening in IPVS based on the Masq
denotation in the output you provided. Perhaps there's an IPVS proxy configuration option to turn off that masquerade?
That said, I am not sure the MASQ is necessarily a problem - it could just be a cross-node connectivity issue (e.g., security group configuration) blocking one node from talking to the others.
Generally I advise against using IPVS unless it's absolutely necessary - it doesn't have a lot of support upstream in k8s these days.
Issue resolved after restarting the nodes.
Cluster Description
I have a three-node Kubernetes cluster deployed via the Kubespray utility. All control plane components are deployed in the node network via the hostNetwork: true parameter. Here is some information about the cluster networking.
Issue Description
When I run any pod, Calico assigns an IP address to the pod and Kubelet tries to reach the kube-apiserver via its service IP, but the connection times out, leaving the pod in the ContainerCreating state. I used tcpdump to check the packets between Kubelet and the kube-apiserver, and I found that all packets are SNATed to the service IP, as shown below.
Expected Behavior
The destination IP is DNATed to the IP of the service endpoint and the source IP is kept as the node IP.
Current Behavior
The destination IP is DNATed to the IP of the service endpoint, but the source IP is SNATed to the service IP.
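The SNAT can be observed directly on a node with a capture along these lines (a sketch; it requires root, and the port and filter come from the endpoints listed in this thread):

```shell
# Capture a few SYNs headed for the apiserver endpoints (port 6443).
# A source address of 10.233.0.1 (the service IP) instead of the node's
# own IP is the unexpected SNAT described in this issue.
# Skips cleanly when tcpdump is unavailable or we are not root.
filter='tcp[tcpflags] & tcp-syn != 0 and dst port 6443'
if command -v tcpdump >/dev/null 2>&1 && [ "$(id -u)" -eq 0 ]; then
  timeout 10 tcpdump -ni any -c 5 "${filter}" || true
fi
```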
Steps to Reproduce (for bugs)
Run telnet 10.233.0.1 443 from one of the nodes and observe that only one in three attempts connects.