Closed ja-mitiankai closed 2 months ago
@ja-mitiankai nothing is standing out to me as especially wrong here - logs look normal. What is the main symptom that you're trying to diagnose?
Warning Unhealthy 29m kubelet Readiness probe failed: calico/node is not ready: BIRD is not ready: Error querying BIRD: unable to connect to BIRDv4 socket: dial unix /var/run/calico/bird.ctl: connect: connection refused
This looks like it happened 29m ago but hasn't happened since, and is a potentially normal transient event as part of bootstrapping, so long as it doesn't persist. Considering your BGP status is "Established", it looks like everything is working correctly.
72 packets transmitted, 0 packets received, 100% packet loss
I would expect a pod to be able to ping this IP, so that's potentially suspicious. Are other pods having trouble communicating?
@caseydavenport I can't communicate across nodes now. Calico looks fine, but it isn't actually working properly. I want to try using tcpdump to capture packets. I'm sorry, I don't know much about tcpdump.
node1 ip: 172.17.114.24 (master)
node2 ip: 172.17.114.22 (worker)
The pod I used for the ping test is running on node1 (name: debug, ip: 192.192.125.143). The command was: kubectl run -i --tty --rm debug --image=busybox --restart=Never -- ping 192.192.134.72
192.192.134.72 is the IP of csi-node-driver-9zbv7, which is running on node2.
route at node1:
192.192.125.128 0.0.0.0 255.255.255.192 U 0 0 0 *
192.192.125.137 0.0.0.0 255.255.255.255 UH 0 0 0 cali5331bf8f64b
192.192.125.138 0.0.0.0 255.255.255.255 UH 0 0 0 cali2f01c522b60
192.192.125.139 0.0.0.0 255.255.255.255 UH 0 0 0 calie5460d1dd77
192.192.125.140 0.0.0.0 255.255.255.255 UH 0 0 0 cali726ebf83671
192.192.125.143 0.0.0.0 255.255.255.255 UH 0 0 0 calie3ba9b96e53
192.192.134.64 172.17.114.22 255.255.255.192 UG 0 0 0 eth0
ip route at node1:
blackhole 192.192.125.128/26 proto 80
192.192.125.137 dev cali5331bf8f64b scope link
192.192.125.138 dev cali2f01c522b60 scope link
192.192.125.139 dev calie5460d1dd77 scope link
192.192.125.140 dev cali726ebf83671 scope link
192.192.125.143 dev calie3ba9b96e53 scope link
192.192.134.64/26 via 172.17.114.22 dev eth0 proto 80 onlink
route at node2:
192.192.125.128 172.17.114.24 255.255.255.192 UG 0 0 0 eth0
192.192.134.64 0.0.0.0 255.255.255.192 U 0 0 0 *
192.192.134.71 0.0.0.0 255.255.255.255 UH 0 0 0 califdd333fac30
192.192.134.72 0.0.0.0 255.255.255.255 UH 0 0 0 calie923cf83e08
192.192.134.73 0.0.0.0 255.255.255.255 UH 0 0 0 cali73b51134a71
ip route at node2:
192.192.125.128/26 via 172.17.114.24 dev eth0 proto 80 onlink
blackhole 192.192.134.64/26 proto 80
192.192.134.71 dev califdd333fac30 scope link
192.192.134.72 dev calie923cf83e08 scope link
192.192.134.73 dev cali73b51134a71 scope link
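One way to read these tables is as a longest-prefix-match lookup. The sketch below uses only Python's standard ipaddress module and a hand-copied subset of node1's ip route output; the names NODE1_ROUTES and lookup are hypothetical helpers for illustration, not part of Calico or iproute2.

```python
import ipaddress

# Subset of "ip route" on node1, copied by hand from the output above.
# Each entry is (prefix, device, next hop); host routes are written as /32.
NODE1_ROUTES = [
    ("192.192.125.128/26", "blackhole", None),
    ("192.192.125.143/32", "calie3ba9b96e53", None),
    ("192.192.134.64/26", "eth0", "172.17.114.22"),
]


def lookup(dest, routes):
    """Return (prefix, device, next_hop) of the most specific matching route."""
    addr = ipaddress.ip_address(dest)
    matches = [
        (ipaddress.ip_network(prefix), dev, via)
        for prefix, dev, via in routes
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    # Longest prefix wins, as in the kernel routing table.
    net, dev, via = max(matches, key=lambda m: m[0].prefixlen)
    return str(net), dev, via


# The ping target on node2 should leave node1 via eth0 towards node2.
print(lookup("192.192.134.72", NODE1_ROUTES))
# The local debug pod matches its own /32 route rather than the /26 blackhole.
print(lookup("192.192.125.143", NODE1_ROUTES))
```

Under these assumptions, the lookup for 192.192.134.72 selects the 192.192.134.64/26 route via 172.17.114.22 on eth0, so node1's routing for the ping target looks correct.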
The following are the operations on node1.
I ran tcpdump -i calie3ba9b96e53 -nn (calie3ba9b96e53 is the interface for 192.192.125.143):
tcpdump -i calie3ba9b96e53 -nn
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on calie3ba9b96e53, link-type EN10MB (Ethernet), capture size 262144 bytes
12:19:12.173454 IP 192.192.125.143 > 192.192.134.72: ICMP echo request, id 1, seq 4462, length 64
12:19:13.173573 IP 192.192.125.143 > 192.192.134.72: ICMP echo request, id 1, seq 4463, length 64
12:19:14.173701 IP 192.192.125.143 > 192.192.134.72: ICMP echo request, id 1, seq 4464, length 64
12:19:15.173793 IP 192.192.125.143 > 192.192.134.72: ICMP echo request, id 1, seq 4465, length 64
12:19:16.173884 IP 192.192.125.143 > 192.192.134.72: ICMP echo request, id 1, seq 4466, length 64
12:19:17.173975 IP 192.192.125.143 > 192.192.134.72: ICMP echo request, id 1, seq 4467, length 64
12:19:18.174075 IP 192.192.125.143 > 192.192.134.72: ICMP echo request, id 1, seq 4468, length 64
12:19:19.174168 IP 192.192.125.143 > 192.192.134.72: ICMP echo request, id 1, seq 4469, length 64
12:19:20.174274 IP 192.192.125.143 > 192.192.134.72: ICMP echo request, id 1, seq 4470, length 64
12:19:21.174391 IP 192.192.125.143 > 192.192.134.72: ICMP echo request, id 1, seq 4471, length 64
12:19:22.174484 IP 192.192.125.143 > 192.192.134.72: ICMP echo request, id 1, seq 4472, length 64
12:19:22.420710 ARP, Request who-has 169.254.1.1 tell 192.192.125.143, length 28
12:19:22.420740 ARP, Reply 169.254.1.1 is-at ee:ee:ee:ee:ee:ee, length 28
12:19:23.174589 IP 192.192.125.143 > 192.192.134.72: ICMP echo request, id 1, seq 4473, length 64
12:19:24.174711 IP 192.192.125.143 > 192.192.134.72: ICMP echo request, id 1, seq 4474, length 64
I ran tcpdump -i eth0 icmp -nn | grep 192.192.125.143 (eth0 carries the route 192.192.134.64 172.17.114.22 255.255.255.192 UG 0 0 0 eth0):
tcpdump -i eth0 icmp -nn | grep 192.192.125.143
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
12:19:57.178169 IP 192.192.125.143 > 192.192.134.72: ICMP echo request, id 1, seq 4507, length 64
12:19:58.178264 IP 192.192.125.143 > 192.192.134.72: ICMP echo request, id 1, seq 4508, length 64
12:19:59.178383 IP 192.192.125.143 > 192.192.134.72: ICMP echo request, id 1, seq 4509, length 64
12:20:00.178462 IP 192.192.125.143 > 192.192.134.72: ICMP echo request, id 1, seq 4510, length 64
12:20:01.178553 IP 192.192.125.143 > 192.192.134.72: ICMP echo request, id 1, seq 4511, length 64
12:20:02.178654 IP 192.192.125.143 > 192.192.134.72: ICMP echo request, id 1, seq 4512, length 64
12:20:03.178772 IP 192.192.125.143 > 192.192.134.72: ICMP echo request, id 1, seq 4513, length 64
12:20:04.178855 IP 192.192.125.143 > 192.192.134.72: ICMP echo request, id 1, seq 4514, length 64
12:20:05.178947 IP 192.192.125.143 > 192.192.134.72: ICMP echo request, id 1, seq 4515, length 64
12:20:06.179033 IP 192.192.125.143 > 192.192.134.72: ICMP echo request, id 1, seq 4516, length 64
I don't know where to go next using tcpdump.
The following are the operations on node2.
I ran tcpdump -i eth0 icmp -nn (eth0 carries the route 192.192.125.128 172.17.114.24 255.255.255.192 UG 0 0 0 eth0) and also tcpdump -i calie923cf83e08 icmp -nn (calie923cf83e08 is the interface for 192.192.134.72).
Neither produced any output.
It seems the packets go missing somewhere in the middle, but I don't know where else to run tcpdump.
Yep, sounds like the traffic is leaving the source node properly and then being dropped by your network fabric somewhere in the middle before reaching node2.
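The reasoning here can be written down as a small check over the capture points in path order. This is only an illustrative sketch: the CAPTURES list restates the tcpdump results reported in this thread, and first_gap is a hypothetical helper name.

```python
from typing import List, Optional, Tuple


def first_gap(observations: List[Tuple[str, bool]]) -> Optional[Tuple[str, str]]:
    """Return (last point that saw the packets, first point that did not).

    observations must be ordered along the packet's path; returns None
    if there is no seen-then-missing transition.
    """
    for (prev_name, prev_seen), (next_name, next_seen) in zip(
        observations, observations[1:]
    ):
        if prev_seen and not next_seen:
            return prev_name, next_name
    return None


# Capture points along the path from the debug pod to 192.192.134.72,
# with whether tcpdump showed the ICMP echo requests there.
CAPTURES = [
    ("node1 calie3ba9b96e53", True),
    ("node1 eth0", True),
    ("node2 eth0", False),
    ("node2 calie923cf83e08", False),
]

print(first_gap(CAPTURES))
```

With the results from this thread, the gap falls between node1 eth0 and node2 eth0, which is the network between the two nodes rather than either host's pod interfaces.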
Where is this environment running?
Also just to confirm, it looks like you don't have any encapsulation enabled on your Calico IP pools - is that right?
I changed the encapsulation in the custom-resources.yaml file from VXLANCrossSubnet to VXLAN, and my problem was solved. Thank you very much for your help, @caseydavenport.
Cool, glad that helped. For some additional information:
That likely means your nodes were in the same subnet, so VXLANCrossSubnet sent the traffic between them unencapsulated, and something was dropping it (e.g., reverse path filtering or AWS src/dst checks) because the packets leaving the node carried pod IPs as their source and destination.
If you're on Amazon, that would be something like this: https://docs.tigera.io/calico/latest/reference/public-cloud/aws#routing-traffic-within-a-single-vpc-subnet
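To illustrate the cross-subnet behaviour: in VXLANCrossSubnet mode, Calico encapsulates only traffic between nodes in different subnets. A rough sketch of that decision with Python's standard ipaddress module is below. The /24 prefix length is an assumption, since the nodes' subnet mask was not posted in this thread, and both function names are hypothetical.

```python
import ipaddress


def same_subnet(ip_a: str, ip_b: str, prefix_len: int) -> bool:
    """True if ip_b falls inside the subnet of ip_a at the given prefix length."""
    net_a = ipaddress.ip_interface(f"{ip_a}/{prefix_len}").network
    return ipaddress.ip_address(ip_b) in net_a


def cross_subnet_encapsulates(ip_a: str, ip_b: str, prefix_len: int) -> bool:
    """Model of the CrossSubnet rule: encapsulate only across subnet boundaries."""
    return not same_subnet(ip_a, ip_b, prefix_len)


NODE1 = "172.17.114.24"  # master, from the thread
NODE2 = "172.17.114.22"  # worker, from the thread
ASSUMED_PREFIX_LEN = 24  # assumption: the real node netmask was not posted

print(cross_subnet_encapsulates(NODE1, NODE2, ASSUMED_PREFIX_LEN))
```

Under that assumption the two nodes share a subnet, so the pod-to-pod packets cross the wire with pod IPs in their headers, which matches the tcpdump output on node1's eth0. Switching to VXLAN wraps them in packets addressed between the node IPs.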
OK, thanks for the additional information.
Expected Behavior
I expect the pods in my k8s cluster to be able to communicate with each other.
Current Behavior
My previous k8s cluster was running normally. I wanted to re-divide the network segments, so I reset the cluster and re-initialized it, but then ran into a problem: Calico appeared to be normal, but pods could not communicate with each other. Running kubectl describe pod -n calico-system calico-node-7cwl2 shows warnings, but they disappear after a while and the pod returns to normal.
The following is part of calico-node log:
Below is the log of calico-apiserver:
Using calicoctl node status on master:
Using calicoctl node status on worker:
Possible Solution
I have no idea
Steps to Reproduce (for bugs)
Just a normal cluster deployment:
kubeadm init --config=my-config.yaml
kubectl create -f tigera-operator.yaml
kubectl create -f custom-resources.yaml
kubectl describe pod -n calico-system calico-node-7cwl2
kubectl run -i --tty --rm debug --image=busybox --restart=Never -- ping [calico-apiserver-ip]
You can see
Your Environment