Closed: apiening closed this issue 2 years ago.
Sounds like something in your environment is blocking the outbound traffic from the pod network. Have you disabled firewalld, ufw, or anything else that might be interfering with the iptables rules added by the kubelet and CNI? Is there anything odd on your network with regards to MTU or multiple interface configurations?
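For the MTU angle specifically, one quick check from the host is a don't-fragment ping; this is just a sketch, and the 1472-byte payload is simply what fills a standard 1500-byte MTU once the 28 bytes of ICMP/IP headers are added:
# Fails with "message too long" if the path MTU is below 1500
ping -M do -c 3 -s 1472 8.8.8.8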
Hi @brandond, this is a pretty clean install with no firewalls installed or enabled.
# ufw status
Status: inactive
The VM has one single interface that is connected to the local network. I have no connectivity issues from the host itself.
I'm still suffering from this issue.
In order to analyze the connectivity issue I deployed a busybox
container and checked the routes:
# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.42.0.1 0.0.0.0 UG 0 0 0 eth0
10.42.0.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
10.42.0.0 10.42.0.1 255.255.0.0 UG 0 0 0 eth0
I can ping the gateway:
# ping 10.42.0.1
PING 10.42.0.1 (10.42.0.1): 56 data bytes
64 bytes from 10.42.0.1: seq=0 ttl=64 time=0.048 ms
I can even ping external / public IPs, which I honestly did not expect:
# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=0 ttl=58 time=5.147 ms
But HTTPS GET requests don't work:
# time wget https://www.google.com/
Connecting to www.google.com (142.250.185.228:443)
wget: can't connect to remote host (142.250.185.228): Operation timed out
Command exited with non-zero status 1
real 2m 11.31s
user 0m 0.00s
sys 0m 0.00s
Conclusion: I can ping public addresses and DNS is working, but I cannot make HTTPS requests. Any idea what may cause this?
I found an issue that I think is related to mine: https://github.com/k3s-io/k3s/issues/763
The issue has been closed even though no solution was provided; instead, cilium was used in place of flannel as a workaround.
Is flannel generally not advised as the CNI of choice?
Flannel works fine in pretty much every environment. Those that choose an alternative usually do so in search of a specific feature, not because flannel doesn't work.
Have you done a tcpdump on the host to see what's happening? Do you see the traffic leaving the physical interface? Do you see response packets coming back in? Can you confirm that there's not something outside this host (firewall, etc) blocking your HTTP requests while allowing ICMP and DNS?
Thank you for your hint @brandond: I've done a simple capture with tcpdump -i cni0 -nn -s0 -v -l host <container-ip> and observed that the hostname www.google.com resolved to an IPv6 address. However, the host only had a link-local address and no IPv6 connectivity to the public internet.
I then disabled IPv6 on the host, rebooted, and re-created the busybox container; the command wget https://www.google.com ran successfully once. All following tries failed, but every once in a while a request works.
I don't understand this, and honestly I'm running out of ideas what I can check or try next.
I have attached the tcpdump output captured while running wget "https://github.com/samdoran/demo-playbooks". I hope you can spot something that may cause this strange issue, since I'm not very experienced in reading tcpdump output.
21:51:56.847602 IP (tos 0x0, ttl 64, id 52459, offset 0, flags [DF], proto UDP (17), length 79)
10.42.0.31.36415 > 10.42.0.23.53: 57082+ A? github.com.test.svc.cluster.local. (51)
21:51:56.847604 IP (tos 0x0, ttl 63, id 52459, offset 0, flags [DF], proto UDP (17), length 79)
10.42.0.31.36415 > 10.42.0.23.53: 57082+ A? github.com.test.svc.cluster.local. (51)
21:51:56.847736 IP (tos 0x0, ttl 64, id 22395, offset 0, flags [DF], proto UDP (17), length 172)
10.42.0.23.53 > 10.42.0.31.36415: 57082 NXDomain*- 0/1/0 (144)
21:51:56.847757 IP (tos 0x0, ttl 64, id 22396, offset 0, flags [DF], proto UDP (17), length 172)
10.42.0.23.53 > 10.42.0.31.36415: 57082 NXDomain*- 0/1/0 (144)
21:51:56.847769 IP (tos 0x0, ttl 64, id 52460, offset 0, flags [DF], proto UDP (17), length 79)
10.42.0.31.36415 > 10.42.0.23.53: 57393+ AAAA? github.com.test.svc.cluster.local. (51)
21:51:56.847769 IP (tos 0x0, ttl 63, id 52460, offset 0, flags [DF], proto UDP (17), length 79)
10.42.0.31.36415 > 10.42.0.23.53: 57393+ AAAA? github.com.test.svc.cluster.local. (51)
21:51:56.847837 IP (tos 0x0, ttl 64, id 22397, offset 0, flags [DF], proto UDP (17), length 172)
10.42.0.23.53 > 10.42.0.31.36415: 57393 NXDomain*- 0/1/0 (144)
21:51:56.847873 IP (tos 0x0, ttl 64, id 52461, offset 0, flags [DF], proto UDP (17), length 74)
10.42.0.31.48389 > 10.42.0.23.53: 35453+ A? github.com.svc.cluster.local. (46)
21:51:56.847874 IP (tos 0x0, ttl 64, id 22398, offset 0, flags [DF], proto UDP (17), length 172)
10.42.0.23.53 > 10.42.0.31.36415: 57393 NXDomain*- 0/1/0 (144)
21:51:56.847875 IP (tos 0x0, ttl 63, id 52461, offset 0, flags [DF], proto UDP (17), length 74)
10.42.0.31.48389 > 10.42.0.23.53: 35453+ A? github.com.svc.cluster.local. (46)
21:51:56.847881 IP (tos 0x0, ttl 64, id 52462, offset 0, flags [DF], proto UDP (17), length 74)
10.42.0.31.48389 > 10.42.0.23.53: 35623+ AAAA? github.com.svc.cluster.local. (46)
21:51:56.847882 IP (tos 0x0, ttl 63, id 52462, offset 0, flags [DF], proto UDP (17), length 74)
10.42.0.31.48389 > 10.42.0.23.53: 35623+ AAAA? github.com.svc.cluster.local. (46)
21:51:56.847884 IP (tos 0xc0, ttl 64, id 45260, offset 0, flags [none], proto ICMP (1), length 200)
10.42.0.31 > 10.42.0.23: ICMP 10.42.0.31 udp port 36415 unreachable, length 180
IP (tos 0x0, ttl 64, id 22398, offset 0, flags [DF], proto UDP (17), length 172)
10.42.0.23.53 > 10.42.0.31.36415: 57393 NXDomain*- 0/1/0 (144)
21:51:56.847885 IP (tos 0xc0, ttl 63, id 45260, offset 0, flags [none], proto ICMP (1), length 200)
10.42.0.31 > 10.42.0.23: ICMP 10.42.0.31 udp port 36415 unreachable, length 180
IP (tos 0x0, ttl 64, id 22398, offset 0, flags [DF], proto UDP (17), length 172)
10.42.0.23.53 > 10.42.0.31.36415: 57393 NXDomain*- 0/1/0 (144)
21:51:56.847943 IP (tos 0x0, ttl 64, id 22399, offset 0, flags [DF], proto UDP (17), length 167)
10.42.0.23.53 > 10.42.0.31.48389: 35623 NXDomain*- 0/1/0 (139)
21:51:56.847996 IP (tos 0x0, ttl 64, id 22400, offset 0, flags [DF], proto UDP (17), length 167)
10.42.0.23.53 > 10.42.0.31.48389: 35453 NXDomain*- 0/1/0 (139)
21:51:56.848047 IP (tos 0x0, ttl 64, id 22401, offset 0, flags [DF], proto UDP (17), length 167)
10.42.0.23.53 > 10.42.0.31.48389: 35453 NXDomain*- 0/1/0 (139)
21:51:56.848047 IP (tos 0x0, ttl 64, id 52463, offset 0, flags [DF], proto UDP (17), length 70)
10.42.0.31.40711 > 10.42.0.23.53: 3858+ A? github.com.cluster.local. (42)
21:51:56.848050 IP (tos 0x0, ttl 63, id 52463, offset 0, flags [DF], proto UDP (17), length 70)
10.42.0.31.40711 > 10.42.0.23.53: 3858+ A? github.com.cluster.local. (42)
21:51:56.848054 IP (tos 0xc0, ttl 64, id 45261, offset 0, flags [none], proto ICMP (1), length 195)
10.42.0.31 > 10.42.0.23: ICMP 10.42.0.31 udp port 48389 unreachable, length 175
IP (tos 0x0, ttl 64, id 22401, offset 0, flags [DF], proto UDP (17), length 167)
10.42.0.23.53 > 10.42.0.31.48389: 35453 NXDomain*- 0/1/0 (139)
21:51:56.848056 IP (tos 0xc0, ttl 63, id 45261, offset 0, flags [none], proto ICMP (1), length 195)
10.42.0.31 > 10.42.0.23: ICMP 10.42.0.31 udp port 48389 unreachable, length 175
IP (tos 0x0, ttl 64, id 22401, offset 0, flags [DF], proto UDP (17), length 167)
10.42.0.23.53 > 10.42.0.31.48389: 35453 NXDomain*- 0/1/0 (139)
21:51:56.848059 IP (tos 0x0, ttl 64, id 52464, offset 0, flags [DF], proto UDP (17), length 70)
10.42.0.31.40711 > 10.42.0.23.53: 4148+ AAAA? github.com.cluster.local. (42)
21:51:56.848062 IP (tos 0x0, ttl 63, id 52464, offset 0, flags [DF], proto UDP (17), length 70)
10.42.0.31.40711 > 10.42.0.23.53: 4148+ AAAA? github.com.cluster.local. (42)
21:51:56.848086 IP (tos 0x0, ttl 64, id 22402, offset 0, flags [DF], proto UDP (17), length 167)
10.42.0.23.53 > 10.42.0.31.48389: 35623 NXDomain*- 0/1/0 (139)
21:51:56.848092 IP (tos 0xc0, ttl 64, id 45262, offset 0, flags [none], proto ICMP (1), length 195)
10.42.0.31 > 10.42.0.23: ICMP 10.42.0.31 udp port 48389 unreachable, length 175
IP (tos 0x0, ttl 64, id 22402, offset 0, flags [DF], proto UDP (17), length 167)
10.42.0.23.53 > 10.42.0.31.48389: 35623 NXDomain*- 0/1/0 (139)
21:51:56.848093 IP (tos 0xc0, ttl 63, id 45262, offset 0, flags [none], proto ICMP (1), length 195)
10.42.0.31 > 10.42.0.23: ICMP 10.42.0.31 udp port 48389 unreachable, length 175
IP (tos 0x0, ttl 64, id 22402, offset 0, flags [DF], proto UDP (17), length 167)
10.42.0.23.53 > 10.42.0.31.48389: 35623 NXDomain*- 0/1/0 (139)
21:51:56.848143 IP (tos 0x0, ttl 64, id 22403, offset 0, flags [DF], proto UDP (17), length 163)
10.42.0.23.53 > 10.42.0.31.40711: 4148 NXDomain*- 0/1/0 (135)
21:51:56.848192 IP (tos 0x0, ttl 64, id 22404, offset 0, flags [DF], proto UDP (17), length 163)
10.42.0.23.53 > 10.42.0.31.40711: 3858 NXDomain*- 0/1/0 (135)
21:51:56.848221 IP (tos 0x0, ttl 64, id 22405, offset 0, flags [DF], proto UDP (17), length 163)
10.42.0.23.53 > 10.42.0.31.40711: 3858 NXDomain*- 0/1/0 (135)
21:51:56.848222 IP (tos 0x0, ttl 64, id 52465, offset 0, flags [DF], proto UDP (17), length 56)
10.42.0.31.56357 > 10.42.0.23.53: 52336+ A? github.com. (28)
21:51:56.848224 IP (tos 0x0, ttl 63, id 52465, offset 0, flags [DF], proto UDP (17), length 56)
10.42.0.31.56357 > 10.42.0.23.53: 52336+ A? github.com. (28)
21:51:56.848228 IP (tos 0xc0, ttl 64, id 45263, offset 0, flags [none], proto ICMP (1), length 191)
10.42.0.31 > 10.42.0.23: ICMP 10.42.0.31 udp port 40711 unreachable, length 171
IP (tos 0x0, ttl 64, id 22405, offset 0, flags [DF], proto UDP (17), length 163)
10.42.0.23.53 > 10.42.0.31.40711: 3858 NXDomain*- 0/1/0 (135)
21:51:56.848229 IP (tos 0xc0, ttl 63, id 45263, offset 0, flags [none], proto ICMP (1), length 191)
10.42.0.31 > 10.42.0.23: ICMP 10.42.0.31 udp port 40711 unreachable, length 171
IP (tos 0x0, ttl 64, id 22405, offset 0, flags [DF], proto UDP (17), length 163)
10.42.0.23.53 > 10.42.0.31.40711: 3858 NXDomain*- 0/1/0 (135)
21:51:56.848233 IP (tos 0x0, ttl 64, id 52466, offset 0, flags [DF], proto UDP (17), length 56)
10.42.0.31.56357 > 10.42.0.23.53: 52556+ AAAA? github.com. (28)
21:51:56.848235 IP (tos 0x0, ttl 63, id 52466, offset 0, flags [DF], proto UDP (17), length 56)
10.42.0.31.56357 > 10.42.0.23.53: 52556+ AAAA? github.com. (28)
21:51:56.848255 IP (tos 0x0, ttl 64, id 22406, offset 0, flags [DF], proto UDP (17), length 163)
10.42.0.23.53 > 10.42.0.31.40711: 4148 NXDomain*- 0/1/0 (135)
21:51:56.848261 IP (tos 0xc0, ttl 64, id 45264, offset 0, flags [none], proto ICMP (1), length 191)
10.42.0.31 > 10.42.0.23: ICMP 10.42.0.31 udp port 40711 unreachable, length 171
IP (tos 0x0, ttl 64, id 22406, offset 0, flags [DF], proto UDP (17), length 163)
10.42.0.23.53 > 10.42.0.31.40711: 4148 NXDomain*- 0/1/0 (135)
21:51:56.848262 IP (tos 0xc0, ttl 63, id 45264, offset 0, flags [none], proto ICMP (1), length 191)
10.42.0.31 > 10.42.0.23: ICMP 10.42.0.31 udp port 40711 unreachable, length 171
IP (tos 0x0, ttl 64, id 22406, offset 0, flags [DF], proto UDP (17), length 163)
10.42.0.23.53 > 10.42.0.31.40711: 4148 NXDomain*- 0/1/0 (135)
21:51:56.848647 IP (tos 0x0, ttl 64, id 22407, offset 0, flags [DF], proto UDP (17), length 153)
10.42.0.23.53 > 10.42.0.31.56357: 52556 0/1/0 (125)
21:51:56.848703 IP (tos 0x0, ttl 64, id 22408, offset 0, flags [DF], proto UDP (17), length 430)
10.42.0.23.53 > 10.42.0.31.56357: 52336 1/8/0 github.com. A 140.82.121.3 (402)
21:51:56.848755 IP (tos 0x0, ttl 64, id 22409, offset 0, flags [DF], proto UDP (17), length 430)
10.42.0.23.53 > 10.42.0.31.56357: 52336 1/8/0 github.com. A 140.82.121.3 (402)
21:51:56.848760 IP (tos 0x0, ttl 64, id 51637, offset 0, flags [DF], proto TCP (6), length 60)
10.42.0.31.48018 > 140.82.121.3.443: Flags [S], cksum 0x0fcd (incorrect -> 0x3d2b), seq 4244803833, win 64860, options [mss 1410,sackOK,TS val 1811957441 ecr 0,nop,wscale 7], length 0
21:51:56.848762 IP (tos 0xc0, ttl 64, id 45265, offset 0, flags [none], proto ICMP (1), length 458)
10.42.0.31 > 10.42.0.23: ICMP 10.42.0.31 udp port 56357 unreachable, length 438
IP (tos 0x0, ttl 64, id 22409, offset 0, flags [DF], proto UDP (17), length 430)
10.42.0.23.53 > 10.42.0.31.56357: 52336 1/8/0 github.com. A 140.82.121.3 (402)
21:51:56.848763 IP (tos 0xc0, ttl 63, id 45265, offset 0, flags [none], proto ICMP (1), length 458)
10.42.0.31 > 10.42.0.23: ICMP 10.42.0.31 udp port 56357 unreachable, length 438
IP (tos 0x0, ttl 64, id 22409, offset 0, flags [DF], proto UDP (17), length 430)
10.42.0.23.53 > 10.42.0.31.56357: 52336 1/8/0 github.com. A 140.82.121.3 (402)
21:51:56.848795 IP (tos 0x0, ttl 64, id 22410, offset 0, flags [DF], proto UDP (17), length 140)
10.42.0.23.53 > 10.42.0.31.56357: 52556 0/1/0 (112)
21:51:57.882086 IP (tos 0x0, ttl 64, id 51638, offset 0, flags [DF], proto TCP (6), length 60)
10.42.0.31.48018 > 140.82.121.3.443: Flags [S], cksum 0x0fcd (incorrect -> 0x3922), seq 4244803833, win 64860, options [mss 1410,sackOK,TS val 1811958474 ecr 0,nop,wscale 7], length 0
21:51:59.894085 IP (tos 0x0, ttl 64, id 51639, offset 0, flags [DF], proto TCP (6), length 60)
10.42.0.31.48018 > 140.82.121.3.443: Flags [S], cksum 0x0fcd (incorrect -> 0x3146), seq 4244803833, win 64860, options [mss 1410,sackOK,TS val 1811960486 ecr 0,nop,wscale 7], length 0
21:52:01.878099 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.42.0.31 tell 10.42.0.23, length 28
21:52:01.878100 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.42.0.1 tell 10.42.0.31, length 28
21:52:01.878106 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.42.0.1 is-at f2:25:4f:3b:53:2c, length 28
21:52:01.878112 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.42.0.31 is-at ea:10:2a:44:ac:eb, length 28
21:52:03.926110 IP (tos 0x0, ttl 64, id 51640, offset 0, flags [DF], proto TCP (6), length 60)
10.42.0.31.48018 > 140.82.121.3.443: Flags [S], cksum 0x0fcd (incorrect -> 0x2186), seq 4244803833, win 64860, options [mss 1410,sackOK,TS val 1811964518 ecr 0,nop,wscale 7], length 0
21:52:12.118101 IP (tos 0x0, ttl 64, id 51641, offset 0, flags [DF], proto TCP (6), length 60)
10.42.0.31.48018 > 140.82.121.3.443: Flags [S], cksum 0x0fcd (incorrect -> 0x0186), seq 4244803833, win 64860, options [mss 1410,sackOK,TS val 1811972710 ecr 0,nop,wscale 7], length 0
21:52:28.246106 IP (tos 0x0, ttl 64, id 51642, offset 0, flags [DF], proto TCP (6), length 60)
10.42.0.31.48018 > 140.82.121.3.443: Flags [S], cksum 0x0fcd (incorrect -> 0xc285), seq 4244803833, win 64860, options [mss 1410,sackOK,TS val 1811988838 ecr 0,nop,wscale 7], length 0
21:52:33.366090 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.42.0.1 tell 10.42.0.31, length 28
21:52:33.366098 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.42.0.1 is-at f2:25:4f:3b:53:2c, length 28
21:53:00.758141 IP (tos 0x0, ttl 64, id 51643, offset 0, flags [DF], proto TCP (6), length 60)
10.42.0.31.48018 > 140.82.121.3.443: Flags [S], cksum 0x0fcd (incorrect -> 0x4385), seq 4244803833, win 64860, options [mss 1410,sackOK,TS val 1812021350 ecr 0,nop,wscale 7], length 0
21:53:05.878080 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.42.0.1 tell 10.42.0.31, length 28
21:53:05.878085 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.42.0.1 is-at f2:25:4f:3b:53:2c, length 28
I see traffic back and forth between the test pod and the DNS pod, and I see the test pod sending traffic out to GitHub at 140.82.121.3,
but I don't see a reply. Can you check what you see on the actual physical interface? Do you see the response coming back from GitHub? If not, then there's likely something going on outside your node.
Thanks again @brandond,
the host where k3s is running is a VM. Instead of capturing on cni0, I did a capture on eth0, which is the VM's interface and also holds the default route.
Please ignore that the target host jumps between 140.82.121.4 and 140.82.121.3; that is because the github.com domain resolves to different IPs every few attempts.
# tcpdump -i eth0 -nn -s0 -v -l host 140.82.121.4
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
14:45:52.073261 IP (tos 0x0, ttl 63, id 3041, offset 0, flags [DF], proto TCP (6), length 60)
10.164.12.6.22483 > 140.82.121.4.443: Flags [S], cksum 0x1c2f (incorrect -> 0x2a15), seq 2556866272, win 64860, options [mss 1410,sackOK,TS val 2404832403 ecr 0,nop,wscale 7], length 0
14:45:53.078112 IP (tos 0x0, ttl 63, id 3042, offset 0, flags [DF], proto TCP (6), length 60)
10.164.12.6.22483 > 140.82.121.4.443: Flags [S], cksum 0x1c2f (incorrect -> 0x2628), seq 2556866272, win 64860, options [mss 1410,sackOK,TS val 2404833408 ecr 0,nop,wscale 7], length 0
14:45:55.094092 IP (tos 0x0, ttl 63, id 3043, offset 0, flags [DF], proto TCP (6), length 60)
10.164.12.6.22483 > 140.82.121.4.443: Flags [S], cksum 0x1c2f (incorrect -> 0x1e48), seq 2556866272, win 64860, options [mss 1410,sackOK,TS val 2404835424 ecr 0,nop,wscale 7], length 0
14:45:59.254096 IP (tos 0x0, ttl 63, id 3044, offset 0, flags [DF], proto TCP (6), length 60)
10.164.12.6.22483 > 140.82.121.4.443: Flags [S], cksum 0x1c2f (incorrect -> 0x0e08), seq 2556866272, win 64860, options [mss 1410,sackOK,TS val 2404839584 ecr 0,nop,wscale 7], length 0
14:46:07.450109 IP (tos 0x0, ttl 63, id 3045, offset 0, flags [DF], proto TCP (6), length 60)
10.164.12.6.22483 > 140.82.121.4.443: Flags [S], cksum 0x1c2f (incorrect -> 0xee03), seq 2556866272, win 64860, options [mss 1410,sackOK,TS val 2404847780 ecr 0,nop,wscale 7], length 0
I did an additional capture on the host with the actual physical interface, on which the VM with k3s is running. First on the internal bridge vmbr1, to which the VM is attached:
# tcpdump -i vmbr1 -nn -s0 -v -l host 140.82.121.4 [16:51:54]
tcpdump: listening on vmbr1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
16:52:18.878518 IP (tos 0x0, ttl 63, id 64964, offset 0, flags [DF], proto TCP (6), length 60)
10.164.12.6.11288 > 140.82.121.4.443: Flags [S], cksum 0x1c2f (incorrect -> 0xceba), seq 951037618, win 64860, options [mss 1410,sackOK,TS val 2405219208 ecr 0,nop,wscale 7], length 0
16:52:19.894581 IP (tos 0x0, ttl 63, id 64965, offset 0, flags [DF], proto TCP (6), length 60)
10.164.12.6.11288 > 140.82.121.4.443: Flags [S], cksum 0x1c2f (incorrect -> 0xcac2), seq 951037618, win 64860, options [mss 1410,sackOK,TS val 2405220224 ecr 0,nop,wscale 7], length 0
16:52:21.910551 IP (tos 0x0, ttl 63, id 64966, offset 0, flags [DF], proto TCP (6), length 60)
10.164.12.6.11288 > 140.82.121.4.443: Flags [S], cksum 0x1c2f (incorrect -> 0xc2e2), seq 951037618, win 64860, options [mss 1410,sackOK,TS val 2405222240 ecr 0,nop,wscale 7], length 0
16:52:26.070570 IP (tos 0x0, ttl 63, id 64967, offset 0, flags [DF], proto TCP (6), length 60)
10.164.12.6.11288 > 140.82.121.4.443: Flags [S], cksum 0x1c2f (incorrect -> 0xb2a2), seq 951037618, win 64860, options [mss 1410,sackOK,TS val 2405226400 ecr 0,nop,wscale 7], length 0
Then I did another capture on the bridge vmbr0, to which the physical interface enp7s0 is attached:
# tcpdump -i vmbr0 -nn -s0 -v -l host 140.82.121.3 [16:55:53]
tcpdump: listening on vmbr0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
16:56:00.101072 IP (tos 0x0, ttl 62, id 45737, offset 0, flags [DF], proto TCP (6), length 60)
162.55.245.135.6978 > 140.82.121.3.443: Flags [S], cksum 0x9d43 (incorrect -> 0xd9cf), seq 4022764557, win 64860, options [mss 1410,sackOK,TS val 1873400693 ecr 0,nop,wscale 7], length 0
16:56:01.110397 IP (tos 0x0, ttl 62, id 45738, offset 0, flags [DF], proto TCP (6), length 60)
162.55.245.135.6978 > 140.82.121.3.443: Flags [S], cksum 0x9d43 (incorrect -> 0xd5de), seq 4022764557, win 64860, options [mss 1410,sackOK,TS val 1873401702 ecr 0,nop,wscale 7], length 0
16:56:03.126453 IP (tos 0x0, ttl 62, id 45739, offset 0, flags [DF], proto TCP (6), length 60)
162.55.245.135.6978 > 140.82.121.3.443: Flags [S], cksum 0x9d43 (incorrect -> 0xcdfe), seq 4022764557, win 64860, options [mss 1410,sackOK,TS val 1873403718 ecr 0,nop,wscale 7], length 0
As a last step, I did a capture on enp7s0:
# tcpdump -i enp7s0 -nn -s0 -v -l host 140.82.121.4 [16:59:11]
tcpdump: listening on enp7s0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
16:59:18.536394 IP (tos 0x0, ttl 62, id 54222, offset 0, flags [DF], proto TCP (6), length 60)
162.55.245.135.28100 > 140.82.121.4.443: Flags [S], cksum 0x9d44 (incorrect -> 0x7438), seq 2184615323, win 64860, options [mss 1410,sackOK,TS val 2405638866 ecr 0,nop,wscale 7], length 0
16:59:18.998438 IP (tos 0x0, ttl 62, id 17537, offset 0, flags [DF], proto TCP (6), length 76)
162.55.245.135.35716 > 140.82.121.4.443: Flags [FP.], cksum 0x9d54 (incorrect -> 0x583c), seq 3605376996:3605377020, ack 3707823094, win 2345, options [nop,nop,TS val 2405639328 ecr 2485731493], length 24
16:59:19.546323 IP (tos 0x0, ttl 62, id 54223, offset 0, flags [DF], proto TCP (6), length 60)
162.55.245.135.28100 > 140.82.121.4.443: Flags [S], cksum 0x9d44 (incorrect -> 0x7046), seq 2184615323, win 64860, options [mss 1410,sackOK,TS val 2405639876 ecr 0,nop,wscale 7], length 0
16:59:21.558342 IP (tos 0x0, ttl 62, id 54224, offset 0, flags [DF], proto TCP (6), length 60)
162.55.245.135.28100 > 140.82.121.4.443: Flags [S], cksum 0x9d44 (incorrect -> 0x686a), seq 2184615323, win 64860, options [mss 1410,sackOK,TS val 2405641888 ecr 0,nop,wscale 7], length 0
So the traffic is passing through flannel to the internal bridge, on to the external bridge, and even out the physical network adapter. All packets state something like cksum X (incorrect -> Y); can you tell what that means?
Every 3rd to 5th attempt running wget "https://github.com/samdoran/demo-playbooks" while doing the capture, the download succeeds. I repeated the test in this case to document the issue.
I can resolve hostnames and ping public hosts at all times without a single lost packet.
The wget command succeeds on the VM / k3s node and on the host every single time.
So packets leave the host correctly. What about the reply? Could you check that too?
Could you check the same wget at all levels? Are you only getting problems when the client is in the pod?
I did run tcpdump on the different levels (host and VM) and interfaces with the following command:
tcpdump -i <interface> -nn -s0 -v -l host <container-ip>
From my understanding, host filters on the IP address no matter whether it is the destination or the source. So this command should include the reply, since the destination would be the container IP, correct?
I can confirm that the command wget "https://github.com/samdoran/demo-playbooks" executes on the host and the VM with no noticeable delay, and the HTML page from github.com is downloaded as expected. From within the busybox container it looks like this:
# kubectl exec -it busybox -- sh
/ # wget "https://github.com/samdoran/demo-playbooks"
Connecting to github.com (140.82.121.3:443)
wget: can't connect to remote host (140.82.121.3): Operation timed out
Every once in a while the wget command works from within the container the same way it does from the VM and the host, but most of the time it fails with a timeout.
At the same time, I can ping github.com from within the container reliably, with the same round-trip time as on the VM and the host.
This is correct. But in your logs, I can only see the packets egressing the pod at the different levels. Does that mean you don't see reply packets? I want to understand whether the problem is non-existent reply packets or reply packets disappearing at some point. It is possible that GitHub is changing its source IP, so it might be better to filter on the port.
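For instance, a capture keyed on the TCP port instead of the pod IP would show both directions regardless of which GitHub address answers (the interface name here is just a placeholder):
tcpdump -i eth0 -nn -s0 -v -l 'tcp port 443'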
Can you check that there are no other hosts in your network with the same IP as your pod?
Thank you @manuelbuil,
in fact, all I see is what I've posted in https://github.com/k3s-io/k3s/issues/5349#issuecomment-1086022219. But every now and then, when the command finishes successfully, I can see the reply packets in the tcpdump as well.
Most probably github.com does change its source IP; I get different IPs returned when I resolve github.com with DNS as well. But since the tcpdump filter is set to the IP of the container, that should match the destination address at the last hop. This is also visible in the cases where the command does succeed after a few unsuccessful attempts.
To my knowledge there should be no other host sharing the IP address of this pod. But I wanted to verify this, because I have another VM on this host running docker containers, so it might theoretically be possible.
I tried to check this with traceroute, but if there is a better way, please let me know.
On the VM:
# tracepath -4 10.42.0.31
1?: [LOCALHOST] pmtu 1450
1: 10.42.0.31 0.046ms reached
1: 10.42.0.31 0.014ms reached
Resume: pmtu 1450 hops 1 back 1
This is what I expected: there is just one hop when I access the container.
When I do this on the host:
# traceroute -4 10.42.0.31
traceroute to 10.42.0.31 (10.42.0.31), 30 hops max, 60 byte packets
1 100.91.xx.yy (100.91.xx.yy) 0.486 ms 0.518 ms 0.510 ms
2 core24.fsn1.hetzner.com (213.239.229.109) 0.239 ms 0.592 ms 0.635 ms
3 core12.nbg1.hetzner.com (213.239.203.121) 2.937 ms core11.nbg1.hetzner.com (213.239.245.225) 2.606 ms core12.nbg1.hetzner.com (213.239.203.121) 2.975 ms
4 * * *
5 * * *
6 * * *
7 * * *
8 * * *
...
This also seems fine to me, because there is no route matching the target address, so the default route is used.
I did the same check with tracepath -4 10.42.0.31 on the other VM running docker containers, and the default gateway was traced the same way.
So I think this proves that there is no IP address duplication issue at this point.
So, just to recap: ~3/5 times when you try curling github.com (or making any HTTP request), you see packets leaving your pod, traversing the whole networking stack correctly, and egressing the physical NIC (enp7s0). However, you don't see any reply on your physical NIC, right?
The 2/5 times when it works, do you also see the message "cksum 0x9d54 (incorrect -> 0x583c)" in the tcpdump output? I don't think this is relevant, but let's check just in case.
Yes, that's correct.
I decided to use another URL for my tests because it is hard to capture the connection to github.com when the IP address changes and different IPs respond.
I captured the full output of tcpdump -nn -s0 -v -l host 193.99.144.85 > dump.txt while the HTTP request wget "https://www.heise.de" -O - succeeded.
I have attached the file, because it is quite long.
I can also see these cksum incorrect entries in the dump of the successful request.
One explanation I found is that this is normal because of checksum offloading to the network card, but I can neither confirm nor rule that out. It doesn't seem to have an effect on this particular issue, though.
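For what it's worth, the cksum incorrect note on outgoing packets is usually just TX checksum offload: tcpdump sees the packet before the NIC fills in the checksum. A way to inspect (and, purely as an experiment, disable) offloads with ethtool might look like the sketch below; the interface names are examples, and disabling offload is only a diagnostic step, not a confirmed fix for this issue:
# Show the current checksum offload settings on the VM's interface
ethtool -k eth0 | grep -i checksum
# Temporarily disable TX checksum offload as an experiment
ethtool -K eth0 tx off
# If flannel uses a VXLAN device (e.g. flannel.1), the same check applies there
ethtool -k flannel.1 | grep -i checksum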
It is indeed strange. If the HTTP reply packets sometimes never reach enp7s0, it seems to me it might be more of a problem in the network. Does wget/curl from the VM work in 100% of the cases?
Yes it is strange, at least I'm facing the limits of my network analytic skills.
On the same host several VMs are running, including mail servers, web servers, and monitoring instances; the system is fully monitored with service checks, including network latency checks, and there are no known issues.
I've done a hundred successful wget requests from the VM where the k3s instance is running:
SUCCESS=0; for i in `seq 1 100`; do wget "https://www.heise.de" -O - &> /dev/null && SUCCESS=$((SUCCESS+1)); sleep 1; done; echo $SUCCESS
100
I added the sleep 1 just to not get banned by the web server.
This test finishes with the same results from the host.
I have no idea what I can try next.
There are some posts on the internet complaining about the same or at least very similar issues, for example:
https://discuss.kubernetes.io/t/kubernetes-pods-do-not-have-internet-access/5252
The topic seems to be still unresolved, ending with a user stating that switching from flannel to calico sorted out his issues.
I can't say much about that because I've been using docker-compose for quite a while, but I don't have a lot of experience with Kubernetes, k3s, and the different CNIs.
I may add, that this is a test node that I've set up to test one application in particular. After facing this issue for the first time I completely wiped the VM and created a fresh one to eliminate possible config issues. But the issue remains the same.
There are several things which are very strange:
Let's try one extra thing: create a busybox pod with hostNetwork: true. This way, the network namespace of the pod will be the same as the host's; in other words, we completely bypass the CNI plugin. If this works, we can build the flannel network stack step by step, testing at each step (I can explain how to do it on Monday ;) )
Thanks again @manuelbuil, that's a good plan.
I tried to bring up a pod with hostNetwork: true using the following commands:
kubectl run busybox2 --image=alpine --overrides='{"kind":"Pod", "apiVersion":"v1", "spec": {"hostNetwork": true}}' --command -- sh -c 'echo Hello K3S! && sleep 3600'
kubectl exec -it busybox2 -- sh
I entered the pod/container and verified that I do in fact have host networking.
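For reference, one quick way to confirm that the pod really shares the host's network namespace (using the pod name from above; the jsonpath check is just one option):
kubectl get pod busybox2 -o jsonpath='{.spec.hostNetwork}'   # should print: true
kubectl exec busybox2 -- ip addr                             # should list the host's interfaces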
Then I did the same wget test with 100 tries:
SUCCESS=0; for i in `seq 1 100`; do wget "https://www.heise.de" -O - &> /dev/null && SUCCESS=$((SUCCESS+1)); sleep 1; done; echo $SUCCESS
100
So with hostNetwork: true all requests pass without issues, the same way they do from the VM and the host.
So this issue must somehow be related to flannel one way or the other. Maybe a configuration issue, or a bug that happens only under specific circumstances.
Ok. Second test: we will build network namespaces by hand and use the flannel infrastructure. Pick two IP addresses in the same range as your cluster-cidr (I think in your case it is 10.164.12.0/26, but please verify with kubectl get nodes -o yaml | grep podCIDR). Pick two IP addresses which are not used, and then:
# Create network namespaces
sudo ip netns add ns1
sudo ip netns add ns2
# Create veth interfaces
sudo ip link add ns1-namespaceIf type veth peer name ns1-rootIf
sudo ip link set ns1-namespaceIf up
sudo ip link set ns1-rootIf up
sudo ip link add ns2-namespaceIf type veth peer name ns2-rootIf
sudo ip link set ns2-namespaceIf up
sudo ip link set ns2-rootIf up
# Add interfaces in namespaces
sudo ip link set ns1-namespaceIf netns ns1
sudo ip link set ns2-namespaceIf netns ns2
# Make sure ipv4 forwarding works
cat /proc/sys/net/ipv4/conf/all/forwarding
# Add interfaces to bridge
sudo brctl addif cni0 ns1-rootIf
sudo brctl addif cni0 ns2-rootIf
# Add ip to the interfaces
sudo ip netns exec ns1 ip addr add dev ns1-namespaceIf $IP_ADDRESS_YOU_PICKED/24
sudo ip netns exec ns2 ip addr add dev ns2-namespaceIf $IP2_ADDRESS_YOU_PICKED/24
# Add routes
sudo ip netns exec ns1 ip r add default via $IP_INTERFACE_CNI0
sudo ip netns exec ns2 ip r add default via $IP_INTERFACE_CNI0
# Ping should work
sudo ip netns exec ns2 ping $IP_ADDRESS_YOU_PICKED/24
sudo ip netns exec ns1 ping $IP2_ADDRESS_YOU_PICKED/24
# Ping to external should work too
sudo ip netns exec ns2 ping 8.8.8.8
sudo ip netns exec ns1 ping 8.8.8.8
Then try to run the http/https traffic from within the namespaces (sudo ip netns exec curl .....)
Thank you @manuelbuil!
I've verified and set the following variables:
export IP_ADDRESS_YOU_PICKED=10.42.0.101
export IP2_ADDRESS_YOU_PICKED=10.42.0.102
export IP_INTERFACE_CNI0=10.42.0.1
Then I executed all commands from your post in order.
Everything went fine up to the Add routes section:
# Add routes
sudo ip netns exec ns1 ip r add default via $IP_INTERFACE_CNI0
Error: Nexthop has invalid gateway.
Can you tell what the error message means? Is this already an indication of the network issue that causes my problems?
You are right. I thought that when moving interfaces around they would stay up, but apparently they do not. Execute these first:
sudo ip netns exec ns1 ip link set ns1-namespaceIf up
sudo ip netns exec ns2 ip link set ns2-namespaceIf up
That should create the routes (check with sudo ip netns exec ns1 ip r) so that the namespace knows how to reach 10.42.0.1.
After setting and verifying the routes, I entered the commands you listed under Ping should work. They didn't run at first, but after removing the subnet suffix /24 they did:
# sudo ip netns exec ns2 ping $IP_ADDRESS_YOU_PICKED
PING 10.42.0.101 (10.42.0.101) 56(84) bytes of data.
64 bytes from 10.42.0.101: icmp_seq=1 ttl=64 time=0.075 ms
64 bytes from 10.42.0.101: icmp_seq=2 ttl=64 time=0.028 ms
^C
--- 10.42.0.101 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1022ms
rtt min/avg/max/mdev = 0.028/0.051/0.075/0.023 ms
# sudo ip netns exec ns1 ping $IP2_ADDRESS_YOU_PICKED
PING 10.42.0.102 (10.42.0.102) 56(84) bytes of data.
64 bytes from 10.42.0.102: icmp_seq=1 ttl=64 time=0.045 ms
64 bytes from 10.42.0.102: icmp_seq=2 ttl=64 time=0.056 ms
64 bytes from 10.42.0.102: icmp_seq=3 ttl=64 time=0.033 ms
^C
--- 10.42.0.102 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2031ms
rtt min/avg/max/mdev = 0.033/0.044/0.056/0.009 ms
This is correct so far, right?
The pings to external (8.8.8.8) succeeded, too.
But I can't figure out how to execute the connection tests with wget from within this environment:
# sudo ip netns exec wget "https://www.heise.de" -O -
Cannot open network namespace "wget": No such file or directory
wget is installed on the VM (and curl as well), but it doesn't work anyway.
You forgot to tell which network namespace you want to use (ns1 or ns2), so your OS thinks that wget is the name of the namespace. The command should be:
sudo ip netns exec ns1 wget "https://www.heise.de" -O -
Ok, thank you.
Here is what I got:
# sudo ip netns exec ns1 wget "https://www.heise.de" -O -
--2022-04-13 07:53:20-- https://www.heise.de/
Resolving www.heise.de (www.heise.de)... failed: Temporary failure in name resolution.
wget: unable to resolve host address ‘www.heise.de’
Same thing for ns2. Is there some setup regarding DNS required for the network namespaces?
Since I can ping DNS servers (like 8.8.8.8), it looks like there is no reachable nameserver defined inside the network namespace.
On the VM, DNS works fine.
When I enter the network namespace ns1 and check /etc/resolv.conf
# ip netns exec ns1 bash
# grep -v "#" /etc/resolv.conf
nameserver 127.0.0.53
options edns0 trust-ad
it shows the same result as from within the VM. That doesn't come as a surprise to me.
But I can't ping the IP 127.0.0.53 from the network namespace ns1, which I think is the reason why DNS is not working.
The IP 127.0.0.53 is related to the way systemd-resolved works under Ubuntu 20.04.4 LTS.
Is it required to somehow route this IP from the network namespace? Or is it possible to define another DNS server for the network namespace?
That's weird. In any case, can you try curling the IP directly so that it does not need to resolve?
I could reproduce the problem and it is related to systemd-resolved. If you stop that service and change the nameserver in /etc/resolv.conf from 127.0.0.53 to something different (e.g. 8.8.8.8), it should resolve correctly.
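As an aside, ip netns exec bind-mounts files from /etc/netns/<name>/ over their /etc/ counterparts inside the namespace, so a per-namespace resolver can also be set without touching the host's /etc/resolv.conf. A sketch, with 8.8.8.8 only as an example resolver:
sudo mkdir -p /etc/netns/ns1
echo "nameserver 8.8.8.8" | sudo tee /etc/netns/ns1/resolv.conf
sudo ip netns exec ns1 wget "https://www.heise.de" -O -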
# time sudo ip netns exec ns1 wget --no-check-certificate https://193.99.144.85 -O -
--2022-04-13 15:43:22-- https://193.99.144.85/
Connecting to 193.99.144.85:443... failed: Connection timed out.
Retrying.
--2022-04-13 15:45:34-- (try: 2) https://193.99.144.85/
Connecting to 193.99.144.85:443... failed: Connection refused.
real 2m33.215s
user 0m0.003s
sys 0m0.004s
Same thing for ns2.
If I repeat the command, every now and then it succeeds.
The behaviour is exactly the same as from within the pod/container, apart from the DNS issue.
I could reproduce the problem and it is something related to systemd-resolved. If you stop that service and in /etc/resolv.conf change the nameserver from 127.0.0.53 to something different (e.g. 8.8.8.8), it should resolve correctly
Yes that works. However, the initial issue that the HTTPS request fails persists.
What a strange issue...
Ok, let's try another approach to get rid of some components and make sure they are not introducing anything weird. First, remove the namespaces with ip netns del ns1 and ip netns del ns2. Then:
# Create network namespace
sudo ip netns add ns1
# Create veth interfaces
sudo ip link add ns1-namespaceIf type veth peer name ns1-rootIf
sudo ip link set ns1-namespaceIf up
sudo ip link set ns1-rootIf up
# Add interfaces in namespaces
sudo ip link set ns1-namespaceIf netns ns1
# Make sure ipv4 forwarding works
cat /proc/sys/net/ipv4/conf/all/forwarding
cat /proc/sys/net/ipv4/conf/eth0/forwarding
# Enable proxy_arp
sudo echo 1 > /proc/sys/net/ipv4/conf/ns1-rootIf/proxy_arp
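# Note: the redirect above only takes effect if this is run from a root shell; when run
# via sudo from a normal user, the shell (not sudo) performs the redirect, so an
# equivalent would be: echo 1 | sudo tee /proc/sys/net/ipv4/conf/ns1-rootIf/proxy_arp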
# Add ip to the interfaces
sudo ip netns exec ns1 ip addr add dev ns1-namespaceIf 192.168.0.10/26
# Add namespace route
sudo ip netns exec ns1 ip link set ns1-namespaceIf up
sudo ip netns exec ns1 ip r add 169.254.1.1 dev ns1-namespaceIf
sudo ip netns exec ns1 ip r add default via 169.254.1.1 dev ns1-namespaceIf
# Set the routes
sudo ip r add 192.168.0.10/32 dev ns1-rootIf
## Access to the internet
sudo iptables -t nat -A POSTROUTING -o eth0 -s 192.168.0.10 -j MASQUERADE
sudo ip netns exec ns1 ping 8.8.8.8
And if that works, try the curl / wget again please
The commands were executed without issues, here is a recap:
~# sudo ip netns add ns1
~# sudo ip link add ns1-namespaceIf type veth peer name ns1-rootIf
~# sudo ip link set ns1-namespaceIf up
~# sudo ip link set ns1-rootIf up
~# sudo ip link set ns1-namespaceIf netns ns1
~# cat /proc/sys/net/ipv4/conf/all/forwarding
1
~# cat /proc/sys/net/ipv4/conf/eth0/forwarding
1
~# cat /proc/sys/net/ipv4/conf/ns1-rootIf/proxy_arp
0
~# sudo echo 1 > /proc/sys/net/ipv4/conf/ns1-rootIf/proxy_arp
~# cat /proc/sys/net/ipv4/conf/ns1-rootIf/proxy_arp
1
~# sudo ip netns exec ns1 ip addr add dev ns1-namespaceIf 192.168.0.10/26
~# sudo ip netns exec ns1 ip link set ns1-namespaceIf up
~# sudo ip netns exec ns1 ip r add 169.254.1.1 dev ns1-namespaceIf
~# sudo ip netns exec ns1 ip r add default via 169.254.1.1 dev ns1-namespaceIf
~# sudo ip r add 192.168.0.10/32 dev ns1-rootIf
~# sudo iptables -t nat -A POSTROUTING -o eth0 -s 192.168.0.10 -j MASQUERADE
~# sudo ip netns exec ns1 ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=58 time=88.3 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=58 time=5.11 ms
^C
--- 8.8.8.8 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 5.114/46.706/88.299/41.592 ms
And the command
sudo ip netns exec ns1 wget --no-check-certificate https://193.99.144.85 -O -
succeeds every single time. I tried at least 20 times in a row. So the issue is not happening with this setup.
But what can we take away from this? What are the components we've left out?
There are three differences:
1. In the first case, there is a bridge which does L2 forwarding; in this case, everything is L3.
2. In the first case, flannel is taking care of the masquerade; in this case, we are creating those rules ourselves.
3. The IP range is different.
Let's try to set up an environment which uses a bridge but does not use the flannel masquerading or the flannel IP range. First let's remove what you currently deployed:
sudo ip r del 192.168.0.10/32 dev ns1-rootIf
sudo iptables -t nat -D POSTROUTING -o eth0 -s 192.168.0.10 -j MASQUERADE
sudo ip netns del ns1
Now, the new set-up:
sudo brctl addbr mybr
sudo ip addr add dev mybr 192.168.0.1/26
sudo ip link set mybr up
sudo ip netns exec ns1 ip addr add dev ns1-namespaceIf 192.168.0.10/26
sudo ip netns exec ns1 ip link set ns1-namespaceIf up
sudo ip netns exec ns1 ip r add default via 192.168.0.1 dev ns1-namespaceIf
sudo iptables -t nat -A POSTROUTING -o eth0 -s 192.168.0.10 -j MASQUERADE
sudo ip netns exec ns1 ping 8.8.8.8
Then try the curl/wget
If that works, then we know that the problem is not the bridge but either the IP range or iptables. Probably the former. Could you provide the output of ip r too, please?
The command
~# sudo ip netns exec ns1 ip addr add dev ns1-namespaceIf 192.168.0.10/26
Cannot open network namespace "ns1": No such file or directory
fails.
Should I do a
~# sudo ip netns add ns1
before this command? Is anything else required?
Here is the requested output:
~# ip r
default via 10.164.12.254 dev eth0 proto static
10.42.0.0/24 dev cni0 proto kernel scope link src 10.42.0.1
10.164.12.0/24 dev eth0 proto kernel scope link src 10.164.12.6
192.168.0.0/26 dev mybr proto kernel scope link src 192.168.0.1
This is after I've applied the three first commands from your post.
I also have a "perhaps-similar-issue" with my k3s worker node in a VM in Contabo. This happens when doing a POST to gitlab.com but based on the issue, this will happen with any outgoing network access:
ERROR: Registering runner... failed runner=GR134894 status=couldn't execute POST against https://gitlab.com/api/v4/runners: Post "https://gitlab.com/api/v4/runners": dial tcp: i/o timeout
PANIC: Failed to register the runner.
Using k3s v1.22.7.
This bug only happens with pods on that node in the Contabo VM. However, if I reschedule the pod to the other node, which is a Hetzner instance, all networking works fine.
Yes, sorry:
sudo ip netns add ns1
sudo ip link add ns1-namespaceIf type veth peer name ns1-rootIf
sudo ip link set ns1-namespaceIf up
sudo ip link set ns1-rootIf up
sudo ip link set ns1-namespaceIf netns ns1
sudo brctl addbr mybr
sudo ip addr add dev mybr 192.168.0.1/26
sudo ip link set mybr up
sudo ip netns exec ns1 ip addr add dev ns1-namespaceIf 192.168.0.10/26
sudo ip netns exec ns1 ip link set ns1-namespaceIf up
sudo ip netns exec ns1 ip r add default via 192.168.0.1 dev ns1-namespaceIf
sudo iptables -t nat -A POSTROUTING -o eth0 -s 192.168.0.10 -j MASQUERADE
sudo ip netns exec ns1 ping 8.8.8.8
this bug only happens with pods inside that node in Contabo VM. However, if I reschedule the pod to the other node which is in Hetzner's instance, all network is working fine.
Can you try to run a plain alpine image and try to do a wget / curl request from there manually, just to make sure that this is not related to a specific pod?
Thanks again @manuelbuil,
I was able to execute the commands you listed without any errors or warnings:
~# sudo ip netns add ns1
~# sudo ip link add ns1-namespaceIf type veth peer name ns1-rootIf
~# sudo ip link set ns1-namespaceIf up
~# sudo ip link set ns1-rootIf up
~# sudo ip link set ns1-namespaceIf netns ns1
~# sudo brctl addbr mybr
~# sudo ip addr add dev mybr 192.168.0.1/26
~# sudo ip link set mybr up
~# sudo ip netns exec ns1 ip addr add dev ns1-namespaceIf 192.168.0.10/26
~# sudo ip netns exec ns1 ip link set ns1-namespaceIf up
~# sudo ip netns exec ns1 ip r add default via 192.168.0.1 dev ns1-namespaceIf
~# sudo iptables -t nat -A POSTROUTING -o eth0 -s 192.168.0.10 -j MASQUERADE
~# sudo ip netns exec ns1 ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
But the ping command from ns1 hangs forever; at least no timeout occurred within 15 minutes of waiting.
The fact that ping does not work is very different from what I've experienced from within the pods: there I had issues with HTTPS requests, while ping was working fine all the time.
I think there is some other issue preventing ping from working now, but I don't have any idea what it may be.
@apiening I sidestepped the problem. The current configuration works. So something is wrong with flannel when it's joining a cluster in a different cloud.
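For the cross-cloud case, one thing worth ruling out (my assumption, not something established in this thread) is the overlay itself: with k3s' default flannel vxlan backend, node-to-node pod traffic is encapsulated in VXLAN over UDP 8472, so that port must be reachable between the clouds and the encapsulation overhead must still fit the path MTU. A quick look at the device flannel creates:
# show the VXLAN device (flannel.1 under the default vxlan backend) with its MTU and VXLAN details
ip -d link show flannel.1
# compare against the MTU of the underlying uplink
ip link show eth0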
We forgot to add the interface to the bridge 🤦
sudo brctl addif mybr ns1-rootIf
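As a side note, whether the veth leg is actually attached to the bridge can be verified directly; a generic check, not part of the original exchange:
# list the bridge and its member interfaces; ns1-rootIf should appear under mybr after the addif
brctl show mybr
# equivalent with iproute2
bridge link show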
You're right, it is working fine now:
~# sudo brctl addif mybr ns1-rootIf
~# sudo ip netns exec ns1 ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=58 time=5.11 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=58 time=5.10 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=58 time=5.13 ms
64 bytes from 8.8.8.8: icmp_seq=4 ttl=58 time=5.16 ms
64 bytes from 8.8.8.8: icmp_seq=5 ttl=58 time=5.17 ms
64 bytes from 8.8.8.8: icmp_seq=6 ttl=58 time=5.09 ms
^C
--- 8.8.8.8 ping statistics ---
6 packets transmitted, 6 received, 0% packet loss, time 5006ms
rtt min/avg/max/mdev = 5.085/5.124/5.167/0.029 ms
The command
sudo ip netns exec ns1 wget --no-check-certificate https://193.99.144.85 -O -
works perfectly fine as well!
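Once the experiment is done, the scratch namespace, bridge, and NAT rule can be removed again; a sketch based on the exact names used above:
# delete the test namespace (this also tears down the veth pair)
sudo ip netns del ns1
# remove the root-side veth in case it is still around
sudo ip link del ns1-rootIf
# take down and delete the test bridge
sudo ip link set mybr down
sudo brctl delbr mybr
# drop the masquerade rule that was added for the test
sudo iptables -t nat -D POSTROUTING -o eth0 -s 192.168.0.10 -j MASQUERADE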
The current configuration works. So something is wrong with flannel when it's joining a cluster in a different cloud.
This sounds like a different issue; at least I can't see the correlation to what I observe. I'm using a very basic single-node setup with k3s on a clean Ubuntu 20.04 VM. But once (or I should say if) we figure out what causes this issue where ping works from inside the pod while HTTPS does not, it may be interesting to check whether it applies to your issue as well.
Also in my previous case - however the pod doesn't have internet access
Can you please check specifically whether that works for you or not? If you pull a basic alpine image to do these tests, it would be independent from the other pods.
Ok. Would you be able to deploy k3s but using another IP range? For example: cluster-cidr: 192.168.0.0/16?
You mean on another VM as a test?
How can I select a different cluster-cidr when I deploy k3s? Or could I change the cluster-cidr of the existing deployment?
Thinking of that: changing the cluster-cidr and nothing else would only have an effect if I'm facing a routing or address duplication issue with the currently active cluster-cidr, right? Isn't the ping test proof that the communication generally works and that there is no IP address issue? We've also excluded DNS by using direct IP addresses. That's why I'm a little bit sceptical that changing the cluster-cidr would change much.
But I can do it anyway. I just don't know how to do it.
Yes, in another VM as a test. Before deploying k3s, create the directory /etc/rancher/k3s/ and place a config.yaml file there with the content cluster-cidr: 192.168.0.0/16. Then deploy k3s and it should use that CIDR.
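Put together, the steps would look roughly like this (a sketch of the instructions above, nothing beyond what was already described):
# create the k3s config directory and the config file with the alternate pod CIDR
sudo mkdir -p /etc/rancher/k3s
echo 'cluster-cidr: 192.168.0.0/16' | sudo tee /etc/rancher/k3s/config.yaml
# install k3s; it picks up /etc/rancher/k3s/config.yaml automatically
curl -sfL https://get.k3s.io | sh -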
I have created a new VM based on Ubuntu 20.04 and created the config file:
~# cat /etc/rancher/k3s/config.yaml
cluster-cidr: 192.168.0.0/16
Then I've deployed k3s with curl -sfL https://get.k3s.io | sh - and checked the CIDR:
~# kubectl describe node | grep PodCIDR
PodCIDR: 192.168.0.0/24
PodCIDRs: 192.168.0.0/24
I've created a test pod with kubectl run busybox --image=alpine --command -- sh -c 'echo Hello K3S! && sleep 3600' and did the wget test:
~# kubectl exec -ti busybox -- sh
/ # wget --no-check-certificate https://193.99.144.85 -O -
Connecting to 193.99.144.85 (193.99.144.85:443)
wget: can't connect to remote host (193.99.144.85): Connection refused
So same thing as with the previous install (and the one before). I can ping local and external hosts without issues.
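Since ping works while the TCP connection to port 443 fails, one way to narrow this down further (my suggestion, not something requested in the thread) is to capture on the host while repeating the failing wget and compare what the pod sends with what actually leaves the uplink:
# watch the host's uplink for traffic to the test IP while the wget runs inside the pod
sudo tcpdump -ni eth0 host 193.99.144.85 and tcp port 443
# and the same on the pod bridge, to see what the pod itself sends
sudo tcpdump -ni cni0 host 193.99.144.85 and tcp port 443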
Hi @apiening, were you able to fix this issue? Please let us know if you were. I'm also facing the same issue.
Thanks
Hi @ashissharma97, unfortunately even after trying a lot of things (including another clean install) the issue persists. Any progress on this issue will be documented here.
Environmental Info: K3s Version:
Host OS Version:
IP Forwarding:
Node(s) CPU architecture, OS, and Version:
Cluster Configuration: Single node.
Describe the bug: I cannot connect to the internet from within the pod / container:
Steps To Reproduce: Install a one-node k3s cluster with curl -sfL https://get.k3s.io | sh on an Ubuntu 20.04 VM. Set up a simple workload (in my case AWX - https://github.com/ansible/awx-operator#basic-install-on-existing-cluster). Enter a container and try to access the internet (for example with curl on a public address).
Expected behavior: Accessing the internet should work the same way as it does from the host.
Actual behavior: No connectivity to the internet from the pod / container at all.
Additional context / logs: