networkop / meshnet-cni

a (K8s) CNI plugin to create arbitrary virtual network topologies
BSD 3-Clause "New" or "Revised" License

PODs connected by veth in the same node #75

Closed yennym3 closed 1 year ago

yennym3 commented 1 year ago

I am deploying a topology with meshnet in which the containers are VM-based. Both pods are on the same node, so a veth pair is created to link them. However, when I test connectivity, the pods cannot reach each other. What could be causing this?

PodR1:
60: eth2@if61: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 12:96:dc:06:20:6b brd ff:ff:ff:ff:ff:ff link-netnsid 2
    inet 192.168.0.1/24 scope global eth2
       valid_lft forever preferred_lft forever
    inet6 fe80::1096:dcff:fe06:206b/64 scope link

PodR2:
61: eth2@if60: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether c6:d9:e4:c7:f5:fb brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet 192.168.0.2/24 scope global eth2
       valid_lft forever preferred_lft forever
    inet6 fe80::c4d9:e4ff:fec7:f5fb/64 scope link
       valid_lft forever preferred_lft forever

root@r1:/# ping 192.168.0.2
PING 192.168.0.2 (192.168.0.2): 56 data bytes
92 bytes from kne (192.168.0.1): Destination Host Unreachable
92 bytes from kne (192.168.0.1): Destination Host Unreachable
92 bytes from kne (192.168.0.1): Destination Host Unreachable
92 bytes from kne (192.168.0.1): Destination Host Unreachable

kingshukdev commented 1 year ago

"VM based containers": does that mean you are running a VM inside a container? If so, can you take two ubuntu containers (docker pull ubuntu) and check whether they can ping each other?

yennym3 commented 1 year ago

Yes. I have tested with two native containers and there is connectivity, but with the containers that wrap a VM there is no connectivity over the veth.

kingshukdev commented 1 year ago

OK, native containers work. I have never played with "VM wrapped containers", so I can't suggest anything readily. Meshnet is able to create the interfaces, so what we can check relatively easily is whether the tx count of the interface increases when you ping. I am assuming you are pinging from the container and not from the in-container VM.
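One way to do the tx-count check suggested above is to read the interface counter from sysfs before and after the ping. This is a minimal sketch and not part of meshnet: the helper names `tx_packets` and `tx_increased` are made up for illustration, and the optional second argument (an alternative sysfs root) is only there so the helper can be tried without a real interface.

```shell
#!/bin/sh
# Hypothetical helper: print the tx_packets counter of an interface.
# $1 = interface name, $2 = sysfs root (assumed default: /sys/class/net).
tx_packets() {
    cat "${2:-/sys/class/net}/$1/statistics/tx_packets"
}

# Hypothetical helper: succeed if the second reading is larger than the first.
tx_increased() {
    [ "$2" -gt "$1" ]
}

# Example, run inside the pod:
#   before=$(tx_packets eth2)
#   ping -c 3 192.168.0.2
#   after=$(tx_packets eth2)
#   tx_increased "$before" "$after" && echo "eth2 is transmitting"
```

A growing counter only shows that packets leave the interface; it says nothing about whether the peer receives or answers them.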

yennym3 commented 1 year ago

Yes, the tx count of the interface increases, and yes, the ping is being done from the container.

networkop commented 1 year ago

@yennym3 do you know how the VMs are attached to the container interfaces? AFAIK the most common ways to do that are intermediate bridges and tc-redirect, so the problem may be in one of those places.

yennym3 commented 1 year ago

@networkop Yes, I am using tc-redirect to connect the virtual machine interfaces to the container interfaces, with these rules:

tc qdisc add dev eth2 ingress
tc filter add dev eth2 parent ffff: protocol all u32 match u8 0 0 action mirred egress redirect dev tap2

tc qdisc add dev tap2 ingress
tc filter add dev tap2 parent ffff: protocol all u32 match u8 0 0 action mirred egress redirect dev eth2

The tap2 interface is created in the container with an IP address that belongs to the subnet of the interface associated with the VM:

3: tap2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc fq_codel state UNKNOWN group default qlen 1000
    link/ether 22:d7:cb:c4:4c:ee brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.1/24 scope global tap2
       valid_lft forever preferred_lft forever
    inet6 fe80::20d7:cbff:fec4:4cee/64 scope link
       valid_lft forever preferred_lft forever
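The redirect rules quoted in this comment are symmetric, so they can be generated for any interface pair. The sketch below only prints the same four tc commands, parameterized by interface name. The function names `redirect_cmds` and `cross_connect_cmds` are hypothetical; applying the output needs root inside the pod's network namespace.

```shell
#!/bin/sh
# Print (do not run) the tc commands that redirect everything arriving
# on interface $1 out of interface $2, using the rules from this thread.
redirect_cmds() {
    echo "tc qdisc add dev $1 ingress"
    echo "tc filter add dev $1 parent ffff: protocol all u32 match u8 0 0 action mirred egress redirect dev $2"
}

# Print the rules for both directions of a veth <-> tap cross-connect.
cross_connect_cmds() {
    redirect_cmds "$1" "$2"
    redirect_cmds "$2" "$1"
}

# Example: review the commands first, then pipe them to a shell to apply.
#   cross_connect_cmds eth2 tap2
#   cross_connect_cmds eth2 tap2 | sh -e
```

Printing the commands first makes it easy to compare them with what is already installed (`tc filter show dev eth2 parent ffff:`) before changing anything.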

networkop commented 1 year ago

I think with this setup you're not supposed to be able to ping from within the container's namespace, as packets get redirected to the tap device. You also don't need an IP on the tap interface. You need to assign an IP to the interface inside the VM that corresponds to the tap, and try pinging the other VM from inside the VM.

yennym3 commented 1 year ago

Yes, I have assigned an IP to the VM interface, but when pinging from this interface to an interface of the other VM there is no connectivity. That is why I thought it was related to the lack of container-to-container connectivity between the pods.

networkop commented 1 year ago

most likely the issue is somewhere in the VMs. you can run tcpdump on the veth or tap interfaces to see the packets going back and forth. as I've mentioned previously, with tc-redirect you won't be able to ping between container interfaces, since tc will redirect the packets to a tap interface before it hits the local IP stack.

yennym3 commented 1 year ago

@networkop I am pinging from the VM1 interface (192.168.1.2) to the VM2 interface (192.168.2.2). I see echo requests on the container interfaces tap2 and eth2, but no echo reply on any of the interfaces. Would you know why that could be? It happens in both directions.

root@r1:/# tcpdump -i tap2 icmp
07:28:13.538457 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 28, seq 0, length 80
07:28:15.539750 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 28, seq 1, length 80

root@r1:/# tcpdump -i eth2 icmp
07:28:13.538773 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 28, seq 0, length 80
07:28:15.539854 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 28, seq 1, length 80

R1: the address 192.168.1.1 corresponds to the tap2 interface. If I leave it unassigned, as you suggested, I don't see the echo requests. r2 is configured the same way.

S    192.168.0.0/24 [1/0] via 192.168.1.1
     192.168.1.0/24 is variably subnetted, 2 subnets, 2 masks
C       192.168.1.0/24 is directly connected, GigabitEthernet2
L       192.168.1.2/32 is directly connected, GigabitEthernet2
S    192.168.2.0/24 [1/0] via 192.168.1.1

the topology is

[image: topology diagram]

manomugdha commented 1 year ago

Do you see an echo reply on eth2/tap2 of r2 when pinging from r1? Please check the TTL as well.

yennym3 commented 1 year ago

No, I only see echo requests:

root@r2:/# tcpdump -i eth2 icmp
07:28:13.538942 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 28, seq 0, length 80
07:28:15.539916 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 28, seq 1, length 80

root@r2:/# tcpdump -i tap2 icmp
07:28:13.539125 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 28, seq 0, length 80
07:28:15.539977 IP 192.168.1.2 > 192.168.2.2: ICMP echo request, id 28, seq 1, length 80

TTL is 255:

root@r2:/# tcpdump -i eth2 icmp -vv
tcpdump: listening on eth2, link-type EN10MB (Ethernet), capture size 262144 bytes
14:35:16.792488 IP (tos 0x0, ttl 255, id 185, offset 0, flags [none], proto ICMP (1), length 100)
    192.168.1.2 > 192.168.2.2: ICMP echo request, id 37, seq 0, length 80
14:35:18.792927 IP (tos 0x0, ttl 255, id 186, offset 0, flags [none], proto ICMP (1), length 100)
    192.168.1.2 > 192.168.2.2: ICMP echo request, id 37, seq 1, length 80

networkop commented 1 year ago

aren't these two different subnets?

yennym3 commented 1 year ago

> aren't these two different subnets?

yes, they are two different subnets

networkop commented 1 year ago

so how are you expecting to get a ping reply?

networkop commented 1 year ago

try changing the subnet on R2 from 192.168.2.0/24 to 192.168.1.0/24 (subnet must match on both sides of a local link) and try doing a ping again
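The "subnet must match on both sides" rule can be checked mechanically before touching the VMs. The sketch below compares the network part of two CIDR addresses in plain POSIX sh. The function names are made up for illustration. The /24 prefix and the 192.168.2.2 address come from this thread, while 192.168.1.3 is only an example of an address R2 could use after the change.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to an integer.
# The body runs in a subshell, so the IFS change does not leak.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Print the network address (as an integer) of an address in CIDR form.
network_of() (
    addr=${1%/*}
    len=${1#*/}
    mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
    echo $(( $(ip_to_int "$addr") & mask ))
)

# Succeed only if both CIDR addresses have the same prefix length
# and fall into the same network.
same_subnet() {
    [ "${1#*/}" = "${2#*/}" ] && [ "$(network_of "$1")" = "$(network_of "$2")" ]
}

# Example:
#   same_subnet 192.168.1.2/24 192.168.2.2/24 || echo "different subnets"
#   same_subnet 192.168.1.2/24 192.168.1.3/24 && echo "same subnet"
```

With the addresses from the captures, the first call fails, which matches the diagnosis: the two VM interfaces sit on one local link but are configured in different networks.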

yennym3 commented 1 year ago

> so how are you expecting to get a ping reply?

sorry, you're right thanks a lot :) , yes, that was the error