Closed by italovalcy 2 weeks ago
I was reading the Linux kernel source code for VXLAN interfaces [1], and I saw there is an option called nolearning, which stops Linux from using the remote address learned from incoming packets when replying over the tunnel. Instead, Linux uses the IP address configured during tunnel setup. Learning mode is enabled by default, so we should disable it to avoid this behavior.
[1] https://elixir.bootlin.com/linux/v5.4/source/drivers/net/vxlan.c
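To make the proposal concrete, here is a minimal sketch of how a point-to-point VXLAN link could be created with learning disabled. It only builds the `ip link add` command line. The helper name `build_vxlan_cmd`, the interface name, the VNI, and the 192.0.2.x addresses are placeholders for illustration, not the actual Mininet-Sec code or the addresses from this report.

```python
import shlex


def build_vxlan_cmd(name, vni, local_ip, remote_ip, dstport=4789, learning=False):
    """Return the `ip link add` argv for a point-to-point VXLAN interface.

    Hypothetical helper: with learning=False the kernel is told not to
    learn remote endpoints from received packets, so it keeps sending to
    the `remote` address configured here.
    """
    cmd = [
        "ip", "link", "add", name, "type", "vxlan",
        "id", str(vni),
        "local", local_ip,
        "remote", remote_ip,
        "dstport", str(dstport),
    ]
    # iproute2 accepts either `learning` or `nolearning` for vxlan links.
    cmd.append("learning" if learning else "nolearning")
    return cmd


if __name__ == "__main__":
    # Placeholder values; print the command instead of running it,
    # since creating the interface requires root privileges.
    print(shlex.join(build_vxlan_cmd("vxlan0", 42, "192.0.2.1", "192.0.2.2")))
```

Running the resulting command requires root (for example via `subprocess.run(cmd, check=True)`), and the same change would have to be applied on both ends of the tunnel.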
Hi,
Testing Mininet-Sec on a certain Kubernetes host highlighted a corner case for VXLAN links, where the link does not work as expected because SNAT is applied on the network between pods.
Basically, if you sniff the traffic between the two K8s hosts, you will notice the following behavior:
Host 1:
Host 3:
As you can see above, the VXLAN tunnel is created between 10.xx.xx.162 <-> 10.xx.xx.173. However, if you run a ping from the vxlan interface and leave tcpdump running on the other side (tcpdump is running on Host 3):
As you can see above, Host 3 sends its VXLAN replies to the source IP address seen on the request packet, instead of the IP address actually configured for the tunnel.
Workaround: you can always force the source IP address to be overwritten using netfilter NAT (SNAT) rules, but that is definitely not a good approach.
Although this looks like a misconfiguration of the Kubernetes cluster (especially because the nodes have valid routable addresses between them, so SNAT should not be needed at all for node-to-node traffic), Mininet-Sec should be robust enough to avoid this behavior.