Open Logan007 opened 3 years ago
So far, I have only observed this behavior at remote edges running on an internet VPS. Hexdumps in the code showed that regular ping answers from the VPS host through n2n introduced this so far unknown MAC address as sender MAC.
Maybe the VPS provider just applies some magic in bridging which would add a usually unknown srcMac to host's ping packet answers – even though they get sent through the TAP?
Any similar observations?
I should have stuck to crypto... :exploding_head:
I kind of "fixed" it by providing this exact same strange, seemingly fixed MAC address (provide it through the -m ... command-line parameter). It seems that it just cannot be changed by n2n.
Is there anybody more familiar with the details of the TAP driver's MAC address handling? Is it possible that we are seeing Linux TAP drivers with a fixed MAC address here?
I just want to quickly share today's / yesterday's findings from various hexdumps and printfs on this issue:
The SIOCSIFHWADDR in tuntap_linux.c seems (!) to work, as SIOCGIFHWADDR delivers the previously set MAC address, even if the SIOCGIFHWADDR is performed after the while-loop that waits for the device to be up and running (closing ioctl_fd somewhat later then).
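To make that readback easier to repeat at different points in time, here is a minimal sketch of querying the MAC address via SIOCGIFHWADDR. The helper names read_kernel_mac and format_mac are made up for illustration and are not part of n2n; the sketch assumes Linux and a plain AF_INET datagram socket as the ioctl handle.

```c
#define _DEFAULT_SOURCE          /* expose struct ifreq under strict -std= modes */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <net/if.h>
#include <sys/ioctl.h>
#include <sys/socket.h>

#define MAC_LEN 6

/* Hypothetical helper (not n2n code): ask the kernel which MAC address it
 * currently reports for interface `dev`.
 * Returns 0 on success, -1 if the socket or the ioctl fails. */
int read_kernel_mac(const char *dev, unsigned char mac[MAC_LEN]) {
    struct ifreq ifr;
    int fd, rc;

    assert(dev != NULL && mac != NULL);

    /* Any socket works as a handle for interface ioctls. */
    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, dev, IFNAMSIZ - 1);   /* stays NUL-terminated */

    rc = ioctl(fd, SIOCGIFHWADDR, &ifr);
    close(fd);
    if (rc < 0)
        return -1;

    memcpy(mac, ifr.ifr_hwaddr.sa_data, MAC_LEN);
    return 0;
}

/* Hypothetical helper: render a MAC as "aa:bb:cc:dd:ee:ff".
 * `out` must hold at least 18 bytes (17 characters plus NUL). */
void format_mac(const unsigned char mac[MAC_LEN], char out[18]) {
    snprintf(out, 18, "%02x:%02x:%02x:%02x:%02x:%02x",
             mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}
```

Calling read_kernel_mac("edge0", mac) right after the device comes up and again a few seconds later, and printing both results with format_mac, should show whether the address changes behind the edge's back.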
But the ip addr command still shows the "strange" MAC address for the newly generated TAP device, and that address is still being used in the packets. If it is changed using the sudo ifconfig edge0 hw ether 10:20:30:40:50:60 command, the MAC address also changes in the packets as well as in the ip addr output.
Furthermore, and this makes the situation even weirder, a plain SIOCGIFHWADDR (without a previous SIOCSIFHWADDR) does not report the "strange" MAC address but some other one, so I am not even able to detect that strange MAC address in order to use it instead...
The VPS service provider assures me that they have no special setting in place. I have absolutely no idea how to address this issue.
Argh... :weary: it was the network manager! systemd-networkd seems to change the TAP interface's MAC address shortly (but not immediately) after the TAP device is up. The first packet still went out with "our" MAC address, but then it switched. edge's internal data structures were not aware of that change, of course...
I tried a "[Match]-all" .network file to set MACAddressPolicy to none, but it did not help. So, I replaced systemd-networkd with NetworkManager, which has not shown this behavior (so far).
So, the quick fix is to not use systemd-networkd, or to use a pre-defined TAP with a fixed MAC if supported (not sure). In the long run, I am still uncertain if we want to load the edge with regular MAC address checking (similar to the repeated IP address checking in case of true DHCP, just more often, see src/edge_utils.c:2822) to counter this rare condition. Any opinion?
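For discussion, the regular MAC address checking could look roughly like the sketch below. It is only an illustration under assumptions: mac_watch_t, mac_watch_due and mac_watch_update are made-up names, not existing n2n structures, and the caller is assumed to obtain the current address itself (for example via SIOCGIFHWADDR) whenever a check is due.

```c
#include <assert.h>
#include <string.h>
#include <time.h>

#define MAC_LEN 6

/* Hypothetical bookkeeping for one TAP device (not an n2n type). */
typedef struct {
    unsigned char mac[MAC_LEN];  /* MAC address the edge believes it has */
    time_t last_check;           /* time of the previous check */
    time_t interval;             /* seconds between checks */
} mac_watch_t;

/* Returns 1 if at least `interval` seconds have passed since the last
 * check (and records `now` as the new check time), 0 otherwise. */
int mac_watch_due(mac_watch_t *w, time_t now) {
    assert(w != NULL);
    if (now - w->last_check < w->interval)
        return 0;
    w->last_check = now;
    return 1;
}

/* Compare the address currently reported for the device with the cached
 * one. Returns 1 and updates the cache if it changed, 0 if it is the same.
 * On 1, the caller would have to refresh whatever else depends on the MAC. */
int mac_watch_update(mac_watch_t *w, const unsigned char current[MAC_LEN]) {
    assert(w != NULL && current != NULL);
    if (memcmp(w->mac, current, MAC_LEN) == 0)
        return 0;
    memcpy(w->mac, current, MAC_LEN);
    return 1;
}
```

In the edge's main loop this would amount to one cheap time comparison per iteration and one ioctl per interval, so the extra load should stay small even with a short interval.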
As it turns out, the journey hasn't ended yet. It only helps once, even with the new network manager. It must be something else, maybe some part of the VPS's virtualization. I get the notion that only dynamically watching for MAC changes can reliably counter it; I might add it as a feature some time soon.
i think i found a solution to this at https://bugzilla.suse.com/show_bug.cgi?id=1136600
Edit /usr/lib/systemd/network/99-default.link, change MACAddressPolicy=persistent to MACAddressPolicy=none, then run sudo systemctl restart systemd-networkd.
While testing at a larger scale, I was a bit surprised by ghost entries in the edge's management output. Those seem to occur when stopping and re-starting supernodes and / or some edges; no more specific hints are available for now. Also, these entries seem to prevent further communication. They show some very strange MAC address and a "0:0:0:0" IP address (edge).
I have already started debugging and would be very interested to hear if you have made similar observations, how they were triggered in your case, and what your scenario looks like.