sudomesh / bugs

report sudomesh bugs or other issues here if you don't know where to put them

setsockopt out of memory causes babeld failure #24

Open bennlich opened 6 years ago

bennlich commented 6 years ago

Thanks to https://peoplesopen.net/monitor it is now easier to track this. See https://github.com/sudomesh/bugs/issues/8 for early observations.

The Bug

On a fresh boot of the psychz exit node, home nodes dig tunnels, babel babels, and everybody's routing tables get filled with mesh routes. But...

Over time (after about 24-48 hours), routes start to slowly disappear from the routing table, and they don't return until babeld and tunneldigger-broker are restarted on the exit node.

Debugging

This appears to be due to a memory leak in babeld. When the exit node is in the bad state, looking at /var/log/babeld.log during a tunnel connect shows:

Warning: cannot save old configuration for l2tp4061.
setsockopt(IPV6_JOIN_GROUP): Cannot allocate memory
setsockopt(IPV6_LEAVE_GROUP): Cannot assign requested address
Warning: cannot restore old configuration for l2tp4061.

i.e. babeld tries to join its IPv6 multicast group on the new tunnel interface (IPv6 has no broadcast) and the setsockopt call fails with a memory allocation error.

When the exit node is in a healthy state, no such errors get logged to /var/log/babeld.log, and the mesh routes get added to the routing table as expected.
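
For anyone who hasn't poked at this call before, here is a minimal Python sketch of what a per-interface multicast join looks like at the socket level. It is not babeld's code (babeld is written in C); the helper names are made up for illustration, and the group address ff02::1:6 is the standard Babel multicast group rather than something taken from our config.

```python
import socket
import struct

# Babel's link-local IPv6 multicast group (standard value, not read from our config).
BABEL_GROUP = "ff02::1:6"


def build_ipv6_mreq(group: str, ifindex: int) -> bytes:
    """Pack a struct ipv6_mreq: 16-byte group address followed by an unsigned int ifindex."""
    return socket.inet_pton(socket.AF_INET6, group) + struct.pack("@I", ifindex)


def join_group(sock: socket.socket, ifname: str, group: str = BABEL_GROUP) -> None:
    """Join `group` on `ifname` (hypothetical helper).

    In the bad state this is the call that fails with ENOMEM,
    which is logged as "Cannot allocate memory".
    """
    ifindex = socket.if_nametoindex(ifname)  # raises OSError if the interface does not exist
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_JOIN_GROUP,
                    build_ipv6_mreq(group, ifindex))


def leave_group(sock: socket.socket, ifname: str, group: str = BABEL_GROUP) -> None:
    """Leave `group` on `ifname` (hypothetical helper).

    The interface index is resolved from the name, so this can only work
    while the interface still exists.
    """
    ifindex = socket.if_nametoindex(ifname)
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_LEAVE_GROUP,
                    build_ipv6_mreq(group, ifindex))
```

Both calls take the same 20-byte structure, so a leave needs the same interface index that was used for the join.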

Conclusion

It looks like there's a socket option memory leak in babeld. I think we're only seeing this bug now in the last month because someone happens to be running a weird node that disconnects and reconnects its tunnel every 5 minutes. You can see this behavior by watching /var/log/syslog on the psychz node for 5 minutes.

Every time the rogue node destroys and recreates a tunnel, the tunneldigger up and down hooks are run, the old tunnel interface is removed from babeld (babeld -x $ifname) and the new tunnel interface is added (babeld -a $ifname).

It seems that removing an interface from babeld does not properly clean up all used memory, and eventually babeld is unable to setsockopt on new sockets.

Todo

Look into socket option memory allocation? Halp!

jhpoelen commented 6 years ago

@bennlich awesome! Perhaps we can hack on reproducing this in an isolated babeld stress test so we can easily know when future fixes are resolving the issue. Happy to help with this, although @juul and others might have more experience with this.

jhpoelen commented 6 years ago

I have installed a babeld-monitor on both the HE and Psychz exit nodes to detect and work around the issue reported in https://github.com/sudomesh/bugs/issues/24 . Using a systemd timer, babeld-monitor.timer, the babeld log is scanned for the specific memory error every 10 minutes. If it is detected, babeld is restarted and all active tunnel interfaces are re-added to babeld. All of this can be observed in the systemd logs, and it is now also set up when using create_exitnode via the sudomesh/exitnode repo. Please see https://github.com/sudomesh/exitnode/tree/master/src/opt/babeld-monitor and https://github.com/sudomesh/exitnode/tree/master/src/etc/systemd/system if you'd like to learn more about this.
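
To give a rough idea of the check, here is a minimal Python sketch of the detect-and-restart logic. It is not the actual babeld-monitor script from the exitnode repo: the function names are hypothetical, the service name `babeld` in the restart command is an assumption, and re-adding the tunnel interfaces is left out.

```python
import subprocess

# The log line that marks the bad state (copied from the report above).
MEMORY_ERROR = "setsockopt(IPV6_JOIN_GROUP): Cannot allocate memory"


def has_memory_error(log_lines) -> bool:
    """Return True if any line contains the IPV6_JOIN_GROUP allocation error."""
    return any(MEMORY_ERROR in line for line in log_lines)


def check_and_restart(log_path="/var/log/babeld.log",
                      restart_cmd=("systemctl", "restart", "babeld"),  # assumed service name
                      run=subprocess.run) -> bool:
    """Scan the log once and restart babeld if the error is present.

    Returns True if a restart was triggered. `run` is injectable so the
    logic can be exercised without touching the real service.
    """
    with open(log_path, errors="replace") as log:
        if not has_memory_error(log):
            return False
    run(list(restart_cmd), check=True)
    return True
```

A real monitor also has to make sure an old error line does not trigger a restart on every run, for example by only looking at log entries written since the last check.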

jhpoelen commented 6 years ago

I hope we can remove this hack once the root cause of the babeld error can be found and fixed.

bennlich commented 6 years ago

@jhpoelen nice haxxx! I'm reading up on systemd now... Would love to figure out the root cause too. Raw socket land seems like a daunting land tho. Maybe need to use a phone-a-friend.

bennlich commented 6 years ago

http://git.erp5.org/gitweb/re6stnet.git/commitdiff/d46b09e1d9aca15e98179c5e2e5b0a575dd7f68b?js=1 could be related

bennlich commented 6 years ago

Tried to write a dead-simple stress test today at the software working group with @eenblam and @squeeesh, but we were unable to reproduce the bug. I think our test did not go quite deep enough--an strace of babeld showed that babeld was rarely calling setsockopt(IPV6_JOIN_GROUP).

Probably a better test would involve creating fresh network interfaces and adding to babeld instead of adding/removing my computer's default interface over and over again :-P I'm not sure what's a good way to create a bunch of functional network interfaces...

Also, @eenblam noticed that in the re6stnet commit, they seem to suggest that their fix was to clean up their tunnels less aggressively. So: maybe babeld needs to setsockopt(IPV6_LEAVE_GROUP) /before/ tunneldigger obliterates the network interface. This would make some sense, as setsockopt(IPV6_LEAVE_GROUP) does expect to be passed an interface index (see https://tools.ietf.org/html/rfc3493#section-5.2).
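
One possible shape for a better stress test, as an untested Python sketch: create a fresh dummy interface per iteration with iproute2, add it to babeld, then tear it down in either order. The `babeld -a` / `babeld -x` invocations are the ones from the tunneldigger hooks described above; the `stress<N>` interface names and the function names are made up, and it is an assumption that dummy interfaces exercise the same code path as the l2tp tunnels. It needs root to actually run.

```python
import subprocess


def churn_commands(index: int, delete_first: bool = False, babeld: str = "babeld"):
    """Build the commands for one create/add/remove/delete cycle.

    delete_first=False removes the interface from babeld before deleting it;
    delete_first=True deletes the interface first, then tells babeld.
    """
    ifname = f"stress{index}"  # hypothetical naming scheme
    setup = [
        ["ip", "link", "add", ifname, "type", "dummy"],
        ["ip", "link", "set", ifname, "up"],
        [babeld, "-a", ifname],
    ]
    remove = [babeld, "-x", ifname]
    delete = ["ip", "link", "del", ifname]
    teardown = [delete, remove] if delete_first else [remove, delete]
    return setup + teardown


def run_churn(iterations: int, delete_first: bool = False, run=subprocess.run) -> None:
    """Run `iterations` cycles; `run` is injectable for dry runs."""
    for i in range(iterations):
        for cmd in churn_commands(i, delete_first=delete_first):
            run(cmd, check=True)
```

Running it once with each ordering while watching /var/log/babeld.log should show whether the order of teardown matters, assuming the test reaches the setsockopt path at all.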

And @squeeesh found this cool and terrifying network stress test lib https://github.com/dtaht/rtod.

bennlich commented 6 years ago

Oh, and if it /is/ a matter of giving babeld a chance to LEAVE_GROUP before tunneldigger destroys the interface, tunneldigger's pre-down hook seems promising, except for the fact that:

the pre-down hook is not guaranteed to complete before the tunnel is shut down.

from https://github.com/wlanslovenija/tunneldigger/blob/master/HISTORY.rst.

(Hook scripts are executed in their own processes.)