Hm, hm, this is pretty bad news... I didn't know about that parameter before. Even if we changed that parameter in Gluon, the issue would still remain for casual Linux clients. I just checked, and Debian Sid seems to have the same defaults, and since these are the defaults of the Linux kernel itself, probably any Linux distro could run into problems, even though laptops etc. have more than enough RAM.
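For reference, the thresholds can be checked like this (the values shown below are what I'd expect as the usual kernel defaults, so treat them as an assumption rather than a measurement):

$ sysctl net.ipv4.neigh.default.gc_thresh1 net.ipv4.neigh.default.gc_thresh2 net.ipv4.neigh.default.gc_thresh3
net.ipv4.neigh.default.gc_thresh1 = 128
net.ipv4.neigh.default.gc_thresh2 = 512
net.ipv4.neigh.default.gc_thresh3 = 1024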
Looking at our graph here in Lübeck, with ~230 nodes we are currently peaking at 540 clients:
http://map.ffhl/nodes/globalGraph.png
So it seems to make sense that Hamburg, with its 400+ nodes, is now hitting this 1024-client limit (gc_thresh3). So it's probably not a malicious DoS attack there in Hamburg.
Even if we increased the default value upstream in the Linux kernel, it might take some time to trickle down to all these distros (the fastest way would be to declare the increase a fix, so that it gets picked up by the stable kernel releases soon - but it might be difficult to convince the Linux netdev folks of that). I hate to say it... but maybe you guys need to add subnets... Damn it, I totally didn't see that limit coming...
You/We should start some discussion somewhere on how to deal with this serious issue.
I think we should first look into the consequences of such an overflow. Is it really a problem? From my point of view, the clients mostly communicate with a restricted set of other hosts, e.g. gateways or other hosts/nodes/clients offering services. Therefore they only need the IP-to-MAC mappings of these hosts, or am I wrong here? An overflowing neighbour table would only cause additional ARP request traffic if these specific hosts were dropped from the table, and that shouldn't happen (ideally) as long as the client communicates with them frequently.
I had a quick look at proxying ARP requests, as that came to mind. But as far as I can tell, an ARP proxy only answers with its own MAC, not with the MACs in its neighbour table. So putting an ARP proxy on the Freifunk routers is not a solution to reduce the traffic from additional ARP requests caused by neighbour table overflows on the clients.
But, as I said above, let's first see whether an overflow on the clients really is a problem, or whether it is only relevant if a client also communicates with a lot of other hosts (more than the threshold of the neighbour table, probably an infrequent case).
Okay, checking again, I don't quite see yet why this overflow exists in your network: since ARP replies are supposed to be sent via unicast, a host should only end up having entries in its ARP/neighbour table for hosts it is actually communicating with. For instance, for a host at Freifunk Lübeck (krtek.ffhl) I'm getting this:
$ ip -4 neigh show dev freifunk | wc -l
8
What's the output of that on the affected node when the "ipv4: Neighbour table overflow." occurs? Could you provide a tcpdump of ARP packets at that time?
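Something along these lines should do for capturing (interface name and file name are just placeholders, adjust to the client bridge on your node):

$ tcpdump -n -i br-client -w arp.pcap arp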
I second @T-X. In my world view, this shouldn't happen except on a server contacted by all clients or something like that. However, I seem to remember that this table contains many more entries than ip neigh reveals, so that might not be an appropriate indicator.
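If that's the case, counting with all entry states included might give a better picture; something like this (I'm not sure which states the garbage collector actually counts, so take it as a guess):

$ ip -4 neigh show nud all | wc -l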
This is quite related to my idea for a solution: the node shouldn't need neighbour entries for foreign subnets. batman can choose a gateway based on link metrics, and the gateways can distribute addresses from different subnets. Can't the nodes prevent the ARP requests for non-chosen subnets from reaching the clients? (ebtables magic)
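Roughly what I have in mind, as a sketch only (the subnet is made up, and whether FORWARD on the client bridge is the right place would need checking):

# drop ARP requests asking for addresses outside the locally chosen subnet
ebtables -A FORWARD -p ARP --arp-opcode Request --arp-ip-dst ! 10.130.0.0/16 -j DROP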
I suspect there might be an issue specific to the network in Hamburg causing this... maybe a broken ARP proxy or something? Dumping and analyzing a few minutes of ARP traffic would give us something to work with; so far we don't know at all what is causing the "problem".
Here are two dumps.
http://www.ohrensessel.net/overflow.pcap was captured yesterday at around 16:50, when overflow messages appeared in the log.
http://www.ohrensessel.net/nooverflow.pcap was captured right now with no sign of overflow messages.
Please note that these come from different times of the day and therefore might have a different "traffic" volume.
I only had a quick look, but could not find anything particularly noticeable. Maybe I can have a deeper look later today.
These were captured by "tcpdump -n -i br-client arp"
Two things I noticed in this dump:
@ohrensessel: Could you have a look whether an additional ebtables rule like this silences the overflow warning on your gluon node?
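For illustration only, a rule in that spirit might drop ARP replies addressed to the broadcast MAC (this is just a guess at the shape, not necessarily the exact rule, and the chain would need checking; the alternative of rewriting the target MAC comes up below):

ebtables -A FORWARD -p ARP --arp-opcode Reply -d Broadcast -j DROP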
I think the gratuitous ARP replies are sent after connecting to a network, and Android devices that are moving might reconnect quite a lot... We should investigate whether these devices also respond with normal unicast ARP replies to ARP requests.
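One way to check would be to print the link-level headers of ARP replies and watch whether the destination MAC is ff:ff:ff:ff:ff:ff or the requester's unicast MAC (opcode 2 = reply; interface name as used in the captures above):

$ tcpdump -n -e -i br-client 'arp[6:2] = 2'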
Ok, those are some interesting findings. I did not have time yet to look into the traces, but will probably do it this evening.
One additional piece of information: during times when overflow messages occur,
ip -4 neigh | wc -l
yields only 20 to 40 entries. (This is a quite busy node, often with 10 or slightly more users plus 2 other nodes reachable via wifi.) However, I would have expected many more entries to be necessary to trigger the overflow warning.
Are we maybe seeing something similar to http://wiki.wireshark.org/Gratuitous_ARP here? The link describes a valid use for broadcast ARP requests and replies, but in general an ARP reply sent as a follow-up to a request should be unicast.
I think your solution (substituting the target MAC) is a little bit too "hackish". This would require some kind of transparent "ARP proxy". Maybe some stateful filtering of ARP replies would be better here, so that only devices which sent out the ARP request get the reply in the end? If a node has not seen the outgoing ARP request, it would drop the associated reply, even when it is sent to the broadcast address.
Regarding your latest comment, which you submitted right before I submitted mine: from the traces it is visible that the gratuitous ARP reply from the Android devices is a follow-up to an ARP request received shortly before. That doesn't look like they are sending gratuitous replies because of moving from node to node or something like that.
From what I've read so far, these are not gratuitous ARP replies, since the ARP sender IP differs from the ARP target IP. Therefore I think they are simply malformed "standard" ARP replies.
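That can also be checked directly in the capture: a gratuitous ARP has sender IP equal to target IP, so a filter like this (offsets assume IPv4-over-Ethernet ARP, untested) should only show the truly gratuitous replies:

$ tcpdump -n -e -r overflow.pcap 'arp[6:2] = 2 and arp[14:4] = arp[24:4]'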
@T_X, can you elaborate a bit more on what these ARP replies look like?
I am getting this error message on my wired (VPN-mesh) node only. No such messages on an air-mesh node.
And I am not experiencing these log messages on a raspberry pi which is attached to my wired node as far as I can tell. I will continue observing nevertheless.
I am seeing the same messages pre-gluon.
It's been quiet here for a few months, yet lots of nodes running Gluon seem to be pretty stable (except for ath9k woes). Is this still an issue? Does it cause problems?
The messages are still appearing on the nodes. I don't know if these messages imply any instability or not.
They are probably related to a kernel bug where the thresholds are in fact only half of the configured value, see http://patchwork.ozlabs.org/patch/170385/. With BB this problem should be fixed, as a newer kernel is used.
Some of the nodes seeing the overflow messages reboot very frequently, but this is rather related to OOMs, which is a totally different bug...
It seems as if the messages are gone with v2014.3-23-gddd7c16, which is the first BB experimental here in Hamburg. Previously these messages would appear after less uptime than the node has now (~14 hours so far). I will keep an eye on it to finally confirm that the messages do not appear anymore.
With ~70h uptime, still no appearance of neighbour overflow messages. I would say that this problem is gone with BB, at least for the network size we have now.
Closing this as it is fixed with BB.
Gluon node (ffhh) has a lot of "ipv4: Neighbour table overflow." messages in the log.
This possibly can be fixed by the following sysctl settings:

net.ipv4.neigh.default.gc_thresh1
net.ipv4.neigh.default.gc_thresh2
net.ipv4.neigh.default.gc_thresh3

and for IPv6 (no messages regarding that so far, but it might be good to change them nevertheless):

net.ipv6.neigh.default.gc_thresh1
net.ipv6.neigh.default.gc_thresh2
net.ipv6.neigh.default.gc_thresh3
I do not know anything about the memory implications of these changes, but there might be some.
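Raising them would look something like this (the values are arbitrary examples, not tested recommendations; the ipv6 counterparts would be set the same way):

sysctl -w net.ipv4.neigh.default.gc_thresh1=512
sysctl -w net.ipv4.neigh.default.gc_thresh2=2048
sysctl -w net.ipv4.neigh.default.gc_thresh3=4096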