Closed rnurgaliyev closed 1 year ago
This is a tough one, as these error messages are coming from the kernel. I am not aware of what we should even be checking on the kernel side to increase the buffers.
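As a starting point for "what to check on the kernel side", here is a minimal Python sketch that compares the receive buffer a netlink socket is actually granted with what the sysctls would allow. It assumes a plain SO_RCVBUF request, which per socket(7) is capped at net.core.rmem_max and then doubled by the kernel. A privileged process can bypass the cap with SO_RCVBUFFORCE, which this sketch does not model. The helper names are hypothetical and are not part of FRR.

```python
import socket


def effective_rcvbuf(requested: int, rmem_max: int) -> int:
    """Approximate the size Linux grants for a plain SO_RCVBUF request.

    The kernel caps the request at net.core.rmem_max and then doubles it
    to leave room for bookkeeping overhead (see socket(7)).
    """
    return min(requested, rmem_max) * 2


def read_sysctl(name: str) -> int:
    """Read an integer sysctl such as 'net.core.rmem_max' from /proc/sys."""
    path = "/proc/sys/" + name.replace(".", "/")
    with open(path) as f:
        return int(f.read().split()[0])


def probe_netlink_rcvbuf(requested: int) -> int:
    """Open a NETLINK_ROUTE socket (Linux only), request a receive buffer,
    and return the size the kernel reports back."""
    with socket.socket(socket.AF_NETLINK, socket.SOCK_RAW,
                       socket.NETLINK_ROUTE) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        return s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
```

On the affected node, comparing `probe_netlink_rcvbuf(16 * 1024 * 1024)` with `effective_rcvbuf(16 * 1024 * 1024, read_sysctl("net.core.rmem_max"))` would show whether a 16 MB request is being silently capped. If the two agree and the errors persist, the socket buffer size is probably not the limiting factor.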
This issue is stale because it has been open 180 days with no activity. Comment or remove the autoclose label in order to avoid having this issue closed.
This issue will be automatically closed in the specified period unless there is further activity.
Describe the bug
We run a small EVPN with ~1200 type-2 routes. The problematic node peers with two route reflectors via iBGP:
When I do a cold start of FRR, zebra reports a lot of the following messages:
As expected given these errors, most of the MAC addresses are not added to the kernel forwarding table.
I was blaming net.core.rmem_default, net.core.rmem_max, and zebra's --nl-bufsize argument. For the test, I set the first two to 200 MB and the third to 16 MB. It did not have any effect.
What I don't understand is this: if I reset the BGP peers with clear bgp * or simply restart FRR completely (all daemons: zebra, bgpd, etc.), everything is fine. I don't see a slower rate of netlink messages; everything is more or less the same, but no errors are logged, and all MAC addresses are in the kernel. It only happens during a "cold" start, when the system has just booted up.
Errors are visible in zebra data plane statistics:
After zebra restart:
Can someone please give me a hint or point me to the part of the code that I could try to debug?
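To quantify the difference between a cold start and a restart, one option is to compare the MACs expected from the type-2 routes with what the kernel FDB actually holds. The sketch below is only an illustration: it assumes the JSON output of `bridge -j fdb show` from iproute2 with `mac` and `ifname` keys, and the interface name `vxlan100` and the MAC addresses in the usage example are hypothetical.

```python
import json
import subprocess


def read_kernel_fdb() -> str:
    """Return the kernel FDB as JSON text (requires iproute2 on Linux)."""
    result = subprocess.run(["bridge", "-j", "fdb", "show"],
                            capture_output=True, text=True, check=True)
    return result.stdout


def fdb_macs(bridge_json: str, ifname: str) -> set:
    """Collect the MAC addresses installed on one interface."""
    entries = json.loads(bridge_json)
    return {e["mac"].lower() for e in entries
            if e.get("ifname") == ifname and "mac" in e}


def missing_macs(expected, bridge_json: str, ifname: str) -> set:
    """Return the expected MACs that are absent from the kernel FDB."""
    return {m.lower() for m in expected} - fdb_macs(bridge_json, ifname)


# Illustrative input only, not real output from the affected node.
sample = json.dumps([
    {"mac": "aa:bb:cc:00:00:01", "ifname": "vxlan100", "state": "static"},
    {"mac": "aa:bb:cc:00:00:02", "ifname": "eth0", "state": "permanent"},
])
expected = ["AA:BB:CC:00:00:01", "aa:bb:cc:00:00:03"]
print(sorted(missing_macs(expected, sample, "vxlan100")))
```

Running `missing_macs(expected, read_kernel_fdb(), ifname)` once after a cold boot and once after a zebra restart would give a concrete list of MACs that failed to install, which can then be matched against the logged errors.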
Versions