osrg / gobgp

BGP implemented in the Go Programming Language
https://osrg.github.io/gobgp/
Apache License 2.0

evpn: fix evpn losing type-2 routes #2804

Closed Tuetuopay closed 6 months ago

Tuetuopay commented 6 months ago

When the quadratic complexity of EVPN MAC mobility handling was fixed, the way destinations are indexed in the routing table changed from RD+ETAG+MAC+IP to only RD+MAC. This is incorrect per the BGP EVPN RFC (RFC 7432). It works in most cases because, when an IP is present, virtually all EVPN implementations announce two paths: one with the IP and one without. This way route announcements are balanced and pose no issues.
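To make the indexing problem concrete, here is a minimal sketch of how the two key schemes treat a MAC announced both with and without an overlay IP. The `evpnMacIPRoute` type and the `coarseKey`, `fullKey`, and `countDestinations` helpers are hypothetical names invented for this illustration; they are not GoBGP's actual types or its `tableKey` code.

```go
package main

import "fmt"

// evpnMacIPRoute is a hypothetical, simplified stand-in for an EVPN
// type-2 (MAC/IP advertisement) prefix. It is not a GoBGP type.
type evpnMacIPRoute struct {
	RD   string
	ETag uint32
	MAC  string
	IP   string // empty when the route carries no overlay IP
}

// coarseKey indexes by RD+MAC only, as the description says the
// earlier change did.
func coarseKey(r evpnMacIPRoute) string {
	return r.RD + "|" + r.MAC
}

// fullKey indexes by RD+ETAG+MAC+IP, so every distinct prefix gets
// its own destination.
func fullKey(r evpnMacIPRoute) string {
	return fmt.Sprintf("%s|%d|%s|%s", r.RD, r.ETag, r.MAC, r.IP)
}

// countDestinations returns how many distinct table entries the
// given key function produces for a set of routes.
func countDestinations(routes []evpnMacIPRoute, key func(evpnMacIPRoute) string) int {
	seen := make(map[string]struct{})
	for _, r := range routes {
		seen[key(r)] = struct{}{}
	}
	return len(seen)
}

func main() {
	routes := []evpnMacIPRoute{
		{RD: "65000:100", ETag: 0, MAC: "aa:bb:cc:dd:ee:ff"},
		{RD: "65000:100", ETag: 0, MAC: "aa:bb:cc:dd:ee:ff", IP: "10.0.0.1"},
	}
	fmt.Println(countDestinations(routes, coarseKey)) // 1: both prefixes share one entry
	fmt.Println(countDestinations(routes, fullKey))   // 2: one entry per prefix
}
```

With the coarse key, two prefixes that the RFC treats as distinct routes end up in the same destination, which is the condition the rest of this description depends on.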

Issues arise when GoBGP is connected to multiple peers announcing the same routes (read: route reflectors), at a high rate, with lots of routes (hundreds of thousands), and when multiple paths exist for the same MAC (e.g. with and without an overlay IP address). The issue does not appear if any of the four conditions above is false.

In that situation, processing becomes racy and, over time, some routes go missing because of the concurrent updates. Such missing routes have been observed in a production setup with:

With this setup, we ended up with a handful of routes missing (usually 10 to 20) after a few days of runtime.

This commit reverts the custom tableKey implementation introduced previously and goes back to using the plain String view of the prefix as the key. Note that this is suboptimal performance-wise, but it is correct.

Fixes: c393f43 ("evpn: fix quadratic evpn mac-mobility handling")

Sorry for introducing this bug in the first place.

fujita commented 6 months ago

pushed, thanks.