Closed Tuetuopay closed 5 months ago
Looks like the evpn test failed.
Looks like the evpn test failed.
@fujita indeed. Since you're here I would like for your input on the implementation. I am not sure about the behavior when one of the RTs of the route is for an IP-VRF and not just a MAC-VRF (i.e. a route when doing type-5 routing for e.g. a router's mac).
I will fix the test, thanks!
@fujita the evpn tests have been fixed. there was indeed a logic error in my patch due to some place I missed accessing the destinations directly (TableManager directly writing to Table's Destinations without going through table, which is bad isolation). I moved accesses to Destination to Table itself.
The graceful restart tests do fail which I don't understand since I did not touch this part of the code, especially since it's on an address family not affected by the patch (ipv4/6 unicast vs evpn). Those appear to be flaky since when running them locally they don't fail more often than they do. But they still do occasionally.
rebased
Thanks!
The mac mobility code used lookup tables to find the routes to the MAC address. It only considered the MAC itself, and not the associated MAC-VRF. This led to the complexity still being way too high when one MAC is present in all MAC-VRFs, like a virtual router or an anycast gateway.
This patch now considers the MAC-VRF for a route by looking up its route targets, and considers them all if multiple are present. This way, the lookup for routes that will participate in mac-mobility will be limited to the MAC-VRF. This allows to not blow up the CPU when a MAC address is present in a lot of MAC-VRFs.
The biggest behavioral change is now how mac-mobility is sharded. For example, it can now have the same MAC in multiple MAC-VRFs on different PEs. This is quite useful in highly multi-tenant environments where control on the MAC addresses is nigh impossible. This is a net fix of the behavior where MAC-VRFs cannot affect each other. For reference, FRRouting does scope Mac Mobility to the MAC-VRF.
But the main reason this was done is to fix the quadratic complexity that comes when there are many times the same MAC in many different MAC-VRFs, by not needing to iterate through all the routes for other unrelated MAC-VRFs.
A quick benchmark was done by adding type-2 routes through the gRPC API. For 20k routes, on my laptop (Ryzen 5 PRO 4650U):
It is to be noted that, without the patch, the whole route ingestion will be throttled by type-2 route ingestion.
Note: this PR is both a performance and a correctness fix. It is built atop #2804 which is another correctness fix, hence the two commits. Please keep the comments for the first commit in the other pull request