FRRouting / frr

The FRRouting Protocol Suite
https://frrouting.org/

zebra: in L3 BGP-EVPN, when the last route of one family (IPv4 or IPv6) is removed, all routes of the other family become unreachable. #16340

Open crosser opened 4 days ago

crosser commented 4 days ago

Description

When both IPv4 and IPv6 routes are announced over L3 BGP-EVPN and all routes of one family are then withdrawn, the routes of the other family become unreachable from other hosts that run FRR.

Version

All versions, up to master as of this writing.

How to reproduce

Set up an L3 BGP-EVPN between two hosts (e.g. with VNI 16775936). On the first host, install some routes, both IPv4 and IPv6, in the corresponding VRF. On the other host, observe that sudo vtysh -e "show evpn next-hops vni 16775936" shows two entries:

Number of NH Neighbors known for this VNI: 2
IP              RMAC             
10.45.16.27     16:21:36:f9:1e:84
::ffff:a2d:101b 16:21:36:f9:1e:84

and sudo vtysh -e "show evpn rmac vni 16775936" shows one entry:

Number of Remote RMACs known for this VNI: 1
MAC               Remote VTEP          
16:21:36:f9:1e:84 10.45.16.27

Now remove all routes of one family on the first host, leaving some routes of the other family. Observe that on the second host, show evpn next-hops vni 16775936 displays one entry, as it should, while show evpn rmac vni 16775936 displays no entries.

Expected behavior

The RMAC entry should remain present as long as at least one next-hop entry uses this remote VTEP.
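The expected behavior amounts to an invariant between the two tables: every RMAC referenced by a next-hop entry must also appear in the RMAC table. Below is a minimal sketch of a checker for that invariant. It assumes the two-line preamble (count line plus column header) seen in the output quoted in this report. The helper names are hypothetical, and the "after" strings are an assumption about what the second host prints once one family is withdrawn (which family remains does not matter for the check).

```python
def parse_table(output: str) -> list[tuple[str, str]]:
    """Return the two-column data rows of a vtysh table.

    Assumes the first two lines are the count line and the column header,
    as in the output quoted in this report.
    """
    rows = []
    for line in output.strip().splitlines()[2:]:
        fields = line.split()
        if len(fields) == 2:
            rows.append((fields[0], fields[1]))
    return rows


def missing_rmacs(next_hops_output: str, rmac_output: str) -> set[str]:
    """RMACs that a next-hop still references but the RMAC table lacks."""
    needed = {rmac for _ip, rmac in parse_table(next_hops_output)}
    known = {mac for mac, _vtep in parse_table(rmac_output)}
    return needed - known


# Output copied from this report (both families present).
NH_BEFORE = """\
Number of NH Neighbors known for this VNI: 2
IP              RMAC
10.45.16.27     16:21:36:f9:1e:84
::ffff:a2d:101b 16:21:36:f9:1e:84
"""
RMAC_BEFORE = """\
Number of Remote RMACs known for this VNI: 1
MAC               Remote VTEP
16:21:36:f9:1e:84 10.45.16.27
"""

# Assumed shape of the output after one family is withdrawn (illustrative).
NH_AFTER = """\
Number of NH Neighbors known for this VNI: 1
IP              RMAC
10.45.16.27     16:21:36:f9:1e:84
"""
RMAC_AFTER = """\
Number of Remote RMACs known for this VNI: 0
"""

if __name__ == "__main__":
    print("before:", missing_rmacs(NH_BEFORE, RMAC_BEFORE))
    print("after: ", missing_rmacs(NH_AFTER, RMAC_AFTER))
```

An empty result means the invariant holds. A non-empty result names the RMACs that were dropped while still in use.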

Actual behavior

The RMAC entry is removed prematurely, breaking connectivity to the VTEP.

Additional context

In L3 BGP-EVPN, if the VPN contains both IPv4 and IPv6 routes, zebra maintains two instances of struct zebra_neigh: one with the IPv4 address of the nexthop, and another with an IPv6 address that is the same IPv4 address mapped to IPv6. It maintains only one instance of struct zebra_mac, which contains a list of the nexthop addresses that use this MAC.

The code in the zebra_vxlan module treats an empty list as the indication that the zebra_mac object is unused and must be dropped. However, the existing code converts the nexthop address to IPv4 notation before using it as a list element. As a result, when two zebra_neigh objects, one IPv4 and one IPv4-mapped IPv6, are linked to the zebra_mac object, only one element is added to the list. Consequently, when one of the two zebra_neigh objects is dropped, the only element in the list is removed, the list becomes empty, the zebra_mac object is dropped, and the neighbour cache entries are uninstalled from the kernel.

As a result, after the last route in one family is removed from a remote VTEP, all remaining routes in the other family become unreachable, because the RMAC of the VTEP has been removed.
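The mechanism described above can be modelled in a few lines. This is a simplified sketch, not zebra's code: the Rmac class and to_ipv4 helper are hypothetical stand-ins for struct zebra_mac and the IPv4 conversion, and a Python set stands in for the host list. Keying the list by the exact address is shown only as one possible remedy; it is not necessarily what the suggested fix does.

```python
import ipaddress


def to_ipv4(addr: str):
    """Collapse an IPv4-mapped IPv6 address to IPv4 (models the buggy key)."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 6 and ip.ipv4_mapped is not None:
        return ip.ipv4_mapped
    return ip


class Rmac:
    """Hypothetical stand-in for struct zebra_mac and its nexthop list."""

    def __init__(self, normalize):
        self.normalize = normalize  # how a nexthop address becomes a list key
        self.hosts = set()

    def link(self, nexthop: str) -> None:
        self.hosts.add(self.normalize(nexthop))

    def unlink(self, nexthop: str) -> bool:
        """Remove the nexthop; return True if the RMAC now looks unused."""
        self.hosts.discard(self.normalize(nexthop))
        return not self.hosts


V4 = "10.45.16.27"
V6 = "::ffff:a2d:101b"  # the same address, IPv4-mapped

# Buggy keying: both neighbours collapse into one list element.
buggy = Rmac(to_ipv4)
buggy.link(V4)
buggy.link(V6)
buggy_len = len(buggy.hosts)            # 1
buggy_dropped = buggy.unlink(V6)        # True: RMAC dropped while V4 still needs it

# Exact-address keying: each neighbour holds its own element.
exact = Rmac(ipaddress.ip_address)
exact.link(V4)
exact.link(V6)
exact_len = len(exact.hosts)            # 2
exact_dropped_early = exact.unlink(V6)  # False: V4 still references the RMAC
exact_dropped_last = exact.unlink(V4)   # True: the last user is gone
```

With the collapsed key, removing either neighbour empties the list, which matches the premature RMAC removal seen in the reproduction steps.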


crosser commented 4 days ago

Suggested fix: MR https://github.com/FRRouting/frr/pull/16341