FRRouting / frr

EVPN-MH Split Horizon Filters Not Functional #15400

Open zzachattack2 opened 8 months ago

zzachattack2 commented 8 months ago

Description

In my EVPN-MH setup, split-horizon filters are not being applied at the DF: BUM traffic received on a shared segment at the non-DF is flooded to the DF and forwarded back down the same shared segment.

Version

FRRouting 9.1 (R1) on Linux(6.6.16-amd64-vyos).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
configured with:
    '--build=x86_64-linux-gnu' '--prefix=/usr' '--includedir=${prefix}/include' '--mandir=${prefix}/share/man' '--infodir=${prefix}/share/info' '--sysconfdir=/etc' '--localstatedir=/var' '--disable-option-checking' '--disable-silent-rules' '--libdir=${prefix}/lib/x86_64-linux-gnu' '--libexecdir=${prefix}/lib/x86_64-linux-gnu' '--disable-maintainer-mode' '--localstatedir=/var/run/frr' '--sbindir=/usr/lib/frr' '--sysconfdir=/etc/frr' '--with-vtysh-pager=/usr/bin/pager' '--libdir=/usr/lib/x86_64-linux-gnu/frr' '--with-moduledir=/usr/lib/x86_64-linux-gnu/frr/modules' '--disable-dependency-tracking' '--enable-rpki' '--enable-scripting' '--enable-pim6d' '--with-libpam' '--enable-doc' '--enable-doc-html' '--enable-snmp' '--enable-fpm' '--disable-protobuf' '--disable-zeromq' '--enable-ospfapi' '--enable-bgp-vnc' '--enable-multipath=256' '--enable-user=frr' '--enable-group=frr' '--enable-vty-group=frrvty' '--enable-configfile-mask=0640' '--enable-logfile-mask=0640' 'build_alias=x86_64-linux-gnu' 'PYTHON=python3'

How to reproduce

The topology is derived in part from the EVPN-MH topotests.

Topology: EVPN-MH Topology

Network config

#Uplink
ip addr add 10.0.1.3/24 dev eth4

#Bridge
ip link add dev br9 type bridge stp_state 0
ip link set dev br9 type bridge vlan_filtering 1
ip link set dev br9 mtu 9216
ip link set dev br9 type bridge ageing_time 1800
ip link set dev br9 type bridge mcast_snooping 0
ip link set dev br9 type bridge vlan_stats_enabled 1
ip link set dev br9 up
bridge vlan add vid 999 dev br9 self

#ES Bond
ip link add dev bond9 type bond mode 802.3ad
ip link set dev bond9 type bond lacp_rate 1
ip link set dev bond9 type bond miimon 100
ip link set dev bond9 type bond xmit_hash_policy layer3+4
ip link set dev bond9 type bond min_links 1
ip link set dev bond9 type bond ad_actor_system fe:64:00:08:7b:00

ip link set dev eth2 down
ip link set dev eth2 master bond9
ip link set dev eth2 up

ip link set dev bond9 up

ip link set dev bond9 master br9
bridge link set dev bond9 priority 8
bridge vlan del vid 1 dev bond9
bridge vlan del vid 1 untagged pvid dev bond9
bridge vlan add vid 999 dev bond9
bridge vlan add vid 999 untagged pvid dev bond9

#VXLAN
ip link add dev vx-999 type vxlan id 999 dstport 4789
ip link set dev vx-999 type vxlan nolearning
ip link set dev vx-999 type vxlan local 10.0.1.3
ip link set dev vx-999 type vxlan ttl 64
ip link set dev vx-999 mtu 9152
ip link set dev vx-999 up

ip link set dev vx-999 master br9
bridge link set dev vx-999 neigh_suppress on
bridge link set dev vx-999 learning off
bridge link set dev vx-999 priority 8
bridge vlan del vid 1 dev vx-999
bridge vlan del vid 1 untagged pvid dev vx-999
bridge vlan add vid 999 dev vx-999
bridge vlan add vid 999 untagged pvid dev vx-999

#SVI
ip link add link br9 name vlan999 type vlan id 999 protocol 802.1q
ip addr add 10.99.0.3/24 dev vlan999
ip link set dev vlan999 up
ip link add link vlan999 name vlan999-v0 type macvlan mode private
ip link set dev vlan999-v0 address 00:00:5e:00:01:99
ip link set dev vlan999-v0 up

ip addr add 10.99.0.1/24 dev vlan999-v0

FRR Config:

frr version 9.1
frr defaults traditional
hostname R1
no ipv6 forwarding
service integrated-vtysh-config
!
interface bond9
 evpn mh es-df-pref 20000
 evpn mh es-id 9
 evpn mh es-sys-mac fe:64:00:08:7b:00
exit
!
interface eth4
 evpn mh uplink
exit
!
router bgp 65000
 bgp router-id 10.0.0.3
 bgp log-neighbor-changes
 no bgp ebgp-requires-policy
 no bgp default ipv4-unicast
 no bgp network import-check
 timers bgp 10 30
 neighbor 10.0.1.4 remote-as 65000
 neighbor 10.0.1.4 update-source 10.0.1.3
 !
 address-family l2vpn evpn
  neighbor 10.0.1.4 activate
  neighbor 10.0.1.4 soft-reconfiguration inbound
  advertise-all-vni
  advertise-svi-ip
 exit-address-family
exit

Network state:

vyos@R1:/$ bridge -d link
34: bond9: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 master br9 state forwarding priority 8 cost 4 
    hairpin off guard off root_block off fastleave off learning on flood on mcast_flood on bcast_flood on mcast_router 1 mcast_to_unicast off neigh_suppress off neigh_vlan_suppress off vlan_tunnel off isolated off locked off mab off mcast_n_groups 0 mcast_max_groups 0 
35: vx-999: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9152 master br9 state forwarding priority 8 cost 100 
    hairpin off guard off root_block off fastleave off learning off flood on mcast_flood on bcast_flood on mcast_router 1 mcast_to_unicast off neigh_suppress on neigh_vlan_suppress off vlan_tunnel off isolated off locked off mab off mcast_n_groups 0 mcast_max_groups 0 
vyos@R1:/$ bridge -d vlan
port              vlan-id  
br9               1 PVID Egress Untagged
                    state forwarding mcast_router 1 neigh_suppress off 
                  999
                    state forwarding mcast_router 1 neigh_suppress off 
bond9             999 PVID Egress Untagged
                    state forwarding mcast_router 1 neigh_suppress off 
vx-999            999 PVID Egress Untagged
                    state forwarding mcast_router 1 neigh_suppress off 
vyos@R1:/$ ip -d link
3: eth2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond9 state UP mode DEFAULT group default qlen 1000
    link/ether ea:55:80:c6:ed:81 brd ff:ff:ff:ff:ff:ff permaddr 00:f0:cb:fe:c3:a3 promiscuity 1 allmulti 1 minmtu 68 maxmtu 9216 
    bond_slave state ACTIVE mii_status UP link_failure_count 3 perm_hwaddr 00:f0:cb:fe:c3:a3 queue_id 0 prio 0 ad_aggregator_id 1 ad_actor_oper_port_state 79 ad_actor_oper_port_state_str <active,short_timeout,aggregating,in_sync,defaulted> ad_partner_oper_port_state 1 ad_partner_oper_port_state_str <active> numtxqueues 4 numrxqueues 4 gso_max_size 65536 gso_max_segs 65535 tso_max_size 65536 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536 parentbus pci parentdev 0000:02:00.0 
    altname enp2s0
4: eth4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9152 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:f0:cb:fe:c0:a7 brd ff:ff:ff:ff:ff:ff promiscuity 0 allmulti 0 minmtu 68 maxmtu 9710 numtxqueues 71 numrxqueues 71 gso_max_size 65536 gso_max_segs 65535 tso_max_size 65536 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536 parentbus pci parentdev 0000:04:00.0 
    altname enp4s0f0
33: br9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9216 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 8a:a0:59:68:91:f5 brd ff:ff:ff:ff:ff:ff promiscuity 1 allmulti 0 minmtu 68 maxmtu 65535 
    bridge forward_delay 1500 hello_time 200 max_age 2000 ageing_time 1800 stp_state 0 priority 32768 vlan_filtering 1 vlan_protocol 802.1Q bridge_id 8000.8a:a0:59:68:91:f5 designated_root 8000.8a:a0:59:68:91:f5 root_port 0 root_path_cost 0 topology_change 0 topology_change_detected 0 hello_timer    0.00 tcn_timer    0.00 topology_change_timer    0.00 gc_timer    6.36 vlan_default_pvid 1 vlan_stats_enabled 1 vlan_stats_per_port 0 group_fwd_mask 0 group_address 01:80:c2:00:00:00 mcast_snooping 0 no_linklocal_learn 0 mcast_vlan_snooping 0 mcast_router 1 mcast_query_use_ifaddr 0 mcast_querier 0 mcast_hash_elasticity 16 mcast_hash_max 4096 mcast_last_member_count 2 mcast_startup_query_count 2 mcast_last_member_interval 100 mcast_membership_interval 26000 mcast_querier_interval 25500 mcast_query_interval 12500 mcast_query_response_interval 1000 mcast_startup_query_interval 3125 mcast_stats_enabled 0 mcast_igmp_version 2 mcast_mld_version 1 nf_call_iptables 0 nf_call_ip6tables 0 nf_call_arptables 0 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 tso_max_size 65536 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536 
34: bond9: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue master br9 state UP mode DEFAULT group default qlen 1000
    link/ether ea:55:80:c6:ed:81 brd ff:ff:ff:ff:ff:ff promiscuity 1 allmulti 1 minmtu 68 maxmtu 65535 
    bond mode 802.3ad miimon 100 updelay 0 downdelay 0 peer_notify_delay 0 use_carrier 1 arp_interval 0 arp_missed_max 2 arp_validate none arp_all_targets any primary_reselect always fail_over_mac none xmit_hash_policy layer3+4 resend_igmp 1 num_grat_arp 1 all_slaves_active 0 min_links 1 lp_interval 1 packets_per_slave 1 lacp_active on lacp_rate fast ad_select stable ad_aggregator 1 ad_num_ports 1 ad_actor_key 11 ad_partner_key 1 ad_partner_mac 00:00:00:00:00:00 tlb_dynamic_lb 1 
    bridge_slave state forwarding priority 8 cost 4 hairpin off guard off root_block off fastleave off learning on flood on port_id 0x2001 port_no 0x1 designated_port 8193 designated_cost 0 designated_bridge 8000.8a:a0:59:68:91:f5 designated_root 8000.8a:a0:59:68:91:f5 hold_timer    0.00 message_age_timer    0.00 forward_delay_timer    0.00 topology_change_ack 0 config_pending 0 proxy_arp off proxy_arp_wifi off mcast_router 1 mcast_fast_leave off mcast_flood on bcast_flood on mcast_to_unicast off neigh_suppress off neigh_vlan_suppress off group_fwd_mask 0 group_fwd_mask_str 0x0 vlan_tunnel off isolated off locked off mab off numtxqueues 16 numrxqueues 16 gso_max_size 65536 gso_max_segs 65535 tso_max_size 65536 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536 
35: vx-999: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9152 qdisc noqueue master br9 state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether a6:73:20:51:91:c0 brd ff:ff:ff:ff:ff:ff promiscuity 1 allmulti 1 minmtu 68 maxmtu 65535 
    vxlan id 999 local 10.0.1.3 srcport 0 0 dstport 4789 ttl 64 ageing 300 nolearning 
    bridge_slave state forwarding priority 8 cost 100 hairpin off guard off root_block off fastleave off learning off flood on port_id 0x2002 port_no 0x2 designated_port 8194 designated_cost 0 designated_bridge 8000.8a:a0:59:68:91:f5 designated_root 8000.8a:a0:59:68:91:f5 hold_timer    0.00 message_age_timer    0.00 forward_delay_timer    0.00 topology_change_ack 0 config_pending 0 proxy_arp off proxy_arp_wifi off mcast_router 1 mcast_fast_leave off mcast_flood on bcast_flood on mcast_to_unicast off neigh_suppress on neigh_vlan_suppress off group_fwd_mask 0 group_fwd_mask_str 0x0 vlan_tunnel off isolated off locked off mab off numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 tso_max_size 65536 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536 
36: vlan999@br9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9216 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 8a:a0:59:68:91:f5 brd ff:ff:ff:ff:ff:ff promiscuity 0 allmulti 0 minmtu 0 maxmtu 65535 
    vlan protocol 802.1Q id 999 <REORDER_HDR> numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 tso_max_size 65536 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536 
37: vlan999-v0@vlan999: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9216 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 00:00:5e:00:01:99 brd ff:ff:ff:ff:ff:ff promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535 
    macvlan mode private bcqueuelen 1000 usedbcqueuelen 1000 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 tso_max_size 65536 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536 
R1# show evpn es detail 
ESI: 03:fe:64:00:08:7b:00:00:00:09
 Type: Local,Remote
 Interface: bond9
 State: up
 Bridge port: yes
 Ready for BGP: yes
 VNI Count: 1
 MAC Count: 12
 DF status: df 
 DF preference: 20000
 Nexthop group: 536870913
 VTEPs:
     10.0.1.4 df_alg: preference df_pref: 10000 nh: 268435458

R1# show evpn vni 999
VNI: 999
 Type: L2
 Vlan: 999
 Bridge: br9
 Tenant VRF: default
 VxLAN interface: vx-999
 VxLAN ifIndex: 35
 SVI interface: vlan999
 SVI ifIndex: 36
 Local VTEP IP: 10.0.1.3
 Mcast group: 0.0.0.0
 Remote VTEPs for this VNI:
  10.0.1.4 flood: HER
 Number of MACs (local and remote) known for this VNI: 13
 Number of ARPs (IPv4 and IPv6, local and remote) known for this VNI: 4
 Advertise-gw-macip: No
 Advertise-svi-macip: No
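
As a cross-check of the `show evpn es detail` output above, here is a minimal Python sketch of how the ESI and DF status can be reproduced from the configuration. It assumes the ESI is a type-3 ESI (one type byte `0x03`, the 6-byte `es-sys-mac`, then `es-id` as a 3-byte local discriminator) and that DF election picks the highest `es-df-pref`. The tie-break on lowest VTEP IP is an assumption and is not exercised by this setup. The function names are hypothetical and are not FRR APIs.

```python
import ipaddress


def type3_esi(sys_mac: str, local_disc: int) -> str:
    """Build a type-3 ESI string: 0x03 + 6-byte system MAC + 3-byte discriminator."""
    mac = bytes.fromhex(sys_mac.replace(":", ""))
    if len(mac) != 6 or not 0 < local_disc < 2**24:
        raise ValueError("need a 6-byte MAC and a non-zero 24-bit discriminator")
    raw = b"\x03" + mac + local_disc.to_bytes(3, "big")
    return ":".join(f"{b:02x}" for b in raw)


def elect_df(candidates: dict) -> str:
    """Return the VTEP IP with the highest preference.

    Ties fall back to the numerically lowest IP (assumed tie-break).
    """
    return min(
        candidates,
        key=lambda ip: (-candidates[ip], int(ipaddress.ip_address(ip))),
    )


# Values taken from the bond9 config and the VTEP list shown above.
print(type3_esi("fe:64:00:08:7b:00", 9))  # 03:fe:64:00:08:7b:00:00:00:09
print(elect_df({"10.0.1.3": 20000, "10.0.1.4": 10000}))  # 10.0.1.3
```

Both results agree with what R1 reports: the ESI matches, and R1 (10.0.1.3) is the DF for the segment.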

BUM Traffic received at R2 (non-DF) is flooded to R1 (DF), and then forwarded back down the shared segment:

vyos@R2:/$ tcpdump -ennli eth2 inbound 
14:08:27.179443 d8:ec:e5:9d:17:a0 > ff:ff:ff:ff:ff:ff, ethertype Realtek protocols (0x8899), length 60: d8:ec:e5:9d:17:a0 > ff:ff:ff:ff:ff:ff, Realtek unknown type 0x25
14:08:27.975596 14:cb:19:78:8c:80 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 344: 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 14:cb:19:78:8c:80, length 302
14:08:27.983774 14:cb:19:78:8c:80 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 386: 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 14:cb:19:78:8c:80, length 344

vyos@R1:/$ tcpdump -ennli eth2 outbound 
14:08:26.876979 d8:ec:e5:9d:17:a0 > ff:ff:ff:ff:ff:ff, ethertype Realtek protocols (0x8899), length 60: d8:ec:e5:9d:17:a0 > ff:ff:ff:ff:ff:ff, Realtek unknown type 0x25
14:08:27.673456 14:cb:19:78:8c:80 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 344: 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 14:cb:19:78:8c:80, length 302
14:08:27.681135 14:cb:19:78:8c:80 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 386: 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 14:cb:19:78:8c:80, length 344
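
The two captures can also be matched mechanically. The following is a rough Python sketch, not part of any FRR tooling, that parses `tcpdump -e` lines and reports broadcast/multicast frames that appear both inbound on the non-DF and outbound on the DF. Timestamps are ignored because the two routers' clocks need not agree. The regular expression and helper names are my own assumptions about the output format shown above, and only the first two frames of each capture are included to keep it short.

```python
import re
from collections import Counter

# Matches the link-level header printed by `tcpdump -e`, as seen in the captures above.
FRAME = re.compile(
    r"^\S+ (?P<src>[0-9a-f:]{17}) > (?P<dst>[0-9a-f:]{17}), "
    r"ethertype .*?\((?P<etype>0x[0-9a-f]{4})\), length (?P<length>\d+): (?P<rest>.*)$"
)

R2_INBOUND = """
14:08:27.179443 d8:ec:e5:9d:17:a0 > ff:ff:ff:ff:ff:ff, ethertype Realtek protocols (0x8899), length 60: d8:ec:e5:9d:17:a0 > ff:ff:ff:ff:ff:ff, Realtek unknown type 0x25
14:08:27.975596 14:cb:19:78:8c:80 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 344: 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 14:cb:19:78:8c:80, length 302
"""

R1_OUTBOUND = """
14:08:26.876979 d8:ec:e5:9d:17:a0 > ff:ff:ff:ff:ff:ff, ethertype Realtek protocols (0x8899), length 60: d8:ec:e5:9d:17:a0 > ff:ff:ff:ff:ff:ff, Realtek unknown type 0x25
14:08:27.673456 14:cb:19:78:8c:80 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 344: 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 14:cb:19:78:8c:80, length 302
"""


def bum_frames(capture: str) -> Counter:
    """Count broadcast/multicast frames, keyed on everything except the timestamp."""
    frames = Counter()
    for line in capture.splitlines():
        m = FRAME.match(line.strip())
        # Group bit set in the first octet of the destination MAC: broadcast or multicast.
        if m and int(m.group("dst")[:2], 16) & 1:
            frames[m.group("src", "dst", "etype", "length", "rest")] += 1
    return frames


def looped(rx_non_df: str, tx_df: str) -> Counter:
    """Frames seen inbound at the non-DF that also left the DF on the shared segment."""
    return bum_frames(rx_non_df) & bum_frames(tx_df)


print(sum(looped(R2_INBOUND, R1_OUTBOUND).values()))  # 2
```

A non-zero count means frames that entered R2 from the shared segment were sent back onto the same segment by R1. Unknown-unicast frames cannot be identified from the capture alone, so this only covers the broadcast and multicast part of BUM.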

Expected behavior

Split-horizon filters should prevent traffic flooded from the non-DF VTEP on a shared segment from being forwarded back down the shared segment at the DF VTEP. The codebase indicates the SPH filters are implemented in the data-plane, but it is unclear to me how exactly they are implemented. Are the filters supposed to work with the kernel as the data-plane?
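
To make the expectation concrete, here is a small Python model of the egress decision I would expect for BUM traffic decapsulated from the overlay. It is only a sketch of the behaviour described in the documentation (non-DF filter plus split-horizon with local bias), not of how FRR or the kernel actually implements it. `LocalES` and `forward_to_access` are hypothetical names, and 10.0.1.5 is a made-up VTEP that does not share the segment.

```python
from dataclasses import dataclass, field


@dataclass
class LocalES:
    """A locally attached Ethernet segment and what is known about it."""
    name: str
    is_df: bool
    peer_vteps: set = field(default_factory=set)  # VTEPs attached to the same ES


def forward_to_access(es: LocalES, src_vtep: str) -> bool:
    """May decapsulated BUM traffic from src_vtep be sent out of this ES?"""
    if src_vtep in es.peer_vteps:
        # Split-horizon with local bias: the ES peer already delivered locally.
        return False
    # Non-DF filter: only the DF forwards overlay BUM traffic to the access port.
    return es.is_df


# R1's view of the shared segment, taken from `show evpn es detail`.
bond9 = LocalES("bond9", is_df=True, peer_vteps={"10.0.1.4"})

print(forward_to_access(bond9, "10.0.1.4"))  # False: from the ES peer R2
print(forward_to_access(bond9, "10.0.1.5"))  # True: hypothetical VTEP not on this ES
```

Under this model R1 should drop everything it receives from 10.0.1.4 before it reaches bond9, even though R1 is the DF.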

Actual behavior

BUM traffic received at R2 (non-DF) is flooded to R1 (DF) and then forwarded back down the shared segment, as shown by the R2 inbound and R1 outbound captures on eth2 under Network state above.

Additional context

Seems related to discussion towards end of https://github.com/FRRouting/frr/discussions/11487

zzachattack2 commented 5 months ago

Can anyone address this? If EVPN-MH support in FRR is incomplete, that is understandable. What is not understandable is FRR pretending it is complete in the documentation. Ideally, the documentation should be prefaced with a note that EVPN-MH in FRR is only partially complete, and that Cumulus or another system is necessary for full functionality. But at the very minimum, the following paragraphs should be removed from the documentation, given that the statements are completely false.

"BUM traffic is rxed via the overlay by all PEs attached to a server but only the DF can forward the de-capsulated traffic to the access port. To accommodate that non-DF filters are installed in the dataplane to drop the traffic.

Similarly traffic received from ES peers via the overlay cannot be forwarded to the server. This is split-horizon-filtering with local bias."