FRRouting / frr

The FRRouting Protocol Suite
https://frrouting.org/

ZEBRA local MAC learning problem with bridge VLAN filtering #1771

Closed auranext closed 3 years ago

auranext commented 6 years ago

Hello,

In my VXLAN lab, which consists of 2 servers, I use BGP/EVPN to keep the L2 tables consistent. Each server correctly learns remote and local MACs. When I enable VLAN filtering on a bridge linked to a VXLAN, Zebra is no longer able to learn local MACs and returns: could not find VNI

I think the Netlink message is the same whether VLAN filtering is enabled or disabled on the bridge.

Here are the Zebra logs:

/var/log/frr/frr.log.9.gz:Feb 13 06:23:11 HP-GEN9-VXLAN-KVM-deb9 zebra[1316]: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWNEIGH(28), len=76, seq=0, pid=0
/var/log/frr/frr.log.9.gz:Feb 13 06:23:11 HP-GEN9-VXLAN-KVM-deb9 zebra[1316]: Rx RTM_NEWNEIGH family bridge IF vnet0(12) VLAN 100 MAC 52:54:00:7c:d9:9e
/var/log/frr/frr.log.9.gz:Feb 13 06:23:11 HP-GEN9-VXLAN-KVM-deb9 zebra[1316]: Add/Update MAC 52:54:00:7c:d9:9e intf vnet0(12) VID 100, could not find VNI

FRR compiled on 2018-01-24 from master branch

Is there a parameter I forgot ? or a bug ?

thx

donaldsharp commented 6 years ago

@mkanjari Can you take a look at this?

mkanjari commented 6 years ago

@auranext could you please enable logs (debug zebra kernel, debug zebra vxlan) and share it for further debugging ?

auranext commented 6 years ago

Hello,

I finally reproduced the problem and defined it more clearly. When a MAC address appears in the bridge, the netlink message differs according to the bridge mode used (VLAN-unaware or VLAN-aware).

In VLAN-unaware mode the message is as follows:
52:54:00:88:88:88 dev vk61 master br8888
Zebra works fine: the MAC is communicated to BGPD and transmitted to BGP neighbors.

In VLAN-aware mode the message is as follows:
52:54:00:88:88:88 dev vk61 vlan 1 master br8888
Zebra returns "could not find VNI" and does nothing else. The MAC is deleted after the bridge ageout timer expires. Changing the VLAN ID has no effect.

So it appears that Zebra does not know how to handle netlink messages containing a VLAN ID.

The log file is attached (debug zebra kernel & vxlan enabled):

frr.log.14.gz

Zebra logs when testing aware/unaware bridge mode

ZEBRA logs in vlan unaware mode:
Mar 1 16:55:11 HP-GEN9-VXLAN-KVM-deb9 zebra[9655]: Rx RTM_NEWNEIGH family bridge IF vk61(27) MAC 52:54:00:88:88:88
Mar 1 16:55:11 HP-GEN9-VXLAN-KVM-deb9 zebra[9655]: Add/Update MAC 52:54:00:88:88:88 intf vk61(27) VID 0 -> VNI 8888
Mar 1 16:55:11 HP-GEN9-VXLAN-KVM-deb9 zebra[9655]: neigh 10.0.0.40 (MAC 52:54:00:88:88:88) on L2-VNI 8888 is now ACTIVE
Mar 1 16:55:11 HP-GEN9-VXLAN-KVM-deb9 zebra[9655]: Send MACIP Add flags 0x0 MAC 52:54:00:88:88:88 IP 10.0.0.40 L2-VNI 8888 to bgp
Mar 1 16:55:11 HP-GEN9-VXLAN-KVM-deb9 zebra[9655]: Send MACIP Add flags 0x0 MAC 52:54:00:88:88:88 IP L2-VNI 8888 to bgp
Mar 1 16:55:11 HP-GEN9-VXLAN-KVM-deb9 bgpd[9662]: 0:Recv MACIP Add flags 0x0 MAC 52:54:00:88:88:88 IP 10.0.0.40 VNI 8888
Mar 1 16:55:11 HP-GEN9-VXLAN-KVM-deb9 bgpd[9662]: 0:Recv MACIP Add flags 0x0 MAC 52:54:00:88:88:88 IP VNI 8888
Mar 1 16:55:11 HP-GEN9-VXLAN-KVM-deb9 bgpd[9662]: u1:s1 send UPDATE RD 1:8888 [2]:[52:54:00:88:88:88]:[10.0.0.40]/224 label 555 l2vpn evpn
Mar 1 16:55:11 HP-GEN9-VXLAN-KVM-deb9 bgpd[9662]: u1:s1 send UPDATE RD 1:8888 [2]:[52:54:00:88:88:88]/224 label 555 l2vpn evpn

ZEBRA logs in vlan aware mode:
Mar 1 16:57:09 HP-GEN9-VXLAN-KVM-deb9 zebra[9655]: Rx RTM_NEWNEIGH family bridge IF vk61(28) VLAN 1 MAC 52:54:00:88:88:88
Mar 1 16:57:09 HP-GEN9-VXLAN-KVM-deb9 zebra[9655]: Add/Update MAC 52:54:00:88:88:88 intf vk61(28) VID 1, could not find VNI
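
When comparing the two modes across a large log file, it can help to pull out just the "Add/Update MAC" lines and see which ones resolved to a VNI. The following is a minimal sketch that assumes the line format is exactly the one shown in the logs above; the function name `parse_mac_events` and the regular expression are illustrative and not part of FRR.

```python
import re

# Matches zebra's "Add/Update MAC" debug line in the two forms seen above:
# "... VID 0 -> VNI 8888" and "... VID 1, could not find VNI".
MAC_LINE = re.compile(
    r"Add/Update MAC (?P<mac>[0-9a-f:]{17}) "
    r"intf (?P<ifname>\S+)\((?P<ifindex>\d+)\) "
    r"VID (?P<vid>\d+)"
    r"(?: -> VNI (?P<vni>\d+)|, could not find VNI)"
)


def parse_mac_events(lines):
    """Return one dict per matching line; vni is None when no VNI was found."""
    events = []
    for line in lines:
        m = MAC_LINE.search(line)
        if m is None:
            continue  # not an Add/Update MAC line
        events.append({
            "mac": m.group("mac"),
            "ifname": m.group("ifname"),
            "vid": int(m.group("vid")),
            "vni": int(m.group("vni")) if m.group("vni") else None,
        })
    return events


sample = [
    "Mar 1 16:55:11 HP-GEN9-VXLAN-KVM-deb9 zebra[9655]: Add/Update MAC 52:54:00:88:88:88 intf vk61(27) VID 0 -> VNI 8888",
    "Mar 1 16:57:09 HP-GEN9-VXLAN-KVM-deb9 zebra[9655]: Add/Update MAC 52:54:00:88:88:88 intf vk61(28) VID 1, could not find VNI",
]
for event in parse_mac_events(sample):
    print(event)
```

Events with `vni` set to `None` are the ones zebra could not map, which makes it easy to list the affected (interface, VID) pairs.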

auranext commented 6 years ago

@mkanjari with these new elements could you have a look ? thank you

auranext commented 6 years ago

@donaldsharp @mkanjari
Hello Mitesh, I'm sorry to insist, but have you forgotten about me? Would you mind taking a moment to answer? I don't know if this is a malfunction or a misuse, but it is blocking me somewhat. Thank you.

mkanjari commented 6 years ago

@auranext sorry for the late reply. I was caught up with some other deliverables. I will take a look at this today and update.

auranext commented 6 years ago

@mkanjari hello, Mitesh , have you made a diagnosis ?

auranext commented 6 years ago

@mkanjari @donaldsharp
hello, Mitesh , please could you take a moment on this case ? or delegate it ?

mkanjari commented 6 years ago

@auranext: Is there a way I can access the setup? I have a feeling that something is missing in the way the setup is configured. If you cannot share the setup, can you please share the configs you are using? The output of 'show evpn vni' would also be helpful.

The error 'could not find VNI' shows up when the (port,vlan) on which the mac was learned is not mapped to a vxlan VNI.
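
The condition described above can be pictured as a lookup keyed on the bridge and the VLAN of the learned MAC. The sketch below is a toy model, not FRR code: `VxlanPort`, `find_vni`, and the `access_vlan` field are hypothetical names, and `access_vlan=0` is an assumption chosen only to reproduce the two results in the logs above (VID 0 maps to VNI 8888, VID 1 does not). The thread does not show what zebra actually recorded for the VXLAN port.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VxlanPort:
    ifname: str
    vni: int
    bridge: str
    access_vlan: int  # 0 models "no VLAN" (assumption, see lead-in)


def find_vni(ports, bridge, vid):
    """Return the VNI of the VXLAN port on `bridge` whose access VLAN is `vid`."""
    for port in ports:
        if port.bridge == bridge and port.access_vlan == vid:
            return port.vni
    return None  # corresponds to "could not find VNI"


# Hypothetical state for the reporter's bridge br8888.
ports = [VxlanPort("vxlan8888", 8888, "br8888", 0)]

print(find_vni(ports, "br8888", 0))  # MAC learned without a VLAN
print(find_vni(ports, "br8888", 1))  # MAC learned on VLAN 1
```

In this model the second lookup fails simply because no VXLAN port is associated with VLAN 1, which is one way the reported message could arise.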

auranext commented 6 years ago

@mkanjari,

The setup consists of 6 VTEP routers.

The goal is to extend VXLAN (L2 forwarding only) between our datacenters, with VXLAN-aware KVM hypervisors.

Everything works fine with the EVPN control plane synced on all VTEPs.

HP-GEN9-VXLAN-KVM-deb9# show evpn vni
Advertise gateway mac-ip: No
Number of VNIs: 2
VNI        VxLAN IF              VTEP IP         # MACs   # ARPs   # Remote VTEPs  VRF                                  
9999       vxlan9999             66.6.6.6        1        1        2               Default-IP-Routing-Table             
8888       vxlan8888             66.6.6.6        5        3        4               Default-IP-Routing-Table             
HP-GEN9-VXLAN-KVM-deb9# show evpn vni json 
{
  "advertiseGatewayMacip":"No",
  "numVnis":2,
  "9999":{
    "vxlanIf":"vxlan9999",
    "vtepIp":"66.6.6.6",
    "numMacs":1,
    "numArpNd":1,
    "numRemoteVteps":2,
    "remoteVteps":[
      "55.5.5.5",
      "51.2.1.1"
    ]
  },
  "8888":{
    "vxlanIf":"vxlan8888",
    "vtepIp":"66.6.6.6",
    "numMacs":5,
    "numArpNd":3,
    "numRemoteVteps":4,
    "remoteVteps":[
      "55.5.5.5",
      "53.1.1.1",
      "50.1.1.2",
      "51.2.1.1"
    ]
  }
}

tell me if you want more config files

FRR config :

HP-GEN9-VXLAN-KVM-deb9# show running-config 
Current configuration:
!
frr version 3.1-dev
frr defaults traditional
hostname HP-GEN9-VXLAN-KVM-deb9
no ipv6 forwarding
username auranext nopassword
!
service integrated-vtysh-config
!
debug zebra events
debug zebra packet
debug zebra kernel
debug zebra fpm
debug zebra vxlan
debug bgp neighbor-events
debug bgp updates in
debug bgp updates out
debug bgp zebra
!
log syslog
!
interface eth4
 ip ospf network point-to-point
!
interface eth5
 ip ospf network point-to-point
!
router bgp 400
 bgp router-id 66.6.6.6
 no bgp default ipv4-unicast
 coalesce-time 1000
 neighbor fabric peer-group
 neighbor fabric remote-as 400
 neighbor fabric update-source 66.6.6.6
 neighbor 51.2.0.1 peer-group fabric
 neighbor 51.2.0.2 peer-group fabric
 !
 address-family l2vpn evpn
  neighbor fabric activate
  vni 9999
   rd 1:9999
  exit-vni
  vni 2
   rd 1:2
  exit-vni
  vni 8888
   rd 1:8888
  exit-vni
  advertise-all-vni
 exit-address-family
!
router ospf
 ospf router-id 66.6.6.6
 log-adjacency-changes detail
 network 66.6.6.6/32 area 0
!
line vty
!
end

PROCESSING: VMs are bound to a bridge. When a VM starts, the KVM network driver creates a (paired) vnet interface and assigns it to the bridge. Each bridge contains a single VXLAN interface.

bridge name     bridge id               STP enabled     interfaces
br8888          8000.4e873ec58ead       no              vk2
                                                        vk3
                                                        vk666
                                                        vxlan8888
br9999          8000.f2e0dabd8eee       no              vk2b
                                                        vxlan9999

With VLAN filtering disabled, everything works fine: the VM MAC is propagated to all VTEPs participating in the respective VXLAN.

With VLAN filtering enabled, Zebra returns "could not find VNI".

root@HP-GEN9-VXLAN-KVM-deb9:~# bridge vlan
port         vlan ids
vxlan8888    4090
vxlan9999    4090
br8888       None
br9999       None
vk3          1 PVID Egress Untagged
vk2          1 PVID Egress Untagged
vk2b         4090
vk666        4090

When a VM starts, I only use bridge vlan commands (the VM is set to send tagged frames). Could you tell me what MAC+VLAN/VXLAN mapping configuration Zebra is expecting?

mkanjari commented 6 years ago

I think this is the problem:
vxlan8888 4090
vxlan9999 4090

You can't have two different Vxlan devices pointing to the same vlan.

Can you also share the /etc/network/interfaces config ?

auranext commented 6 years ago

@mkanjari

"You can't have two different Vxlan devices pointing to the same vlan."

I have tested this, and it causes no problem with VLAN filtering disabled.

The problem occurs with a single guest interface declared on a single bridge, with a single VXLAN and a single VLAN.

The /etc/network/interfaces file:

# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

source /etc/network/interfaces.d/*

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
allow-hotplug eth0

allow-hotplug eth1
iface eth1 inet manual

allow-hotplug eth2
iface eth2 inet manual

allow-hotplug eth3
iface eth3 inet manual

allow-hotplug eth4
iface eth4 inet manual

allow-hotplug eth5
iface eth5 inet manual

auto eth0
iface eth0 inet static
        address 172.31.242.157/20
        gateway 172.31.240.253
        # dns-* options are implemented by the resolvconf package, if installed
        dns-nameservers 8.8.8.8
        #pre-up /sbin/ethtool --offload eth0 gso off tso off sg off gro off lro off || true

# VTEP tunnel in manual mode
auto dummy0
iface dummy0 inet static
        address 66.6.6.6
        netmask 255.255.255.255
        mtu 9000
        # ADD multipath routing hash based on L4 (kernel >= 4.12)
        #pre-up /sbin/ethtool --offload dummy0 gso off tso off sg off gro off lro off || true
        #pre-up /sbin/ethtool --offload eth5 gso off tso off sg off gro off lro off || true
        #pre-up /sbin/ethtool --offload eth4 gso off tso off sg off gro off lro off || true
        pre-up echo 1 > /proc/sys/net/ipv4/fib_multipath_hash_policy || true
        pre-up for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor ; do echo performance > $i ; done 2>&1    
        pre-up ip link add dummy0 type dummy || true
        pre-up ip link set eth4 address 98:f2:b3:30:54:68 || true
        pre-up ip link set eth5 address 98:f2:b3:30:54:68 || true
        pre-up ip link set eth4 up || true
        pre-up ip link set eth5 up || true
        pre-up ifconfig eth4 mtu 9000 || true
        pre-up ifconfig eth5 mtu 9000 || true
        up ip a add 66.6.6.6/32 dev eth4 || true
        up ip a add 66.6.6.6/32 dev eth5 || true
        #up ip route add 51.2.0.2/32 dev eth4 src 66.6.6.6 || true
        #up ip route add 51.2.1.1/32 via 51.2.0.2 dev eth4 src 66.6.6.6 || true
        #up ip route add 50.1.1.2/32 via 51.2.0.2 dev eth4 src 66.6.6.6 || true
        up ip link add vxlan8888 type vxlan id 8888 dstport 4789 local 66.6.6.6 nolearning dev dummy0 || true
        #up ip link add vxlan8888 type vxlan id 8888 dstport 4789 local 66.6.6.6 nolearning dev dummy0 || true
        up ip link add vxlan9999 type vxlan id 9999 dstport 4789 local 66.6.6.6 nolearning dev dummy0 || true
        #pre-up /sbin/ethtool --offload vxlan9999 gso off tso off sg off gro off lro off || true
                up brctl addbr br8888 || true
                #pre-up /sbin/ethtool --offload br8888 gso off tso off sg off gro off lro off || true
                up brctl addbr br9999 || true
                #pre-up /sbin/ethtool --offload br9999 gso off tso off sg off gro off lro off || true
                up echo 1 > /sys/class/net/br8888/bridge/vlan_filtering || true
                up echo 1 > /sys/class/net/br9999/bridge/vlan_filtering || true
                up brctl addif br8888 vxlan8888 || true
                up brctl addif br9999 vxlan9999 || true
                up brctl stp br8888 off || true
                up brctl stp br9999 off || true
                up ip link set br8888 up || true
                up ip link set br9999 up || true
                up ip link set vxlan8888 up || true
                up ip link set vxlan9999 up || true
                up bridge vlan del dev br8888 vid 1 pvid untagged self || true
                up bridge vlan del dev br9999 vid 1 pvid untagged self || true
                up bridge vlan del dev vxlan8888 vid 1 pvid untagged || true
                up bridge vlan del dev vxlan9999 vid 1 pvid untagged || true
                up bridge vlan add dev vxlan8888 vid 100 || true
                up bridge vlan add dev vxlan8888 vid 4090 || true
                up bridge vlan add dev vxlan9999 vid 4090 || true
        up echo 1 > /proc/sys/net/ipv6/conf/all/disable_ipv6 || true
        up echo 1 > /proc/sys/net/ipv4/ip_forward || true
        # [ unnecessary if vlan_filtering=1] We dont want the kernel can respond to ARP requests with addresses from other interfaces.
        #up sysctl -w net.ipv4.conf.all.arp_filter=1 || true
        #down ip route del 51.2.0.2/32 src 66.6.6.6 dev eth4 || true
        #down ip route del 51.2.1.1/32 via 51.2.0.2 dev eth4 || true
        #down ip l set br8888 down && brctl delbr br8888 || true

auto br8888.100
iface br8888.100 inet static
        address 10.0.0.124
        netmask 255.255.255.0

auto br8888.4090
iface br8888.4090 inet static
        address 172.16.0.254
        netmask 255.255.0.0

auto br9999.4090
iface br9999.4090 inet static
        address 192.168.16.254
        netmask 255.255.255.0

auranext commented 6 years ago

@mkanjari

VXLAN on Linux seems perfectly functional here, and this is the traditional way of using VXLAN on Linux (one VXLAN device per bridge).

  1. I add the VID to the filter table

    bridge vlan add dev vk666 vid 4090
  2. I declare a MAC in bridge 8888 (refused if the previous filter is not set)

    bridge fdb add 52:54:00:88:88:89 dev vk666 vlan 4090 master vni 8888 dynamic

  3. MAC is added in fdb

    52:54:00:88:88:89 dev vk666 vlan 4090 master br8888 
  4. a netlink message is sent

    bridge mon
    52:54:00:88:88:89 dev vk666 vlan 4090 master br8888 
  5. Zebra listens to the netlink message and returns "could not find VNI"

    Tue 30 15:16:18 HP-GEN9-VXLAN-KVM-deb9 zebra[32464]: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWNEIGH(28), len=76, seq=0, pid=0
    Tue 30 15:16:18 HP-GEN9-VXLAN-KVM-deb9 zebra[32464]: Rx RTM_NEWNEIGH family bridge IF vk666(48) VLAN 4090 MAC 52:54:00:88:88:89
    Mar 30 15:16:18 HP-GEN9-VXLAN-KVM-deb9 zebra[32464]: Add/Update MAC 52:54:00:88:88:89 intf vk666(48) VID 4090, could not find VNI

Output of brctl show and bridge vlan:

root@HP-GEN9-VXLAN-KVM-deb9:~# brctl show
bridge name bridge id       STP enabled interfaces
br8888      8000.4e873ec58ead   no      vk2
                            vk3
                            vk666
                            vxlan8888
br9999      8000.f2e0dabd8eee   no      vk2b
                            vxlan9999

root@HP-GEN9-VXLAN-KVM-deb9:~# bridge vlan
port    vlan ids
vxlan8888    1 PVID Egress Untagged
     4090

vxlan9999   None
br8888  None
br9999  None
vk3  1 PVID Egress Untagged

vk2  1 PVID Egress Untagged

vk2b    None
vk666    4090

Given this example, can you explain what more Zebra needs in order to make this association?

auranext commented 6 years ago

@mkanjari Hello, sorry, I closed this issue by mistake... Could you give me your opinion on the previous comment?

mkanjari commented 6 years ago

@auranext: This seems like a different way of configuring interfaces than I have seen before. Added @vivek-cumulus to see if he knows how this config works.

vincentbernat commented 6 years ago

FYI, I also ran into the same problem.

$ bridge link
6: lag1 state UP : <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 master bridge1 state forwarding priority 32 cost 2
7: vni654 state UNKNOWN : <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master bridge1 state forwarding priority 32 cost 100
8: vni655 state UNKNOWN : <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master bridge1 state forwarding priority 32 cost 100
$ bridge vlan
port    vlan ids
lag1     654
         655

vni654   654 Egress Untagged

vni655   655 Egress Untagged

bridge1 None
$ bridge fdb show | grep :07
50:54:33:00:00:07 dev lag1 vlan 654 master bridge1
$ cat /var/log/frr/zebra.log
2018/04/06 22:26:37 ZEBRA: Initializing own label manager
2018/04/06 22:26:37 ZEBRA: zebra 4.1-dev starting: vty@2601
2018/04/06 22:26:37 ZEBRA: client 13 says hello and bids fair to announce only bgp routes vrf=0
2018/04/06 22:26:37 ZEBRA: EVPN VNI Adv enabled, currently disabled
2018/04/06 22:26:37 ZEBRA: Create L2-VNI hash for intf vni654(7) L2-VNI 654 local IP 192.0.2.13
2018/04/06 22:26:37 ZEBRA: Send VNI_ADD 654 192.0.2.13 tenant vrf Default-IP-Routing-Table to bgp
2018/04/06 22:26:37 ZEBRA: Create L2-VNI hash for intf vni655(8) L2-VNI 655 local IP 192.0.2.13
2018/04/06 22:26:37 ZEBRA: Send VNI_ADD 655 192.0.2.13 tenant vrf Default-IP-Routing-Table to bgp
2018/04/06 22:26:37 ZEBRA: client 17 says hello and bids fair to announce only vnc routes vrf=0
2018/04/06 22:26:38 ZEBRA: client 18 says hello and bids fair to announce only ospf routes vrf=0
2018/04/06 22:26:39 ZEBRA: Add/Update MAC 50:54:33:00:00:07 intf lag1(6) VID 654, could not find VNI
2018/04/06 22:26:52 ZEBRA: if_zebra_speed_update: lag1 old speed: 10000 new speed: 20000

mkanjari commented 6 years ago

@vivek-cumulus for comments/edits.

This is not supported:
up echo 1 > /sys/class/net/br8888/bridge/vlan_filtering || true
up echo 1 > /sys/class/net/br9999/bridge/vlan_filtering || true

It is supported in kernel, but we don't have the corresponding support in frr. We can only have one vlan aware bridge.

A sample config will look like this:

auto vx-1000
iface vx-1000
    vxlan-id 1000
    bridge-access 1000
    vxlan-local-tunnelip 27.0.0.9
    bridge-learning off
    bridge-arp-nd-suppress on
    mstpctl-portbpdufilter yes
    mstpctl-bpduguard yes
    mtu 9152

auto vx-1001
iface vx-1001
    vxlan-id 1001
    bridge-access 1001
    vxlan-local-tunnelip 27.0.0.9
    bridge-learning off
    bridge-arp-nd-suppress on
    mstpctl-portbpdufilter yes
    mstpctl-bpduguard yes
    mtu 9152

auto vx-1002
iface vx-1002
    vxlan-id 1002
    bridge-access 1002
    vxlan-local-tunnelip 27.0.0.9
    bridge-learning off
    bridge-arp-nd-suppress on
    mstpctl-portbpdufilter yes
    mstpctl-bpduguard yes
    mtu 9152

auto vx-1003
iface vx-1003
    vxlan-id 1003
    bridge-access 1003
    vxlan-local-tunnelip 27.0.0.9
    bridge-learning off
    bridge-arp-nd-suppress on
    mstpctl-portbpdufilter yes
    mstpctl-bpduguard yes
    mtu 9152

auto vx-1004
iface vx-1004
    vxlan-id 1004
    bridge-access 1004
    vxlan-local-tunnelip 27.0.0.9
    bridge-learning off
    bridge-arp-nd-suppress on
    mstpctl-portbpdufilter yes
    mstpctl-bpduguard yes
    mtu 9152

auto bridge
iface bridge
    bridge-vlan-aware yes
    bridge-ports vx-1000 vx-1001 vx-1002 vx-1003 vx-1004 hostbond3 hostbond4
    bridge-stp on
    bridge-vids 1000-1004
    bridge-pvid 1

vincentbernat commented 6 years ago

Hello,

In my case, I have only one bridge (bridge1).

auranext commented 6 years ago

@mkanjari

As in vincentbernat's experience, I also noticed that it doesn't work with just one bridge. Is it possible that the error occurs when the setup is running without Cumulus tools?

auranext commented 6 years ago

In other words:

According to your VLAN filtering approach, I understand that VXLAN discovery is based on the VLAN ID, so a VID can only be used with one VNI.

From the Linux kernel perspective (VLAN-aware or not), VXLAN discovery is based on the bridge that the interface is attached to; the VID is not used for VXLAN discovery, but can be used for control.

This is confusing to me, because without VLAN filtering FRR works according to the Linux kernel perspective, while with VLAN filtering FRR seems to adopt the other approach (the Cumulus philosophy?).

I'm not sure FRR is compliant with the Linux kernel's VLAN filtering implementation. Could you please provide some clarification?
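
The two approaches contrasted in this comment can be put side by side using the reporter's own topology (two bridges, each with one VXLAN device, both carrying VID 4090). This is only a toy comparison under that reading of the comment: neither function is FRR or kernel code, and the names `vnis_by_vid` and `vnis_by_bridge` are hypothetical.

```python
# (vxlan device, bridge, vni, vlan), transcribed from the brctl and
# bridge vlan outputs posted earlier in the thread.
VXLAN_PORTS = [
    ("vxlan8888", "br8888", 8888, 4090),
    ("vxlan9999", "br9999", 9999, 4090),
]

# Guest-facing ports and the bridge each one is attached to.
PORT_BRIDGE = {"vk666": "br8888", "vk2b": "br9999"}


def vnis_by_vid(vid):
    """VID-only lookup: every VNI whose VXLAN port carries this VID."""
    return sorted(vni for _, _, vni, v in VXLAN_PORTS if v == vid)


def vnis_by_bridge(port):
    """Bridge-scoped lookup: VNIs of VXLAN devices in the port's bridge."""
    bridge = PORT_BRIDGE[port]
    return sorted(vni for _, b, vni, _ in VXLAN_PORTS if b == bridge)


print(vnis_by_vid(4090))        # [8888, 9999]
print(vnis_by_bridge("vk666"))  # [8888]
print(vnis_by_bridge("vk2b"))   # [9999]
```

With a VID-only key, VID 4090 is ambiguous between the two VNIs, whereas scoping the lookup to the bridge yields exactly one candidate per port.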

mkanjari commented 6 years ago

I am not aware of any assumptions which frr makes in terms of vni mapping with port,vlan. @vivek-cumulus might know the details.

aderumier commented 6 years ago

Hi, maybe related:

I was testing with 2 VLAN-aware bridges, with a VXLAN interface in only one bridge, and some VMs on the bridge.

If the VXLAN interface was enabled before starting FRR, I couldn't get local MAC addresses. If I enabled the VXLAN interface after FRR started, I saw local MAC addresses on the VNI (I think because it sends a netlink message).

With only 1 VLAN-aware bridge, it works fine in both cases.

Here is a sample with 2 VXLANs (vni2 and vni3), mapped to vlan 2 and vlan 3:

port    vlan ids
vmbr2    1 PVID Egress Untagged

tap102i0     2 PVID Egress Untagged  -> the vm

vxlan2   2 PVID Egress Untagged

vxlan3   3 PVID Egress Untagged

bridge name bridge id       STP enabled interfaces
vmbr2       8000.327c8a27e385   no      tap102i0
                            vxlan2
                            vxlan3

ton31337 commented 4 years ago

@auranext could you try this with the latest release?

ton31337 commented 3 years ago

@polychaeta autoclose in 1 week.