sonic-net / sonic-buildimage

Scripts which perform an installable binary image build for SONiC

VXLAN programming fails with `brcm_sai_create_tunnel_map_entry: Can't create brcm vxlan tunnel` #8371

Open bluecmd opened 3 years ago

bluecmd commented 3 years ago

Description

We are unable to get VXLAN (as part of EVPN) to work on our Trident 3 switches. Following the information we can find, we end up with a syncd process that is failing to program the ASIC.

We have tried on two different platforms from two different vendors, and using both SONiC 202012 and master from a few days ago.

Steps to reproduce the issue:

  1. Start `show logging -f`
  2. Run:
    config vlan add 1991
    config loopback add Loopback0
    config int ip add Loopback0 10.10.10.10/32
    config vxlan add nve1 10.10.10.10
    config vxlan map add nve1 1991 1991
  3. Watch for SAI errors such as `create tunnel initiator setup for net port failed with error Invalid parameter (0xfffffffc).`, followed by `vxlan vpn create failed with error Entry exists (0xfffffff8).`

Describe the results you received:

This is what `show logging -f` shows when the above commands are run:

Aug  7 16:48:14.362454 adele NOTICE swss#vxlanmgrd: :- doVxlanTunnelCreateTask: Create vxlan tunnel nve1
Aug  7 16:48:14.365124 adele NOTICE swss#orchagent: :- addOperation: Vxlan tunnel 'nve1' was added
Aug  7 16:48:18.344243 adele NOTICE swss#orchagent: :- create_tunnel: create_tunnel:encapmaplist[0]=0x290000000015ea
Aug  7 16:48:18.347884 adele NOTICE swss#orchagent: :- create_tunnel: create_tunnel:encapmaplist[1]=0x290000000015ec
Aug  7 16:48:18.354767 adele ERR syncd#syncd: [none] SAI_API_TUNNEL:_brcm_sai_vxlan_create_vpn:872 create tunnel initiator setup for net port failed with error Invalid parameter (0xfffffffc).
Aug  7 16:48:18.354767 adele ERR syncd#syncd: [none] SAI_API_TUNNEL:_brcm_sai_vxlan_enable:1164 create a vxlan decap tunnel failed with error -5.
Aug  7 16:48:18.354767 adele ERR syncd#syncd: [none] SAI_API_TUNNEL:brcm_sai_create_tunnel_map_entry:2633 Can't create brcm vxlan tunnel
Aug  7 16:48:18.354767 adele ERR syncd#syncd: :- sendApiResponse: api SAI_COMMON_API_CREATE failed in syncd mode: SAI_STATUS_FAILURE
Aug  7 16:48:18.354767 adele ERR syncd#syncd: :- processQuadEvent: attr: SAI_TUNNEL_MAP_ENTRY_ATTR_TUNNEL_MAP_TYPE: SAI_TUNNEL_MAP_TYPE_VNI_TO_VLAN_ID
Aug  7 16:48:18.354767 adele ERR syncd#syncd: :- processQuadEvent: attr: SAI_TUNNEL_MAP_ENTRY_ATTR_TUNNEL_MAP: oid:0x290000000015e9
Aug  7 16:48:18.354767 adele ERR syncd#syncd: :- processQuadEvent: attr: SAI_TUNNEL_MAP_ENTRY_ATTR_VLAN_ID_VALUE: 1991
Aug  7 16:48:18.354767 adele ERR syncd#syncd: :- processQuadEvent: attr: SAI_TUNNEL_MAP_ENTRY_ATTR_VNI_ID_KEY: 1991
Aug  7 16:48:18.354767 adele ERR swss#orchagent: :- create: create status: SAI_STATUS_FAILURE

The error then changes to `[none] SAI_API_TUNNEL:_brcm_sai_vxlan_create_vxlan_vpn:77 vxlan vpn create failed with error Entry exists (0xfffffff8).`, and the operation is retried indefinitely, many times per second.

Describe the results you expected:

The VXLAN interface should be created and programmed to the ASIC successfully.

Output of show version:

Tested on both SONiC.202012.24235-763fcd7ee and SONiC.master.27574-302f88941, or in BCM SAI versions:

Tested on both x86_64-cel_seastone_2-r0 and x86_64-dellemc_s5232f_c3538-r0.

Output of show techsupport:

Let me know if you need this.

Additional information you deem important (e.g. issue happens only occasionally):

A log with SAI_LOG_LEVEL_DEBUG can be found here: https://gist.github.com/bluecmd/7fc0f794a6f85a4d0ba88de29084cb85. We have additionally tested setting up VXLAN in P2P mode, which fails with `brcm_sai_vxlan_create_vpn:938 create vxlan acc port failed with error Entry not found (0xfffffff9).` (full log at https://gist.github.com/bluecmd/04f2efe3ecc512be0e11ef46f2b6c804). We tested P2P because we saw that an issue related to P2MP is on the roadmap for 202111.

We have reproduced this issue with an empty configuration (e.g. `sonic-cfggen -H -k Seastone_2 --print-data | sudo tee /etc/sonic/config_db.json`) followed by the commands listed above, to ensure that no other configuration is causing the issue to appear.

Attaching a debugger to syncd shows that bcm_td2_vxlan_tunnel_initiator_create is called and, after only a few instructions that look like sanity checks, returns -4 (Invalid parameter).

Backtrace:

#0  0x00007f5b2b8afc10 in bcm_td2_vxlan_tunnel_initiator_create () from /usr/lib/libsai.so.1
#1  0x00007f5b2a06c73a in bcm_esw_vxlan_tunnel_initiator_create () from /usr/lib/libsai.so.1
#2  0x00007f5b2ab4ea63 in bcm_vxlan_tunnel_initiator_create () from /usr/lib/libsai.so.1
#3  0x00007f5b29db592b in _brcm_sai_vxlan_enable () from /usr/lib/libsai.so.1
[..]
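
The hexadecimal codes in the log lines above appear to be small negative return values printed as unsigned 32-bit numbers, which would explain why 0xfffffffc in the log matches the -4 seen in the debugger. A minimal sketch of that conversion follows; the function name is made up for illustration, and the descriptions are copied from the logs in this issue rather than from any SDK header:

```python
def to_signed32(value):
    """Interpret an unsigned 32-bit value as a signed two's-complement integer."""
    value &= 0xFFFFFFFF
    # If the top bit is set, the value is negative in two's complement.
    return value - 0x100000000 if value & 0x80000000 else value


# Codes and descriptions exactly as they appear in the logs in this issue.
LOGGED_CODES = {
    0xFFFFFFFC: "Invalid parameter",
    0xFFFFFFF8: "Entry exists",
    0xFFFFFFF9: "Entry not found",
}

for code, text in LOGGED_CODES.items():
    print(f"{code:#010x} -> {to_signed32(code)} ({text})")
```

This only decodes the number; it says nothing about which sanity check inside the SDK rejected the call.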

tushar-ty commented 3 years ago

The following config properties are needed to enable vxlan on TD3:

use_all_splithorizon_groups=1
riot_enable=1
sai_tunnel_support=1
riot_overlay_l3_intf_mem_size=4096
riot_overlay_l3_egress_mem_size=32768
riot_overlay_ecmp_resilient_hash_size=16384
flow_init_mode=1
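
To check whether a given .bcm config already carries these properties, something like the following may help. It is only a sketch: it assumes the file is plain `key=value` lines with `#` comments, the function names are made up for illustration, and the property values are the ones quoted in this comment for TD3 (they are not verified for any other ASIC):

```python
# Properties quoted in this thread for enabling VXLAN on TD3.
REQUIRED_TD3_VXLAN_PROPERTIES = {
    "use_all_splithorizon_groups": "1",
    "riot_enable": "1",
    "sai_tunnel_support": "1",
    "riot_overlay_l3_intf_mem_size": "4096",
    "riot_overlay_l3_egress_mem_size": "32768",
    "riot_overlay_ecmp_resilient_hash_size": "16384",
    "flow_init_mode": "1",
}


def parse_bcm_config(text):
    """Parse key=value lines, skipping blank lines and '#' comments (assumed format)."""
    properties = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        properties[key.strip()] = value.strip()
    return properties


def missing_properties(text, required=REQUIRED_TD3_VXLAN_PROPERTIES):
    """Return the required properties that are absent or set to a different value."""
    present = parse_bcm_config(text)
    return {key: value for key, value in required.items() if present.get(key) != value}


# Example with a made-up config fragment; read the real file contents instead.
sample = "riot_enable=1\n# a comment\nsai_tunnel_support=0\n"
for key, value in missing_properties(sample).items():
    print(f"{key}={value}")
```

The printed lines are the ones that would still need to be added or corrected in the file before rebooting.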

bluecmd commented 3 years ago

@tushar-ty You just made some folks over here very very happy :D. We applied those lines to one of our TD3 switches, rebooted, and things started working straight away.

Amazing!

bluecmd commented 3 years ago

I'll keep this open for visibility even though the workaround is known: enable SAI tunnel support and RIOT in the .bcm config for applicable platforms.

mad4321 commented 2 years ago

Hi, I have the same issue on a Tomahawk BCM56960.

INFO syncd#syncd: [none] SAI_API_TUNNEL:brcm_sai_create_tunnel_map_entry:2467 SAI Enter
DEBUG syncd#syncd: [none] SAI_API_TUNNEL:_brcm_sai_vxlan_enable:1118 tunnel_idx = 2
DEBUG syncd#syncd: [none] SAI_API_TUNNEL:_brcm_sai_vxlan_enable:1142 term_idx = 5
DEBUG syncd#syncd: [none] SAI_API_TUNNEL:_brcm_sai_vxlan_enable:1147 term_idx = 5, tid = 2, valid = 1
INFO syncd#syncd: [none] SAI_API_TUNNEL:_brcm_sai_vxlan_enable:1160 create decap tunnel vpn_id=0x00007001 vni=1111 vlan_id=111 tt=2 src=0x0a6f010b dst=0x00000000 vxlan_port=4789
DEBUG syncd#syncd: [none] SAI_API_TUNNEL:_brcm_sai_vxlan_create_vxlan_vpn:79 Created vpn: 28673(0x7001) for vni: 1111
DEBUG syncd#syncd: [none] SAI_API_TUNNEL:_brcm_sai_vxlan_create_vxlan_vpn:79 Created vpn: 28674(0x7002) for vni: 1048578
ERR syncd#syncd: [none] SAI_API_TUNNEL:_brcm_sai_vxlan_create_vpn:872 create tunnel initiator setup for net port failed with error Invalid parameter (0xfffffffc).
ERR syncd#syncd: [none] SAI_API_TUNNEL:_brcm_sai_vxlan_enable:1164 create a vxlan decap tunnel failed with error -5.
ERR syncd#syncd: [none] SAI_API_TUNNEL:brcm_sai_create_tunnel_map_entry:2633 Can't create brcm vxlan tunnel

Is there some additional configuration needed for this ASIC?

alessandro-veras commented 2 years ago

Hey guys. I probably have the same issue on a Celestica Seastone DX010 running on a Tomahawk BCM56960. Any suggestions? Maybe the same solution is applicable here? How can I find and modify the right file?
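
On the question of finding the file: on Broadcom-based SONiC images the .bcm config is, as far as I know, usually referenced by the `SAI_INIT_CONFIG_FILE` key in the `sai.profile` of the HWSKU directory (typically under `/usr/share/sonic/device/<platform>/<hwsku>/`). Treat that layout as an assumption and verify it on your own image. A small sketch that pulls the path out of a `sai.profile`; the function name and the example file name are hypothetical:

```python
def sai_init_config_file(sai_profile_text):
    """Return the SAI_INIT_CONFIG_FILE value from sai.profile text, or None if absent."""
    for raw_line in sai_profile_text.splitlines():
        line = raw_line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "SAI_INIT_CONFIG_FILE":
            return value.strip()
    return None


# Hypothetical sai.profile content; the file name below is a placeholder.
example_profile = "SAI_INIT_CONFIG_FILE=/usr/share/sonic/hwsku/example.config.bcm\n"
print(sai_init_config_file(example_profile))
```

Whether the TD3 properties from this thread apply to Tomahawk is a separate question that nobody here has confirmed.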

bratashX commented 1 year ago

Hi alessandro-veras, did you resolve your problem with the Celestica Seastone DX010?

viewtsao commented 4 months ago

The following config properties are needed to enable vxlan on TD3:

use_all_splithorizon_groups=1
riot_enable=1
sai_tunnel_support=1
riot_overlay_l3_intf_mem_size=4096
riot_overlay_l3_egress_mem_size=32768
riot_overlay_ecmp_resilient_hash_size=16384
flow_init_mode=1

Hello, what should be added for TD4/TH4, similar to the above? I added these properties on TD4, and docker cannot be started.