sonic-net / sonic-buildimage

Scripts which perform an installable binary image build for SONiC

VXLAN not coming up with error unexpected type: SAI_OBJECT_TYPE_TUNNEL #10004

Open aseaudi opened 2 years ago

aseaudi commented 2 years ago

Description

I created a VXLAN EVPN tunnel between two SONiC switches, using the Loopback interfaces as source and destination, but the tunnel does not come up.

Steps to reproduce the issue:

Configure OSPF and BGP EVPN between two SONiC switches, and ensure Loopback0 is in the routing tables. Create Vlan 50 and add it as tagged egress pvid on a port on each switch. Connect 2 hosts on each switch.

Describe the results you received:

I get the following errors in the syslog:

Aug 16 03:14:19.373668 Odin-ec58a ERR swss#orchagent: :- meta_sai_on_port_state_change_single: data.port_id oid:0x2a0000000009bf has unexpected type: SAI_OBJECT_TYPE_TUNNEL, expected PORT, BRIDGE_PORT or LAG
Aug 16 03:14:19.375495 Odin-ec58a NOTICE swss#orchagent: :- doTask: Get port state change notification id:2a0000000009bf status:1
Aug 16 03:14:19.375495 Odin-ec58a ERR swss#orchagent: :- doTask: Failed to get port object for port id 0x2a0000000009bf
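
As a cross-check on the first ERR line, the object type can be read out of the OID itself. The sketch below assumes that bits 48 to 55 of the 64-bit OID carry the SAI object type value. That bit layout and the helper name objectTypeFromOid are assumptions made for illustration, not something confirmed in this thread.

```cpp
#include <cstdint>

// Assumption: bits 48..55 of the OID hold the object type value.
// objectTypeFromOid is a hypothetical helper, not a sairedis or SAI function.
constexpr std::uint8_t objectTypeFromOid(std::uint64_t oid)
{
    return static_cast<std::uint8_t>((oid >> 48) & 0xFF);
}

// OID taken from the syslog lines above.
constexpr std::uint64_t kLoggedOid = 0x2a0000000009bfULL;
```

Under that assumption, oid:0x2a0000000009bf yields 0x2a (decimal 42), which would line up with the SAI_OBJECT_TYPE_TUNNEL the log reports, provided the SAI headers in this build assign that value to TUNNEL.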

root@Odin-ec58a:~# show vxlan remotevtep
+---------+---------+-------------------+--------------+
| SIP     | DIP     | Creation Source   | OperStatus   |
+=========+=========+===================+==============+
| 2.2.2.2 | 1.1.1.1 | EVPN              | oper_down    |
+---------+---------+-------------------+--------------+
Total count : 1 

root@Odin-ec58a:~#

FRR CONFIG:
router bgp 65000
 bgp router-id 2.2.2.2
 neighbor 1.1.1.1 remote-as 65000
 neighbor 1.1.1.1 update-source Loopback0
 !
 address-family l2vpn evpn
  neighbor 1.1.1.1 activate
  advertise-all-vni
 exit-address-family
!
interface Ethernet0
 ip ospf area 0
 ip ospf network point-to-point
!
interface Ethernet1
 ip ospf area 0
 ip ospf network point-to-point
!
interface Loopback0
 ip ospf area 0
!
router ospf
 ospf router-id 2.2.2.2

Describe the results you expected:

I expected the VXLAN OperStatus to be up and connectivity over the VXLAN to work, but it does not.

Output of show version:

SONiC Software Version: SONiC.202012.65487-3ef3e3c56
Distribution: Debian 10.11
Kernel: 4.19.0-12-2-amd64
Build commit: 3ef3e3c56
Build date: Sun Jan 16 14:28:36 UTC 2022
Built by: AzDevOps@sonic-build-workers-0012V5

Platform: x86_64-accton_as5835_54x-r0
HwSKU: Accton-AS5835-54X
ASIC: broadcom
ASIC Count: 1
Serial Number: 583554X2133034
Uptime: 15:14:45 up 20 days,  1:29,  1 user,  load average: 0.60, 0.66, 0.71

Docker images:
REPOSITORY                    TAG                      IMAGE ID            SIZE
docker-sonic-mgmt-framework   202012.65487-3ef3e3c56   e7eb1d69029f        786MB
docker-sonic-mgmt-framework   latest                   e7eb1d69029f        786MB
docker-sonic-telemetry        202012.65487-3ef3e3c56   789b1aa5b41b        462MB
docker-sonic-telemetry        latest                   789b1aa5b41b        462MB
docker-fpm-frr                202012.65487-3ef3e3c56   ec4c9b2d4d2c        401MB
docker-fpm-frr                latest                   ec4c9b2d4d2c        401MB
docker-sflow                  202012.65487-3ef3e3c56   e044f450cfc8        384MB
docker-sflow                  latest                   e044f450cfc8        384MB
docker-nat                    202012.65487-3ef3e3c56   419e6c4978b2        386MB
docker-nat                    latest                   419e6c4978b2        386MB
docker-teamd                  202012.65487-3ef3e3c56   56fbbe4b29af        383MB
docker-teamd                  latest                   56fbbe4b29af        383MB
docker-platform-monitor       202012.65487-3ef3e3c56   3220856da746        554MB
docker-platform-monitor       latest                   3220856da746        554MB
docker-orchagent              202012.65487-3ef3e3c56   5ddc0d683c5c        402MB
docker-orchagent              latest                   5ddc0d683c5c        402MB
docker-snmp                   202012.65487-3ef3e3c56   9105e140a330        414MB
docker-snmp                   latest                   9105e140a330        414MB
docker-syncd-brcm             202012.65487-3ef3e3c56   de1fd4ba616b        665MB
docker-syncd-brcm             latest                   de1fd4ba616b        665MB
docker-lldp                   202012.65487-3ef3e3c56   76833b88bfa6        412MB
docker-lldp                   latest                   76833b88bfa6        412MB
docker-router-advertiser      202012.65487-3ef3e3c56   ab9150ce5d72        372MB
docker-router-advertiser      latest                   ab9150ce5d72        372MB
docker-dhcp-relay             202012.65487-3ef3e3c56   a881349d7cdc        386MB
docker-dhcp-relay             latest                   a881349d7cdc        386MB
docker-database               202012.65487-3ef3e3c56   13a115c1dcf0        372MB
docker-database               latest                   13a115c1dcf0        372MB
docker-mux                    202012.65487-3ef3e3c56   585cb1618e48        425MB
docker-mux                    latest                   585cb1618e48        425MB

zhangyanzhao commented 2 years ago

Adam will follow up with BRCM

srj102 commented 2 years ago

meta_sai_on_port_state_change_single needs to also handle SAI_OBJECT_TYPE_TUNNEL

switch (ot)
{
    // TODO hardcoded types, must advance SAI repository commit to get metadata for this
    case SAI_OBJECT_TYPE_PORT:
    case SAI_OBJECT_TYPE_BRIDGE_PORT:
    case SAI_OBJECT_TYPE_LAG:
    case SAI_OBJECT_TYPE_TUNNEL:

        valid = true;
        break;

JafarSeyedi commented 2 years ago

Hi, I have the same error (chipset: Trident 2). This is the log:

Mar 16 17:26:57.335089 sonic ERR syncd#syncd: [none] SAI_API_TUNNEL:_brcm_sai_vxlan_create_vpn:875 create tunnel initiator setup for net port failed with error Invalid parameter (0xfffffffc).
Mar 16 17:26:57.335089 sonic ERR syncd#syncd: [none] SAI_API_TUNNEL:_brcm_sai_vxlan_enable:1167 create a vxlan decap tunnel failed with error -5.
Mar 16 17:26:57.335089 sonic ERR syncd#syncd: [none] SAI_API_TUNNEL:brcm_sai_create_tunnel_map_entry:2667 Can't create brcm vxlan tunnel
Mar 16 17:26:57.335125 sonic ERR syncd#syncd: :- sendApiResponse: api SAI_COMMON_API_CREATE failed in syncd mode: SAI_STATUS_FAILURE
Mar 16 17:26:57.335690 sonic ERR syncd#syncd: :- processQuadEvent: attr: SAI_TUNNEL_MAP_ENTRY_ATTR_TUNNEL_MAP_TYPE: SAI_TUNNEL_MAP_TYPE_VNI_TO_VLAN_ID
Mar 16 17:26:57.335690 sonic ERR syncd#syncd: :- processQuadEvent: attr: SAI_TUNNEL_MAP_ENTRY_ATTR_TUNNEL_MAP: oid:0x29000000000beb
Mar 16 17:26:57.335690 sonic ERR syncd#syncd: :- processQuadEvent: attr: SAI_TUNNEL_MAP_ENTRY_ATTR_VLAN_ID_VALUE: 30
Mar 16 17:26:57.335727 sonic ERR syncd#syncd: :- processQuadEvent: attr: SAI_TUNNEL_MAP_ENTRY_ATTR_VNI_ID_KEY: 3000
Mar 16 17:26:57.337023 sonic ERR swss#orchagent: :- create: create status: SAI_STATUS_FAILURE
Mar 16 17:26:57.339149 sonic WARNING swss#orchagent: :- addOperation: Error adding tunnel map entry. Tunnel: vtep2. Entry: map_3000_Vlan30. Error: Can't create a tunnel map entry object

srj102 commented 2 years ago

EVPN VXLAN is supported only on Trident 3.

JafarSeyedi commented 2 years ago

@srj102, thanks for your response. What should I do with a Trident 2 based switch? As far as I know, Trident 2 supports VXLAN (VLAN to VNI at least), but, as you said, it is not supported in SAI.

  1. Is there any way to call OpenNSL directly in SONiC?
  2. Is there any plan to support VXLAN in the SAI for Trident 2?

bradh352 commented 1 week ago

@srj102

meta_sai_on_port_state_change_single needs to also handle SAI_OBJECT_TYPE_TUNNEL

switch (ot)
{
    // TODO hardcoded types, must advance SAI repository commit to get metadata for this
    case SAI_OBJECT_TYPE_PORT:
    case SAI_OBJECT_TYPE_BRIDGE_PORT:
    case SAI_OBJECT_TYPE_LAG:
    case SAI_OBJECT_TYPE_TUNNEL:

        valid = true;
        break;

OK, so per https://github.com/sonic-net/sonic-sairedis/pull/1467, this is not the proper fix for the error message:

Aug 16 03:14:19.373668 Odin-ec58a ERR swss#orchagent: :- meta_sai_on_port_state_change_single: data.port_id oid:0x2a0000000009bf has unexpected type: SAI_OBJECT_TYPE_TUNNEL, expected PORT, BRIDGE_PORT or LAG

The replacement PR that was created does not resolve the issue either; it only makes the metadata type check dynamic rather than dependent on a static list. I should clarify that this ERR is meaningless, apart from the confusion it causes when VXLAN isn't working.
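
To illustrate why swapping a static list for a dynamic check would not silence the message, here is a minimal sketch. It assumes "dynamic" means the allowed types are read from a table instead of being hardcoded in a switch. ObjectType, isValidStatic, isValidDynamic and kPortIdAllowed are hypothetical stand-ins, not the actual sairedis or SAI definitions.

```cpp
#include <cstddef>

// Hypothetical stand-in for sai_object_type_t; not the real SAI enum.
enum class ObjectType { Port, BridgePort, Lag, Tunnel };

// Static variant: the allowed types are hardcoded, like the quoted switch
// without the proposed TUNNEL case.
constexpr bool isValidStatic(ObjectType ot)
{
    return ot == ObjectType::Port
        || ot == ObjectType::BridgePort
        || ot == ObjectType::Lag;
}

// Dynamic variant: the allowed types come from a table supplied by the caller
// (assumed here to be derived from metadata). Recursive linear search.
constexpr bool isValidDynamic(ObjectType ot, const ObjectType* allowed, std::size_t count)
{
    return count != 0
        && (allowed[0] == ot || isValidDynamic(ot, allowed + 1, count - 1));
}

// Assumed table contents, mirroring the "expected PORT, BRIDGE_PORT or LAG" text in the log.
constexpr ObjectType kPortIdAllowed[] = {
    ObjectType::Port, ObjectType::BridgePort, ObjectType::Lag
};
constexpr std::size_t kPortIdAllowedCount =
    sizeof(kPortIdAllowed) / sizeof(kPortIdAllowed[0]);
```

As long as the table does not list the tunnel type, both variants reject it, so a tunnel OID arriving in a port state change notification would still be reported as unexpected.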

I'm adding some debug output to sonic-swss/orchagent/portsorch to see how the notification is presented there, and whether we can keep it from propagating into the SAI metadata, because according to https://github.com/opencomputeproject/SAI/blob/master/inc/saiport.h#L137 the tunnel type should never be passed to that check in the first place. That said, the person who rejected the PR said it could be a vendor issue, meaning Broadcom, in my case, sending an event when it shouldn't. I'm not sure. I'll update after I debug a bit.