Open rlebedys opened 6 months ago
I am not able to open the log, please upload techsupport output
@adyeung, I am adding the logs to the comment.
Also attaching the techsupport dump archive that was taken when containers exited after subinterface creation. sonic_dump_61W5SR3-mgmt_20240313_090936.tar.gz
The problem is specific to DellEMC-S5248f-P-25G. It appears that the community Dell td3-s5248f-25g.config.bcm is missing the SOC parameter flow_init_mode = 1, which is needed for VFI MGID creation. Besides that, other parameters are also needed for VLAN VFI to work on TD3 for subinterface creation.
Requesting Dell contributor @aravindmani-1 to help follow up and update the file.
Thanks for the update. I noticed the same issue on Accton-AS7326-56X and Accton-AS7726-32X; however, I don't have access to them anymore, so I can't collect any specific information.
@adyeung @aravindmani-1 is this fix going to get merged to the master?
Yes, this will be merged into the master branch. @prgeor, could you please help merge this PR? https://github.com/sonic-net/sonic-buildimage/pull/18505
The same happens with accton_as7326_56x switches. Are there any updates regarding the Accton platform?
SONiC Software Version: SONiC.202405.0-dirty-20240620.233504
SONiC OS Version: 12
Distribution: Debian 12.5
Kernel: 6.1.0-11-2-amd64
Build commit: 926d03322
Build date: Thu Jun 20 22:58:12 UTC 2024
Platform: x86_64-accton_as7326_56x-r0
HwSKU: Accton-AS7326-56X
ASIC: broadcom
ASIC Count: 1
Hey @adyeung, perhaps you had a chance to take a look at accton_as7326_56x switches? They seem to be facing the same issue as the Dell ones.
@jostar-yang please help update the config.bcm files from Accton
@jostar-yang have you had the opportunity to review this issue?
@jostar-yang Hello, any update regarding this?
@aravindmani-1 any news about this?
@rlebedys did you test @aravindmani-1's fix, and does it work for you? I've just tested it with the s5248f and, as soon as I add a subinterface, containers still start to crash.
The error:
2024 Aug 30 08:09:11.250333 leaf1 ERR syncd#syncd: [none] SAI_API_ROUTER_INTERFACE:_brcm_sai_xgs_create_router_interface_common_config:3205 L3 intf create failed with error -2.
2024 Aug 30 08:09:11.250333 leaf1 ERR syncd#syncd: [none] SAI_API_ROUTER_INTERFACE:_brcm_sai_xgs_create_router_interface_common_config:3278 RIF common config create failed rv:-2
2024 Aug 30 08:09:11.250333 leaf1 ERR syncd#syncd: [none] SAI_API_ROUTER_INTERFACE:_brcm_sai_xgs_create_sub_port_router_interface:3947 Sub-Port RIF common Config failed with error -2.
2024 Aug 30 08:09:11.250333 leaf1 ERR syncd#syncd: [none] SAI_API_ROUTER_INTERFACE:_brcm_sai_xgs_create_sub_port_router_interface:4001 SubPort Router Interface Create Failed for port:49 lag:no vlan:666 vpnid:32768 vp:0xb0000001 vfp_entry_id:0 l3_intf_id:0 rv:-2
2024 Aug 30 08:09:11.250333 leaf1 INFO syncd#syncd: [none] SAI_API_ROUTER_INTERFACE:_brcm_sai_sub_router_intf_l2_unconfig:1964 destroy vlan
2024 Aug 30 08:09:11.250537 leaf1 ERR syncd#syncd: [none] SAI_API_ROUTER_INTERFACE:brcm_sai_xgs_create_router_interface:5176 Error in create router interface failed with error -2.
2024 Aug 30 08:09:11.250590 leaf1 ERR syncd#syncd: [none] SAI_API_ROUTER_INTERFACE:brcm_sai_create_router_interface:493 pd router intf create failed with error -2.
2024 Aug 30 08:09:11.250862 leaf1 ERR syncd#syncd: [none] SAI_API_ROUTER_INTERFACE:brcm_sai_create_router_interface:522 Router Interface Create Failed rv:-2
2024 Aug 30 08:09:11.250909 leaf1 ERR syncd#syncd: [none] SAI_API_ROUTER_INTERFACE:brcm_sai_router_interface_create_err_cleanup:7140 RIF Create failed: rif_id:0 type:4 vrf:0 port-lag-id:49 lag:no vlan:666 virtual:no
2024 Aug 30 08:09:11.250958 leaf1 ERR syncd#syncd: :- sendApiResponse: api SAI_COMMON_API_CREATE failed in syncd mode: SAI_STATUS_NOT_SUPPORTED
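To make the failure chain in the log above easier to read, the SAI error lines can be reduced to the function names and source line numbers they report. The following is only a minimal sketch: the regular expression is inferred from the lines pasted in this comment, and the helper names (`parse_syncd_line`, `error_functions`) are hypothetical, not part of SONiC or the Broadcom SAI.

```python
import re

# Pattern inferred from the syncd lines quoted above (an assumption, not a
# documented log format): timestamp, host, level, then API:function:line.
SYNCD_SAI_RE = re.compile(
    r"^(?P<timestamp>\d{4} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<host>\S+) (?P<level>[A-Z]+) syncd#syncd: \[none\] "
    r"(?P<api>\w+):(?P<func>\w+):(?P<line>\d+) (?P<message>.*)$"
)


def parse_syncd_line(text):
    """Return a dict for a matching SAI log line, or None if it does not match."""
    match = SYNCD_SAI_RE.match(text.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["line"] = int(record["line"])
    return record


def error_functions(lines):
    """Collect (function, source line) pairs for every ERR-level SAI entry."""
    result = []
    for text in lines:
        record = parse_syncd_line(text)
        if record and record["level"] == "ERR":
            result.append((record["func"], record["line"]))
    return result


# Three lines copied from the log above: one ERR, one INFO, one in another format.
SAMPLE_LINES = [
    "2024 Aug 30 08:09:11.250333 leaf1 ERR syncd#syncd: [none] SAI_API_ROUTER_INTERFACE:_brcm_sai_xgs_create_router_interface_common_config:3205 L3 intf create failed with error -2.",
    "2024 Aug 30 08:09:11.250333 leaf1 INFO syncd#syncd: [none] SAI_API_ROUTER_INTERFACE:_brcm_sai_sub_router_intf_l2_unconfig:1964 destroy vlan",
    "2024 Aug 30 08:09:11.250958 leaf1 ERR syncd#syncd: :- sendApiResponse: api SAI_COMMON_API_CREATE failed in syncd mode: SAI_STATUS_NOT_SUPPORTED",
]

if __name__ == "__main__":
    for func, line in error_functions(SAMPLE_LINES):
        print(f"{func}:{line}")
```

Lines that do not follow the `API:function:line` layout, such as the final `sendApiResponse` entry, are skipped rather than guessed at.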
SONiC Software Version: SONiC.202405.0-dirty-20240830.091822
SONiC OS Version: 12
Distribution: Debian 12.6
Kernel: 6.1.0-11-2-amd64
Build commit: 249c20bdf
Build date: Fri Aug 30 06:58:39 UTC 2024
Platform: x86_64-dellemc_s5248f_c3538-r0
HwSKU: DellEMC-S5248f-P-25G
ASIC: broadcom
ASIC Count: 1
Hardware Revision: N/A
Uptime: 08:06:06 up 12 min, 1 user, load average: 2.89, 2.14, 1.36
Date: Fri 30 Aug 2024 08:06:06
Docker images:
REPOSITORY TAG IMAGE ID SIZE
docker-dhcp-relay latest 934cdc88b25f 324MB
docker-dhcp-server latest cdf709f8a11d 338MB
docker-fpm-frr 202405.0-dirty-20240830.091822 1b46e9d04a15 375MB
docker-fpm-frr latest 1b46e9d04a15 375MB
docker-macsec latest 3a77d124e235 346MB
docker-lldp 202405.0-dirty-20240830.091822 7904ddbfb954 360MB
docker-lldp latest 7904ddbfb954 360MB
docker-mux 202405.0-dirty-20240830.091822 f355662bc7b3 366MB
docker-mux latest f355662bc7b3 366MB
docker-snmp 202405.0-dirty-20240830.091822 81a0b637c93e 354MB
docker-snmp latest 81a0b637c93e 354MB
docker-sonic-gnmi 202405.0-dirty-20240830.091822 e4a31bbc8cd4 399MB
docker-sonic-gnmi latest e4a31bbc8cd4 399MB
docker-sonic-mgmt-framework 202405.0-dirty-20240830.091822 6d5e68ff3033 401MB
docker-sonic-mgmt-framework latest 6d5e68ff3033 401MB
docker-teamd 202405.0-dirty-20240830.091822 3f52352e3264 343MB
docker-teamd latest 3f52352e3264 343MB
docker-platform-monitor 202405.0-dirty-20240830.091822 c3c08f5f6d41 440MB
docker-platform-monitor latest c3c08f5f6d41 440MB
docker-sflow 202405.0-dirty-20240830.091822 9da2da5cca1c 344MB
docker-sflow latest 9da2da5cca1c 344MB
docker-router-advertiser 202405.0-dirty-20240830.091822 c932382d33d1 315MB
docker-router-advertiser latest c932382d33d1 315MB
docker-orchagent 202405.0-dirty-20240830.091822 31fa919519aa 356MB
docker-orchagent latest 31fa919519aa 356MB
docker-nat 202405.0-dirty-20240830.091822 85a7be8ce26d 346MB
docker-nat latest 85a7be8ce26d 346MB
docker-iccpd 202405.0-dirty-20240830.091822 d44f59428033 344MB
docker-iccpd latest d44f59428033 344MB
docker-database 202405.0-dirty-20240830.091822 59cefa77b041 323MB
docker-database latest 59cefa77b041 323MB
docker-eventd 202405.0-dirty-20240830.091822 bbe4d9b78786 314MB
docker-eventd latest bbe4d9b78786 314MB
docker-syncd-brcm 202405.0-dirty-20240830.091822 3f34d16e8e42 717MB
docker-syncd-brcm latest 3f34d16e8e42 717MB
docker-gbsyncd-broncos 202405.0-dirty-20240830.091822 6ac692db5646 354MB
docker-gbsyncd-broncos latest 6ac692db5646 354MB
docker-gbsyncd-credo 202405.0-dirty-20240830.091822 3f63e3eb401e 327MB
docker-gbsyncd-credo latest 3f63e3eb401e 327MB
The fix is applied:
# cat /usr/share/sonic/device/x86_64-dellemc_s5248f_c3538-r0/DellEMC-S5248f-P-25G/td3-s5248f-25g.config.bcm
...
mem_cache_enable=0
lpm_scaling_enable=0
bcm_num_cos=10
default_cpu_tx_queue=9
host_as_route_disable=1
sai_eapp_config_file=/etc/broadcom/eapps_cfg.json
sai_fast_convergence_support=1
flow_init_mode=1
sai_load_hw_config=/usr/lib/cancun/
...
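The manual `cat` check above can also be scripted. This is a rough sketch that assumes the file is a flat list of `key=value` lines with optional `#` comments; the function names are hypothetical, and it only confirms that `flow_init_mode` is set to 1, not that the other parameters mentioned earlier in the thread are present.

```python
def parse_config_bcm(text):
    """Parse key=value lines of a config.bcm-style file into a dict.

    Assumes '#' starts a comment; lines without '=' (such as '...') are ignored.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def has_flow_init_mode(text):
    """Return True when flow_init_mode is present and set to 1."""
    return parse_config_bcm(text).get("flow_init_mode") == "1"


# Excerpt taken from the file contents quoted above.
SAMPLE = """\
mem_cache_enable=0
lpm_scaling_enable=0
...
sai_fast_convergence_support=1
flow_init_mode=1
sai_load_hw_config=/usr/lib/cancun/
"""

if __name__ == "__main__":
    print("flow_init_mode=1 present:", has_flow_init_mode(SAMPLE))
```

On a switch, the text would come from reading the HwSKU's config.bcm file instead of the inline sample.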
@tomvil could you please share the complete steps that you tried? Did you try restarting the switch after applying the NPU configs? From the logs shared, SAI API unsupported messages are seen.
@aravindmani-1 I have built the image (202405 branch) with your commit from the https://github.com/sonic-net/sonic-buildimage/pull/18505 pull request. I see the configuration is present in td3-s5248f-25g.config.bcm. And yes, I have tried restarting it.
Is there anything else I can check for you?
SAI version on my switch:
# bcmcmd "bcmsai ver"
bcmsai ver
BRCM SAI ver: [10.1.37.0], OCP SAI ver: [1.13.2], SDK ver: [sdk-6.5.29], CANCUN ver: [06.04.01]
Could you share the complete steps that you tried to recreate the issue (starting from the commands used)?
@aravindmani-1 here's how I reproduce the issue every time:
config subinterface add Ethernet0.666 666
@tomvil can you upload the "show techsupport" logs? When you hit the issue, please collect logs from the last hour using the techsupport options.
Description
Creating a subinterface on Broadcom-based (Trident 3) switches causes multiple containers to exit.
Steps to reproduce the issue:
config subinterface add EthernetXX.20 20
Describe the results you received:
Multiple containers (swss, syncd and others) exit and the switch becomes unstable. The containers are in a crash loop.
Describe the results you expected:
The subinterface is created on port EthernetXX.
Output of show version:
Additional information you deem important (e.g. issue happens only occasionally):
Broadcom SAI version:
Attaching logs right after execution of the config subinterface add command. subinterface_add_logs.txt