sonic-net / sonic-buildimage

Scripts which perform an installable binary image build for SONiC

[Nokia] Restarting SWSS followed by issuing shut or no shut causes no change on admin status #12539

Closed mihirpat1 closed 2 years ago

mihirpat1 commented 2 years ago

Description

On the Nokia IXR7250e, the admin status of a port remains unchanged for any shut or no shut configuration applied after the SWSS docker container is restarted.

Steps to reproduce the issue:

  1. Restart SWSS docker container (docker restart swss0)
  2. For a port in the shut state, issue no shut; for a port in the no shut state, issue shut. The admin status remains unchanged in both cases.

Describe the results you received:

The admin status remains unchanged after the shut or no shut command is issued.

Describe the results you expected:

The admin status should change to down if a shut command is issued after restarting SWSS, and to up if a no shut command is issued after restarting SWSS.
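The expectation above can be checked mechanically. The following is a minimal sketch, not part of the original report: `admin_status_of` is a hypothetical helper, and it assumes the Admin value is the ninth whitespace-separated column of a `show interfaces status` row, as in the output quoted in this issue.

```shell
#!/bin/sh
# Hypothetical helper: read `show interfaces status` rows on stdin and print
# the Admin column for the given port.
# Assumed column order (taken from the rows quoted in this issue):
#   Interface Lanes Speed MTU FEC Alias Vlan Oper Admin Type... Asym-PFC
admin_status_of() {
    port="$1"
    # Match the row whose first field is exactly the port name.
    awk -v port="$port" '$1 == port { print $9; exit }'
}

# Possible use on the device (not executed here):
#   config interface -n asic0 shutdown Ethernet10
#   status=$(show interfaces status | admin_status_of Ethernet10)
#   [ "$status" = "down" ] || echo "admin status unchanged: $status"
```

With the Ethernet10 row from this report as input, the helper prints the value from the Admin column, which is the field that failed to change after the restart.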

Output of show version:

https://github.com/sonic-net/sonic-buildimage/files/9883766/tech_support_nokia.txt

root@str2-7250-lc1-1:/home/admin# show version

SONiC Software Version: SONiC.20220531C.08C
Distribution: Debian 11.5
Kernel: 5.10.0-12-2-amd64
Build commit: 0ba933f527
Build date: Fri Oct 21 01:52:07 UTC 2022
Built by: cloudtest@e722c088c000001

Platform: x86_64-nokia_ixr7250e_36x400g-r0
HwSKU: Nokia-IXR7250E-36x400G
ASIC: broadcom
ASIC Count: 2
Serial Number: 19688-0038
Model Number: 3HE12578AARA01
Hardware Revision: Unknown
Uptime: 23:43:50 up 22:36, 2 users, load average: 0.78, 1.01, 1.28
Date: Thu 27 Oct 2022 23:43:50

Docker images:
REPOSITORY                  TAG             IMAGE ID       SIZE
docker-orchagent            20220531C.08C   406b46c6a94e   478MB
docker-orchagent            latest          406b46c6a94e   478MB
docker-fpm-frr              20220531C.08C   1596b8441c2e   489MB
docker-fpm-frr              latest          1596b8441c2e   489MB
docker-teamd                20220531C.08C   d61e29100837   459MB
docker-teamd                latest          d61e29100837   459MB
docker-macsec               latest          ad49f0cb5fce   461MB
docker-syncd-brcm-dnx       20220531C.08C   893d808ac86d   790MB
docker-syncd-brcm-dnx       latest          893d808ac86d   790MB
docker-gbsyncd-broncos      20220531C.08C   456219fc9274   490MB
docker-gbsyncd-broncos      latest          456219fc9274   490MB
docker-gbsyncd-credo        20220531C.08C   818665234d3f   461MB
docker-gbsyncd-credo        latest          818665234d3f   461MB
docker-dhcp-relay           latest          9cebb5b5a02d   453MB
docker-snmp                 20220531C.08C   0c329307c6ed   488MB
docker-snmp                 latest          0c329307c6ed   488MB
docker-platform-monitor     20220531C.08C   511a76d25817   565MB
docker-platform-monitor     latest          511a76d25817   565MB
docker-sonic-telemetry      20220531C.08C   d2e91f6f2f78   524MB
docker-sonic-telemetry      latest          d2e91f6f2f78   524MB
docker-router-advertiser    20220531C.08C   e8d102a088bc   443MB
docker-router-advertiser    latest          e8d102a088bc   443MB
docker-mux                  20220531C.08C   ac8f64edbce9   492MB
docker-mux                  latest          ac8f64edbce9   492MB
docker-lldp                 20220531C.08C   22ac65399606   485MB
docker-lldp                 latest          22ac65399606   485MB
docker-database             20220531C.08C   683864b2a80e   443MB
docker-database             latest          683864b2a80e   443MB
docker-acms                 20220531C.08C   0411d1c65e5b   490MB
docker-acms                 latest          0411d1c65e5b   490MB

root@str2-7250-lc1-1:/home/admin#




#### Additional information you deem important (e.g. issue happens only occasionally):

Before SWSS was restarted, Ethernet10 was in the admin up state and Ethernet11 was in the admin down state.

After restarting SWSS and issuing shut on port Ethernet10, the admin status is still up.

Before issuing shut:
root@str2-7250-lc1-1:/home/admin# show int statu | grep Ethernet10
Ethernet10 56,57,58,59,60,61,62,63 400G 9100 N/A Ethernet10 routed down up QSFP-DD Double Density 8X Pluggable Transceiver off
root@str2-7250-lc1-1:/home/admin#

After issuing shut:
root@str2-7250-lc1-1:/home/admin# config interface -n asic0 sh Ethernet10
root@str2-7250-lc1-1:/home/admin# show int statu | grep Ethernet10
Ethernet10 56,57,58,59,60,61,62,63 400G 9100 N/A Ethernet10 routed down up QSFP-DD Double Density 8X Pluggable Transceiver off
root@str2-7250-lc1-1:/home/admin#

PMON log:
Oct 27 23:34:12.063009 str2-7250-lc1-1 WARNING pmon#sfp: $$$ Ethernet10 handle_port_update_event() : op=SET DB:CONFIG_DB Table:PORT fvp {'alias': 'Ethernet10', 'asic_port_name': 'Eth10-ASIC0', 'coreid': '0', 'coreportid': '11', 'description': 'Ethernet10', 'index': '11', 'lanes': '56,57,58,59,60,61,62,63', 'mtu': '9100', 'numvoq': '8', 'pfc_asym': 'off', 'role': 'Ext', 'speed': '400000', 'tpid': '0x8100', 'admin_status': 'down'}
Oct 27 23:34:12.063030 str2-7250-lc1-1 WARNING pmon#sfp: *** Ethernet10CONFIG_DBPORT handle_port_update_event() fvp {'alias': 'Ethernet10', 'asic_port_name': 'Eth10-ASIC0', 'coreid': '0', 'coreportid': '11', 'description': 'Ethernet10', 'index': '11', 'lanes': '56,57,58,59,60,61,62,63', 'mtu': '9100', 'numvoq': '8', 'pfc_asym': 'off', 'role': 'Ext', 'speed': '400000', 'tpid': '0x8100', 'admin_status': 'down', 'key': 'Ethernet10', 'asic_id': 0, 'op': 'SET'}
Oct 27 23:34:12.098573 str2-7250-lc1-1 NOTICE pmon#sfp: CMIS: Ethernet10: 400G, lanemask=0xff, state=INSERTED, retries=0
Oct 27 23:34:12.098573 str2-7250-lc1-1 NOTICE pmon#sfp: CMIS: Ethernet10 Forcing Tx laser OFF

After issuing no shut on a port which was already in the shut state (Ethernet11):

Before issuing no shut:
root@str2-7250-lc1-1:/home/admin# show int statu | grep Ethernet11
Ethernet11 48,49,50,51,52,53,54,55 400G 9100 N/A Ethernet11 routed down down QSFP-DD Double Density 8X Pluggable Transceiver off

After issuing no shut:
root@str2-7250-lc1-1:/home/admin# config interface -n asic0 st Ethernet11
root@str2-7250-lc1-1:/home/admin# show int statu | grep Ethernet11
Ethernet11 48,49,50,51,52,53,54,55 400G 9100 N/A Ethernet11 routed down down QSFP-DD Double Density 8X Pluggable Transceiver off
root@str2-7250-lc1-1:/home/admin# show int statu | grep Ethernet11
Ethernet11 48,49,50,51,52,53,54,55 400G 9100 N/A Ethernet11 routed down down QSFP-DD Double Density 8X Pluggable Transceiver off
root@str2-7250-lc1-1:/home/admin#

PMON log:
Oct 27 23:38:42.830418 str2-7250-lc1-1 WARNING pmon#sfp: $$$ Ethernet11 handle_port_update_event() : op=SET DB:CONFIG_DB Table:PORT fvp {'alias': 'Ethernet11', 'asic_port_name': 'Eth11-ASIC0', 'coreid': '0', 'coreportid': '12', 'description': 'Ethernet11', 'index': '12', 'lanes': '48,49,50,51,52,53,54,55', 'mtu': '9100', 'numvoq': '8', 'pfc_asym': 'off', 'role': 'Ext', 'speed': '400000', 'tpid': '0x8100', 'admin_status': 'up'}
Oct 27 23:38:42.830418 str2-7250-lc1-1 WARNING pmon#sfp: *** Ethernet11CONFIG_DBPORT handle_port_update_event() fvp {'alias': 'Ethernet11', 'asic_port_name': 'Eth11-ASIC0', 'coreid': '0', 'coreportid': '12', 'description': 'Ethernet11', 'index': '12', 'lanes': '48,49,50,51,52,53,54,55', 'mtu': '9100', 'numvoq': '8', 'pfc_asym': 'off', 'role': 'Ext', 'speed': '400000', 'tpid': '0x8100', 'admin_status': 'up', 'key': 'Ethernet11', 'asic_id': 0, 'op': 'SET'}
Oct 27 23:38:42.862460 str2-7250-lc1-1 NOTICE pmon#sfp: CMIS: Ethernet11: 400G, lanemask=0xff, state=INSERTED, retries=0
Oct 27 23:38:42.862460 str2-7250-lc1-1 NOTICE pmon#sfp: CMIS: Ethernet11 Forcing Tx laser OFF
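The PMON entries above show that CONFIG_DB did receive the new admin_status ('down' for Ethernet10, 'up' for Ethernet11) even though the `show int statu` output did not change. As a rough sketch for comparing the two, the function below pulls the admin_status value out of a `handle_port_update_event()` log line. `config_db_admin_status` is a hypothetical name, and the pattern assumes the log format matches the excerpts quoted here.

```shell
#!/bin/sh
# Hypothetical helper: read pmon#sfp log lines on stdin and print the
# admin_status value carried in each handle_port_update_event() entry.
# Lines without an 'admin_status' field produce no output.
config_db_admin_status() {
    sed -n "s/.*'admin_status': '\([a-z]*\)'.*/\1/p"
}

# Possible use (the log path is an assumption, not taken from this issue):
#   grep 'Ethernet10 handle_port_update_event' /var/log/syslog | config_db_admin_status
```

If the value printed here differs from the Admin column reported by the CLI for the same port, the configuration change reached CONFIG_DB but was not applied further down, which matches the symptom described in this issue.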

mihirpat1 commented 2 years ago

Closing this issue, since restarting swss0 with "docker restart swss0" is not the recommended method. I also verified that shut and no shut work as expected when I use "systemctl restart swss@0.service" instead.
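For anyone scripting the restart, a small sketch of the mapping from container name to systemd unit follows. `swss_unit_for` is a hypothetical helper, and the `swss<N>` to `swss@<N>.service` pattern is inferred only from the single swss0 example in this comment.

```shell
#!/bin/sh
# Hypothetical helper: map a per-ASIC container name such as "swss0" to the
# systemd unit name "swss@0.service".
# Assumption: the container name is "swss" followed by the ASIC index.
swss_unit_for() {
    container="$1"
    idx="${container#swss}"    # strip the leading "swss", leaving the index
    case "$idx" in
        ''|*[!0-9]*)
            echo "unexpected container name: $container" >&2
            return 1
            ;;
    esac
    printf 'swss@%s.service\n' "$idx"
}

# Possible use on the device (not executed here):
#   sudo systemctl restart "$(swss_unit_for swss0)"
```

Restarting through systemctl rather than docker is the approach the comment above reports as working.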