sonic-net / sonic-platform-daemons

Platform module daemons for SONiC

Update for the procedures for insertion/hot swap of Switch Fabric Module (SFM) by using "config chassis modules shutdown/startup" commands #491

Closed JunhongMao closed 6 months ago

JunhongMao commented 6 months ago

Why I did it

On Nokia SONiC chassis, the procedure for inserting or hot-swapping a Switch Fabric Module (SFM) previously relied on the following command:

sudo nokia_cmd set shutdown-sfm <SFM-Num/Physical-Slot>

The following four PRs add the commands below for the equivalent operations:
https://github.com/sonic-net/sonic-platform-daemons/pull/491
https://github.com/sonic-net/sonic-utilities/pull/3283
https://github.com/nokia/sonic-platform/pull/6
https://github.com/sonic-net/sonic-buildimage/pull/18938

sudo config chassis modules shutdown/startup <module name>
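For scripted testing, the command form above can be assembled programmatically. The following is a minimal sketch, not part of this PR: the helper name `build_module_command` and the `VALID_ACTIONS` tuple are hypothetical, and only the command syntax itself comes from the description above.

```python
import shlex

# The two actions named in this PR description.
VALID_ACTIONS = ("shutdown", "startup")


def build_module_command(action, module_name):
    """Return the argv list for `sudo config chassis modules <action> <module name>`.

    Hypothetical helper for illustration; it only builds the command, it does not run it.
    """
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    return ["sudo", "config", "chassis", "modules", action, module_name]


if __name__ == "__main__":
    # Print the command as it would be typed on the supervisor card.
    print(shlex.join(build_module_command("shutdown", "FABRIC-CARD3")))
```

The returned list could be passed to `subprocess.run(..., check=True)` on the supervisor card, assuming the caller has sudo rights.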

The HLD for Shutdown and Startup of the Fabric Module is below: https://github.com/sonic-net/SONiC/pull/1694

The following PR has been superseded: https://github.com/sonic-net/sonic-buildimage/pull/18578

How I did it

  1. When the CLI command "sudo config chassis modules startup/shutdown" runs, it directly calls config/fabric_module_set_admin_status.py to perform the corresponding operation.

How to verify it

The following test was carried out on the FABRIC-CARD3 module on the supervisor card.
1. Shutdown
sudo config chassis modules shutdown FABRIC-CARD3

2. Check the status to confirm that FABRIC-CARD3 is down.
$ show chassis modules status
        Name             Description    Physical-Slot    Oper-Status    Admin-Status       Serial
------------  ----------------------  ---------------  -------------  --------------  -----------
...
FABRIC-CARD3             Unavailable                4          Empty            down          N/A

sudo tail -f /var/log/syslog | grep "pmon#chassisd:"
May  1 00:07:54.192037 ixre-cpm-chassis15 WARNING pmon#chassisd: Module FABRIC-CARD3 went off-line!
 ...

3. Start up the module
sudo config chassis modules startup FABRIC-CARD3

4. Check the status
$ show chassis modules status
        Name             Description    Physical-Slot    Oper-Status    Admin-Status       Serial
------------  ----------------------  ---------------  -------------  --------------  -----------
...
FABRIC-CARD3                    SFM4                4         Online              up  01214400362

sudo tail -f /var/log/syslog | grep "pmon#chassisd:"
May  1 00:26:29.501687 ixre-cpm-chassis15 NOTICE pmon#chassisd: Module FABRIC-CARD3 recovered on-line!

5. Verify that the setting persists across a system reboot. For example, shut the module down,
save the config, and reboot; the module should remain shut down.
$ sudo config save
Existing files will be overwritten, continue? [y/N]: y

Then check the status to confirm that FABRIC-CARD3 is still down.
$ show chassis modules status
        Name             Description    Physical-Slot    Oper-Status    Admin-Status       Serial
------------  ----------------------  ---------------  -------------  --------------  -----------
...
FABRIC-CARD3             Unavailable                4          Empty            down          N/A
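
The status checks in the steps above could be automated by parsing the table text. The following is a minimal sketch, not part of this PR: `parse_module_status` and `module_state` are hypothetical helper names, and the parsing assumes columns are separated by two or more spaces, as in the output shown above.

```python
import re

# Sample outputs copied from the verification steps above.
SHUTDOWN_OUTPUT = """\
        Name             Description    Physical-Slot    Oper-Status    Admin-Status       Serial
------------  ----------------------  ---------------  -------------  --------------  -----------
...
FABRIC-CARD3             Unavailable                4          Empty            down          N/A
"""

STARTUP_OUTPUT = """\
        Name             Description    Physical-Slot    Oper-Status    Admin-Status       Serial
------------  ----------------------  ---------------  -------------  --------------  -----------
...
FABRIC-CARD3                    SFM4                4         Online              up  01214400362
"""


def parse_module_status(output):
    """Parse `show chassis modules status` text into {module name: {column: value}}."""
    rows = {}
    header = None
    for line in output.splitlines():
        stripped = line.strip()
        # Assumption: columns are separated by runs of two or more spaces.
        fields = re.split(r"\s{2,}", stripped)
        if header is None:
            if fields[0] == "Name":
                header = fields
            continue
        # Skip the dashed separator, elided "..." rows, and malformed lines.
        if len(fields) != len(header) or set(stripped) <= {"-", " "}:
            continue
        rows[fields[0]] = dict(zip(header, fields))
    return rows


def module_state(output, module):
    """Return (Oper-Status, Admin-Status) for one module."""
    row = parse_module_status(output)[module]
    return row["Oper-Status"], row["Admin-Status"]


if __name__ == "__main__":
    print(module_state(SHUTDOWN_OUTPUT, "FABRIC-CARD3"))
    print(module_state(STARTUP_OUTPUT, "FABRIC-CARD3"))
```

A test script could compare the returned pair against the values seen in this test run: Empty/down after shutdown and Online/up after startup.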


JunhongMao commented 6 months ago

This PR replaces https://github.com/sonic-net/sonic-platform-daemons/pull/475.

The PR should be raised against master first, merged there, and later cherry-picked to 202205. The modifications in the two PRs are the same. The previous review comments can be found at https://github.com/sonic-net/sonic-platform-daemons/pull/475.

JunhongMao commented 6 months ago

@mlok-nokia, @judyjoseph, please review and approve. Thanks.

judyjoseph commented 6 months ago

/azp run

azure-pipelines[bot] commented 6 months ago
Azure Pipelines successfully started running 1 pipeline(s).