SovereignCloudStack / issues

This repository is used for issues that are cross-repository or not bound to a specific repository.
https://github.com/orgs/SovereignCloudStack/projects/6

Test the up-to-date version of SONiC community image on the SCS Landscape network infrastructure #706

Closed matofeder closed 4 weeks ago

matofeder commented 3 months ago

According to the following documents

the SONiC community image in version SONiC-OS-202305.0-dirty-20231102.180401 failed to show the system-health summary (tested on Accton-AS7326-56X, a.k.a. Edgecore 7326-56X-O-AC-B):

$ sudo show system-health summary 
chassis.set_status_led is not implemented
...
AttributeError: 'Chassis' object has no attribute 'initizalize_system_led'

This issue aims to test whether the most recent SONiC community build fails in the same way.

matofeder commented 3 months ago

A test with the most recent build of the SONiC community image for the Broadcom platform, sonic-broadcom-202405-619116, unfortunately failed with the same error.

$ show version 

SONiC Software Version: SONiC.202405.619116-dad1a1d90
SONiC OS Version: 12
Distribution: Debian 12.5
Kernel: 6.1.0-11-2-amd64
Build commit: dad1a1d90
Build date: Wed Aug 14 13:26:30 UTC 2024
Built by: azureuser@4a514b7ac000006

Platform: x86_64-accton_as7326_56x-r0
HwSKU: Accton-AS7326-56X
ASIC: broadcom
ASIC Count: 1
Serial Number: 732656X2317026
Model Number: FP4ZZ7656009A
Hardware Revision: N/A
Uptime: 08:49:55 up 18:01,  1 user,  load average: 0.64, 0.69, 0.74
Date: Thu 15 Aug 2024 08:49:55

$ sudo show system-health summary
chassis.set_status_led is not implemented
Traceback (most recent call last):
  File "/usr/local/bin/show", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/show/system_health.py", line 113, in summary
    _, chassis, stat = get_system_health_status()
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/show/system_health.py", line 32, in get_system_health_status
    chassis.initizalize_system_led()
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'Chassis' object has no attribute 'initizalize_system_led'

Therefore, it seems that the issue has not been addressed upstream yet. IMO it is not crucial for the main functionality of SONiC OS, but it is indeed sub-optimal.

Upstream contains a nice example of how to fix the issue on another platform: DellEMC: S5232F support for show system-health command

In addition, the VP04 team resolved the issue on Accton-AS7326-56X with this patch: https://github.com/SovereignCloudStack/sonic-buildimage/pull/1

Therefore, it would be great to clean up the above PR and publish it upstream.
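
For illustration only, here is a minimal, self-contained sketch of the kind of hooks the traceback suggests are missing. It does not reproduce the linked patches. StubChassis, its colour names, and get_status_led are assumptions made for this sketch; only initizalize_system_led (spelled as in the traceback) and set_status_led (named in the log message) come from the output above. A real platform class would talk to the LED hardware instead of storing a string.

```python
class StubChassis:
    """Hypothetical stand-in for a platform Chassis class (not the AS7326-56X code)."""

    # Colour names are assumptions for this sketch.
    SUPPORTED_COLORS = {"green", "amber", "red", "off"}

    def __init__(self):
        self._status_led = "off"

    def initizalize_system_led(self):
        # The spelling matches the call shown in the traceback.
        # A real platform would prepare its LED driver here; the stub has nothing to do.
        return True

    def set_status_led(self, color):
        # Reject colours the (imaginary) hardware does not support.
        if color not in self.SUPPORTED_COLORS:
            return False
        self._status_led = color
        return True

    def get_status_led(self):
        return self._status_led


class ChassisWithoutHook:
    """Mimics the failing case: the LED initialisation hook is absent."""


def get_system_health_status(chassis):
    # Reproduces only the failing call from the traceback, not the full upstream function.
    chassis.initizalize_system_led()
    return chassis.get_status_led()
```

Calling get_system_health_status(ChassisWithoutHook()) raises the same kind of AttributeError as in the output above, while get_system_health_status(StubChassis()) returns the stored LED colour.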

matofeder commented 1 month ago

It seems that the latest master branch of SONiC works better, so I re-opened this issue and will investigate how the latest master works with the Accton devices of SCS.

matofeder commented 1 month ago

Overall, the latest master branch works better than the latest stable version (202405) of community SONiC. It seems that the patches are available, but the community SONiC merging process is too slow (more than 800 open PRs).

It seems that the EdgeCore enterprise SONiC contains commits like this one: https://github.com/edge-core/sonic-buildimage/commit/4be14f0ac451d5d7e208e374376ddd7ae935a551, in which EdgeCore enterprise SONiC ports a lot of open patches from the community version of SONiC.

matofeder commented 1 month ago

The next step could be to build our own SCS SONiC community image with the following patches/features:

matofeder commented 1 month ago

Overall, the fixed SCS SONiC image (based on the latest community master version) works much better than the vanilla latest community master version of SONiC. Some issues still occur on the 1G and 10G switches, but again, it seems that patches have already been submitted upstream and have not been merged yet :/. The next step is to port the above PRs into the SCS SONiC build branch https://github.com/SovereignCloudStack/sonic-buildimage/pull/4 and test it.

matofeder commented 1 month ago

Overall, the fixed SCS SONiC image (based on the latest community master version) works better than the vanilla latest community master version of SONiC. One issue (related to the system-health monitor) still occurs on the 1G switch, but it is not considered blocking.