sonic-net / sonic-buildimage

Scripts which perform an installable binary image build for SONiC

[202012] [Dell S5248F] Platform modules missing for DellEMC-S5248f #10887

Closed jeff-yin closed 2 years ago

jeff-yin commented 2 years ago

Opening this based on query from mailing list: https://groups.google.com/g/sonicproject/c/Yv6R-_ubBzg/m/GOf6gfgODAAJ?utm_medium=email&utm_source=footer&pli=1

Description

pmon fails to start due to a traceback thrown in decode-syseeprom, which leads to the platform modules not loading correctly.

Steps to reproduce the issue:

Install the 202012 sonic-broadcom image to the Dell S5248F-ON.

Describe the results you received:

show platform syseeprom
Traceback (most recent call last):
  File "/usr/local/bin/decode-syseeprom", line 171, in <module>
    exit(main())
  File "/usr/local/bin/decode-syseeprom", line 58, in main
    run(t, opts, args, support_eeprom_db)
  File "/usr/local/bin/decode-syseeprom", line 87, in run
    err = target.read_eeprom_db()
  File "/usr/local/lib/python3.7/dist-packages/sonic_eeprom/eeprom_tlvinfo.py", line 283, in read_eeprom_db
    db_state = self._redis_hget('EEPROM_INFO|State', 'Initialized')
  File "/usr/local/lib/python3.7/dist-packages/sonic_eeprom/eeprom_tlvinfo.py", line 630, in _redis_hget
    value = self.redis_client.hget(key, field)
  File "/usr/local/lib/python3.7/dist-packages/sonic_eeprom/eeprom_tlvinfo.py", line 625, in redis_client
    if not self._redis_client:
AttributeError: 'board' object has no attribute '_redis_client'
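The last frame shows the `redis_client` property reading `self._redis_client` before anything has assigned it. Below is a minimal sketch of one way that can happen, assuming (this is not confirmed anywhere in this thread) that a subclass overrides `__init__` without running the base initializer. Both class names are hypothetical stand-ins and are not the actual sonic_eeprom code.

```python
class TlvDecoderSketch:
    """Hypothetical stand-in for a base EEPROM decoder class."""

    def __init__(self):
        # Assumption: the base initializer is the only place the attribute is created.
        self._redis_client = None

    @property
    def redis_client(self):
        # Same check as the failing line in the traceback.
        if not self._redis_client:
            self._redis_client = object()  # placeholder for a real Redis client
        return self._redis_client


class BoardSketch(TlvDecoderSketch):
    """Hypothetical platform class that skips the base initializer."""

    def __init__(self):
        pass  # super().__init__() is never called, so _redis_client never exists


def try_access(obj):
    """Return the AttributeError message, or None if the access succeeds."""
    try:
        obj.redis_client
    except AttributeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(try_access(TlvDecoderSketch()))
    print(try_access(BoardSketch()))
```

Whether this is what happens on the S5248F depends on which plugin and library versions ended up on the box, so treat it only as an illustration of the error class.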

PMON docker details:

docker exec -it pmon bash
root@sonic:/usr/local/bin# python3 /usr/local/bin/thermalctld
Traceback (most recent call last):
  File "/usr/local/bin/thermalctld", line 14, in <module>
    import sonic_platform
ImportError: No module named sonic_platform
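A quick way to confirm the missing package without starting a daemon is to ask the interpreter whether it can locate the module. This is only a sketch: `module_available` is a helper name made up for this example, and it should be run inside the pmon container with the same interpreter the daemons use.

```python
import importlib.util


def module_available(name):
    """Return True if `name` can be located on the current import path."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for broken or half-initialized entries.
        return False


if __name__ == "__main__":
    name = "sonic_platform"
    print(name, "found" if module_available(name) else "missing")
```

If the module is reported missing, the platform package that provides it was most likely never installed, which is consistent with the dpkg failure in the logs further down.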

Describe the results you expected:

pmon should start successfully and there should not be any traceback errors in show platform syseeprom

Output of show version:

SONiC Version:

SONiC Software Version: SONiC.202012.100954-acfee3be9
Distribution: Debian 10.12
Kernel: 4.19.0-12-2-amd64
Build commit: acfee3be9
Build date: Thu May 19 13:38:29 UTC 2022
Built by: AzDevOps@sonic-build-workers-001IJN

Platform: x86_64-dellemc_s5248f_c3538-r0
HwSKU: DellEMC-S5248f-P-25G
ASIC: broadcom
ASIC Count: 1
Serial Number: 
Uptime: 12:32:27 up 22 min,  1 user,  load average: 0.16, 0.18, 0.18

Docker images:
REPOSITORY                    TAG                       IMAGE ID            SIZE
docker-sonic-mgmt-framework   202012.100954-acfee3be9   3043b6de019e        687MB
docker-sonic-mgmt-framework   latest                    3043b6de019e        687MB
docker-sonic-telemetry        202012.100954-acfee3be9   d26dc75a6ef3        451MB
docker-sonic-telemetry        latest                    d26dc75a6ef3        451MB
docker-fpm-frr                202012.100954-acfee3be9   b9ae3f4cea2e        391MB
docker-fpm-frr                latest                    b9ae3f4cea2e        391MB
docker-sflow                  202012.100954-acfee3be9   27e297133b7a        374MB
docker-sflow                  latest                    27e297133b7a        374MB
docker-nat                    202012.100954-acfee3be9   60b4fad032f2        376MB
docker-nat                    latest                    60b4fad032f2        376MB
docker-teamd                  202012.100954-acfee3be9   dfd090573019        373MB
docker-teamd                  latest                    dfd090573019        373MB
docker-orchagent              202012.100954-acfee3be9   cd5c3153656d        390MB
docker-orchagent              latest                    cd5c3153656d        390MB
docker-platform-monitor       202012.100954-acfee3be9   1d9f6c11e627        544MB
docker-platform-monitor       latest                    1d9f6c11e627        544MB
docker-snmp                   202012.100954-acfee3be9   d0fd0c2b84b6        405MB
docker-snmp                   latest                    d0fd0c2b84b6        405MB
docker-syncd-brcm             202012.100954-acfee3be9   b9c85ebbe4a1        654MB
docker-syncd-brcm             latest                    b9c85ebbe4a1        654MB
docker-router-advertiser      202012.100954-acfee3be9   af94a56e0148        362MB
docker-router-advertiser      latest                    af94a56e0148        362MB
docker-lldp                   202012.100954-acfee3be9   951ec937bab9        402MB
docker-lldp                   latest                    951ec937bab9        402MB
docker-dhcp-relay             202012.100954-acfee3be9   7cedcd5b5c5f        375MB
docker-dhcp-relay             latest                    7cedcd5b5c5f        375MB
docker-database               202012.100954-acfee3be9   580fd16d5bb6        362MB
docker-database               latest                    580fd16d5bb6        362MB
docker-mux                    202012.100954-acfee3be9   8cdc6f4f43c3        414MB
docker-mux                    latest                    8cdc6f4f43c3        414MB

Output of show techsupport:

Not provided, but if the repro is otherwise not straightforward, let's reach out to Madhu Paluru @ Aviz.

Additional information you deem important (e.g. issue happens only occasionally):

More logs provided by Madhu:

May 19 12:09:43 sonic rc.local[433]: + dpkg -i /host/image-202012.100954-acfee3be9/platform/x86_64-dellemc_s5248f_c3538-r0/platform-modules-s5248f_1.1_amd64.deb
May 19 12:09:43 sonic rc.local[433]: + dpkg -i /host/image-202012.100954-acfee3be9/platform/x86_64-dellemc_s5248f_c3538-r0/platform-modules-s5248f_1.1_amd64.deb
May 19 12:09:43 sonic rc.local[555]: dpkg-deb: error: '/host/image-202012.100954-acfee3be9/platform/x86_64-dellemc_s5248f_c3538-r0/platform-modules-s5248f_1.1_amd64.deb' is not a Debian format archive
May 19 12:09:43 sonic rc.local[555]: dpkg-deb: error: '/host/image-202012.100954-acfee3be9/platform/x86_64-dellemc_s5248f_c3538-r0/platform-modules-s5248f_1.1_amd64.deb' is not a Debian format archive
May 19 12:09:43 sonic rc.local[550]: dpkg: error processing archive /host/image-202012.100954-acfee3be9/platform/x86_64-dellemc_s5248f_c3538-r0/platform-modules-s5248f_1.1_amd64.deb (--install):
May 19 12:09:43 sonic rc.local[550]: dpkg: error processing archive /host/image-202012.100954-acfee3be9/platform/x86_64-dellemc_s5248f_c3538-r0/platform-modules-s5248f_1.1_amd64.deb (--install):
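
The `dpkg-deb` message above indicates that the file does not start with the `ar` archive header that a .deb package begins with. A small sketch for checking this on the switch follows; the helper names are made up for this example, and the path to inspect is passed on the command line.

```python
import os
import sys

AR_MAGIC = b"!<arch>\n"  # global header of the ar format used by .deb packages


def looks_like_deb(path):
    """Return True if the file starts with the ar archive magic."""
    with open(path, "rb") as f:
        return f.read(len(AR_MAGIC)) == AR_MAGIC


def describe(path):
    """Return (size_in_bytes, starts_with_ar_magic) for the given file."""
    return os.path.getsize(path), looks_like_deb(path)


if __name__ == "__main__":
    for path in sys.argv[1:]:
        size, ok = describe(path)
        status = "ar header present" if ok else "NOT an ar archive"
        print(f"{path}: {size} bytes, {status}")
```

A zero-byte or very small file, or one whose first bytes are something other than `!<arch>`, would show that the package was damaged or replaced on disk before `rc.local` tried to install it.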

jeff-yin commented 2 years ago

Please assign to @aravindmani-1 for initial triage

madhupalu commented 2 years ago

@jeff-yin

Thanks for tracking the issue - https://github.com/Azure/sonic-buildimage/issues/10887

May I know who will be working on this? Note: we see there is a fix made to reduce image size (https://github.com/Azure/sonic-buildimage/pull/10775/files); the issue got addressed in 202211 and a backport has been requested for 202012. Could someone from Dell or BRCM check whether this can fix this issue?

KrupakarAnnam commented 2 years ago

Hi Team, attaching the 'show techsupport' dump here: sonic_dump_sonic_20220522_234126.tar.gz

abdosi commented 2 years ago

@jeff-yin is the installation of the new image done via sonic installer or ONIE?

jeff-yin commented 2 years ago

> @jeff-yin is the installation of the new image done via sonic installer or ONIE?

@madhupalu @KrupakarAnnam can you respond? I'm assuming it's via ONIE.

abdosi commented 2 years ago

cc @anamehra

KrupakarAnnam commented 2 years ago

Hi Jeff, we ran some more tests.

The issue happens on the 202012 branch with ONIE installation only.

Reverting the PR solved the issue.

Installing the 202012 build from 22nd May using ONIE reproduces the Debian file loading issue. Installing a build with PR https://github.com/Azure/sonic-buildimage/pull/10775/files reverted, also using ONIE, no longer shows the issue.

aravindmani-1 commented 2 years ago

Thanks @KrupakarAnnam. We observed the same issue on the DellEMC S5232f as well. I believe we will hit this issue on most platforms. @abdosi, @xumia could you please share your insights on this issue?

abdosi commented 2 years ago

@xumia can we revert https://github.com/Azure/sonic-buildimage/pull/10775/files until you have a complete, tested fix?

aravindmani-1 commented 2 years ago

@KrupakarAnnam Please confirm whether this issue can be closed.