sonic-net / sonic-mgmt

Configuration management examples for SONiC

Several platform tests related to PSU failed due to various issues on Celestica DX010 platform #6518

Open assrinivasan opened 2 years ago

assrinivasan commented 2 years ago

Description

Several PSU-related test cases failed; see the error messages below for each test case. This is a known Celestica issue. This issue is filed so that these test cases can be xfailed for now; the xfail will be removed once the issue is resolved.
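For illustration, a temporary xfail of this kind could be sketched roughly as follows. This is only a minimal sketch and not how sonic-mgmt actually marks these tests: the `KNOWN_FAILURES` table, the `xfail_reason` helper, and the `DUT_PLATFORM` environment variable are hypothetical names. Only the test IDs and the platform string are taken from this issue.

```python
import os

# Platform string as reported by `show version` in this issue.
DX010_PLATFORM = "x86_64-cel_seastone-r0"
ISSUE = "sonic-mgmt issue #6518"

# Test IDs reported as failing in this issue (hypothetical table name).
KNOWN_FAILURES = {
    "platform_tests/api/test_psu.py::TestPsuApi::test_get_presence",
    "platform_tests/api/test_psu.py::TestPsuApi::test_get_status",
    "platform_tests/api/test_psu.py::TestPsuApi::test_power",
    "platform_tests/api/test_psu_fans.py::TestPsuFans::test_get_presence",
    "platform_tests/api/test_psu_fans.py::TestPsuFans::test_set_fans_led",
    "snmp/test_snmp_psu.py::test_snmp_numpsu",
    "snmp/test_snmp_psu.py::test_snmp_psu_status",
    "platform_tests/api/test_thermal.py::test_set_low_threshold",
    "platform_tests/daemon/test_psud.py::test_pmon_psud_kill_and_start_status",
}


def xfail_reason(nodeid, platform):
    """Return an xfail reason string, or None if the test should run normally."""
    if platform != DX010_PLATFORM:
        return None
    # Drop a parametrization suffix such as "[dut-name]" before matching.
    base_id = nodeid.split("[", 1)[0]
    if base_id in KNOWN_FAILURES:
        return "Known Celestica DX010 failure, see " + ISSUE
    return None


def pytest_collection_modifyitems(config, items):
    """conftest.py hook sketch: attach an xfail marker to the known failures."""
    import pytest  # imported here so the helper above works without pytest

    # Assumption: the DUT platform is supplied through a hypothetical
    # DUT_PLATFORM environment variable; a real run would query the DUT.
    platform = os.environ.get("DUT_PLATFORM", "")
    for item in items:
        reason = xfail_reason(item.nodeid, platform)
        if reason:
            item.add_marker(pytest.mark.xfail(reason=reason))
```

Keeping the failing IDs in one table means the xfail can be removed in a single place once the platform fix lands.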

Steps to reproduce the issue:

  1. Run any PSU-related test

API Tests:

  platform_tests/api/test_psu.py::TestPsuApi::test_get_presence
  platform_tests/api/test_psu.py::TestPsuApi::test_get_status
  platform_tests/api/test_psu.py::TestPsuApi::test_power
  platform_tests/api/test_psu_fans.py::TestPsuFans::test_get_presence
  platform_tests/api/test_psu_fans.py::TestPsuFans::test_set_fans_led

SNMP Tests:

  snmp/test_snmp_psu.py::test_snmp_numpsu
  snmp/test_snmp_psu.py::test_snmp_psu_status

Other Tests:

  platform_tests/api/test_thermal.py::test_set_low_threshold
  platform_tests/daemon/test_psud.py::test_pmon_psud_kill_and_start_status

Describe the results you received:

API tests failure message:

  "BadStatusLine: No status line received - the server has closed the connection"

SNMP tests failure message:

  "KeyError: 'snmp_psu'"

Other tests failures:

  test_set_low_threshold: "Failed: Failed to set set_low_threshold for thermal 0 to 20"
  test_pmon_psud_kill_and_start_status: "Failed: psud expected restarted status is RUNNING but is FATAL"
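The SNMP failure is a plain dictionary lookup error: the facts collected from the DUT have no 'snmp_psu' entry. A minimal sketch of how such a lookup could be guarded to give a clearer failure message is shown below. The `get_psu_facts` helper and the stand-in facts dictionary are hypothetical and are not the sonic-mgmt implementation.

```python
def get_psu_facts(snmp_facts):
    """Return the PSU section of the SNMP facts, or fail with a clear message."""
    psu_facts = snmp_facts.get("snmp_psu")
    if psu_facts is None:
        raise AssertionError(
            "SNMP facts contain no 'snmp_psu' section; available keys: "
            + ", ".join(sorted(snmp_facts))
        )
    return psu_facts


# Stand-in facts with made-up keys, mimicking a DUT that reports no PSU data.
facts_without_psu = {"example_section_a": {}, "example_section_b": {}}

# A direct lookup reproduces the error class seen in the test failure.
try:
    facts_without_psu["snmp_psu"]
except KeyError as err:
    print("direct lookup fails with KeyError:", err)

# The guarded lookup reports which sections were actually collected.
try:
    get_psu_facts(facts_without_psu)
except AssertionError as err:
    print("guarded lookup fails with:", err)
```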

Describe the results you expected

API tests: Need to investigate whether these APIs are supported by Celestica for this platform.

SNMP tests: Add the 'snmp_psu' OID to the MIB.

test_set_low_threshold: Need to update the platform.json file for DX010 according to the platform.json enhancements design doc.

test_pmon_psud_kill_and_start_status: Need to investigate why psud exits and fix it.
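As a starting point for the psud investigation, the daemon state can be read from the supervisor status output for the platform monitor container. The sketch below assumes status lines in the usual `supervisorctl status` layout (name, state, description). The sample lines are made up for illustration and were not captured from the DUT, and both helper names are hypothetical.

```python
def parse_supervisor_status(line):
    """Split one status line into (name, state, detail)."""
    parts = line.split(None, 2)
    if len(parts) < 2:
        raise ValueError("not a status line: %r" % line)
    name, state = parts[0], parts[1]
    detail = parts[2].strip() if len(parts) > 2 else ""
    return name, state, detail


def check_psud(status_output):
    """Return (ok, message) for the psud entry in the status output."""
    for line in status_output.splitlines():
        if not line.strip():
            continue
        name, state, detail = parse_supervisor_status(line)
        if name == "psud":
            if state == "RUNNING":
                return True, "psud is RUNNING"
            return False, "psud is %s: %s" % (state, detail)
    return False, "psud not found in status output"


# Illustrative samples only (not taken from the DUT in this issue).
SAMPLE_OK = "psud    RUNNING   pid 42, uptime 0:01:07"
SAMPLE_FATAL = "psud    FATAL     Exited too quickly (process log may have details)"

for sample in (SAMPLE_OK, SAMPLE_FATAL):
    print(check_psud(sample))
```

A FATAL state means supervisor gave up restarting the process, so the psud log is the next place to look.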

Output of show version

admin@str2-dx010-acs-7:~$ show version

SONiC Software Version: SONiC.20220531.07
Distribution: Debian 11.5
Kernel: 5.10.0-12-2-amd64
Build commit: 29a7910a68
Build date: Thu Oct 6 17:44:37 UTC 2022
Built by: cloudtest@25710bd4c000006

Platform: x86_64-cel_seastone-r0
HwSKU: Celestica-DX010-C32
ASIC: broadcom
ASIC Count: 1
Serial Number: N/A
Model Number: N/A
Hardware Revision: N/A
Uptime: 19:24:40 up 1 min, 1 user, load average: 3.80, 1.33, 0.48
Date: Tue 11 Oct 2022 19:24:40

Docker images:
REPOSITORY                TAG          IMAGE ID      SIZE
docker-mux                20220531.07  bd21169ee652  492MB
docker-mux                latest       bd21169ee652  492MB
docker-macsec             latest       f082929491e3  462MB
docker-acms               20220531.07  26533d1868f5  490MB
docker-acms               latest       26533d1868f5  490MB
docker-orchagent          20220531.07  85ef834189ea  478MB
docker-orchagent          latest       85ef834189ea  478MB
docker-fpm-frr            20220531.07  5cb48dba6188  489MB
docker-fpm-frr            latest       5cb48dba6188  489MB
docker-teamd              20220531.07  388e53a9ef9f  459MB
docker-teamd              latest       388e53a9ef9f  459MB
docker-syncd-brcm         20220531.07  3bc7be18e509  785MB
docker-syncd-brcm         latest       3bc7be18e509  785MB
docker-gbsyncd-broncos    20220531.07  6d92e3d56fc8  490MB
docker-gbsyncd-broncos    latest       6d92e3d56fc8  490MB
docker-gbsyncd-credo      20220531.07  f488f6215414  461MB
docker-gbsyncd-credo      latest       f488f6215414  461MB
docker-dhcp-relay         latest       5020cf6bc8b4  452MB
docker-snmp               20220531.07  eefaa6a60529  488MB
docker-snmp               latest       eefaa6a60529  488MB
docker-sonic-telemetry    20220531.07  ceeeac38ff72  523MB
docker-sonic-telemetry    latest       ceeeac38ff72  523MB
docker-router-advertiser  20220531.07  33b77d46f76a  443MB
docker-router-advertiser  latest       33b77d46f76a  443MB
docker-platform-monitor   20220531.07  c80262dd197d  565MB
docker-platform-monitor   latest       c80262dd197d  565MB
docker-lldp               20220531.07  a5f35337f892  485MB
docker-lldp               latest       a5f35337f892  485MB
docker-database           20220531.07  8d16e3cc3804  443MB
docker-database           latest       8d16e3cc3804  443MB

Output of sudo generate_dump

Lock succesfully accquired and installed signal handlers
pcilib: sysfs_read_vpd: read failed: Input/output error
pcilib: sysfs_read_vpd: read failed: Input/output error
can't get debug descriptor: Resource temporarily unavailable
can't get debug descriptor: Resource temporarily unavailable
can't get device qualifier: Resource temporarily unavailable
can't get debug descriptor: Resource temporarily unavailable
/usr/local/bin/generate_dump: line 520: $2: unbound variable
ERR: RC:-1 observed on line 520
conntrack v1.4.6 (conntrack-tools): 0 flow entries have been shown.
conntrack v1.4.6 (conntrack-tools): 0 flow entries have been shown.
conntrack v1.4.6 (conntrack-tools): 2123 flow entries have been shown.
conntrack v1.4.6 (conntrack-tools): 2123 flow entries have been shown.
bfdd is not running
bfdd is not running
bfdd is not running
bfdd is not running
rm: cannot remove '/var/dump/sonic_dump_str2-dx010-acs-7_20221011_192513/etc/systemd/user/sockets.target.wants/gpg-agent-extra.socket': No such file or directory
rm: cannot remove '/var/dump/sonic_dump_str2-dx010-acs-7_20221011_192513/etc/systemd/user/sockets.target.wants/gpg-agent-browser.socket': No such file or directory
rm: cannot remove '/var/dump/sonic_dump_str2-dx010-acs-7_20221011_192513/etc/systemd/user/sockets.target.wants/dirmngr.socket': No such file or directory
rm: cannot remove '/var/dump/sonic_dump_str2-dx010-acs-7_20221011_192513/etc/systemd/user/sockets.target.wants/gpg-agent-ssh.socket': No such file or directory
rm: cannot remove '/var/dump/sonic_dump_str2-dx010-acs-7_20221011_192513/etc/systemd/user/sockets.target.wants/gpg-agent.socket': No such file or directory
ERR: RC:-1 observed on line 1397
Remove secret from etc files.
sed: can't read /var/dump/sonic_dump_str2-dx010-acs-7_20221011_192513/etc/pam_radius_auth.d/*.conf: No such file or directory
Tar append operation failed. Aborting for safety.
Cleaning up working directory /var/dump/sonic_dump_str2-dx010-acs-7_20221011_192513
Removing lock.
Exit: 0 /var/dump/sonic_dump_str2-dx010-acs-7_20221011_192513.tar.gz

qnos commented 2 years ago

DX010 platform PSU/FAN/temperature issues are fixed in PR#12567.