sonic-net / sonic-buildimage

Scripts which perform an installable binary image build for SONiC

[platform_tests/test_platform_info.py] - Fails due to AssertionError in T0 Physical Topology #15540

Open Vickyni2 opened 1 year ago

Vickyni2 commented 1 year ago

Description

The test case platform_tests/test_platform_info.py fails with an AssertionError in the T0 topology on a physical DUT.

Steps to reproduce the issue:

1. Run the test case on a physical DUT.

Describe the results you received:

def restart_thermal_control_daemon(dut):
    """
    Restart thermal control daemon by killing it and waiting supervisord to restart
    it automatically.
    :param dut: DUT object representing a SONiC switch under test.
    :return:
    """
    if dut.is_multi_asic and dut.sonic_release in ["201911"]:
        logging.info("thermalctl daemon is not present")
        return
    logging.info(
        'Restarting thermal control daemon on {}...'.format(dut.hostname))
    find_thermalctld_pid_cmd = 'docker exec -i pmon bash -c \'pgrep -f thermalctld\' | sort'
    output = dut.shell(find_thermalctld_pid_cmd)
    assert output["rc"] == 0, "Run command '%s' failed" % find_thermalctld_pid_cmd
    # Usually there should be 2 thermalctld processes, but there is chance that
    # sonic platform API might use subprocess which creates extra thermalctld process.
    # For example, chassis.get_all_sfps will call sfp constructor, and sfp constructor may
    # use subprocess to call ethtool to do initialization.
    # So we check here thermalcltd must have at least 2 processes.
    # For mellanox, it has at least two processes, but for celestica(broadcom),
    # it only has one thermalctld process
    if dut.facts["asic_type"] == "mellanox":
        assert len(output["stdout_lines"]
                   ) >= 2, "There should be at least 2 thermalctld process"
    else:
        assert len(output["stdout_lines"]
                 ) >= 1, "There should be at least 1 thermalctld process"

E AssertionError: There should be at least 1 thermalctld process

dut =
find_thermalctld_pid_cmd = "docker exec -i pmon bash -c 'pgrep -f thermalctld' | sort"
output = {'stderr_lines': [], u'cmd': u"docker exec -i pmon bash -c 'pgrep -f thermalct...tdin_add_newline': True, u'stdin': None}}, 'stdout_lines': [], 'failed': False}

platform_tests/thermal_control_test_helper.py:299: AssertionError
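To make the failure easier to reason about, here is a minimal sketch that mirrors the two assertions from the helper outside of the test framework. The function names `min_thermalctld_processes` and `check_thermalctld` are hypothetical and not part of sonic-mgmt, and `rc == 0` is inferred from the fact that the first assertion passed (the truncated output above does not show it). Note that the exit status of a shell pipeline is that of its last command, so `pgrep ... | sort` can return 0 even when `pgrep` matches nothing; that would explain why only the process-count assertion fires.

```python
def min_thermalctld_processes(asic_type):
    """Minimum thermalctld process count the helper expects per ASIC type."""
    return 2 if asic_type == "mellanox" else 1


def check_thermalctld(output, asic_type):
    """Mirror the two assertions from restart_thermal_control_daemon."""
    assert output["rc"] == 0, "pgrep pipeline failed"
    found = len(output["stdout_lines"])
    required = min_thermalctld_processes(asic_type)
    assert found >= required, (
        "There should be at least %d thermalctld process" % required)
    return found


# Values modelled on the failure above: empty stdout on a broadcom ASIC.
# rc is an assumption (see lead-in); it is not visible in the truncated output.
failing_output = {"rc": 0, "stdout_lines": [], "failed": False}

try:
    check_thermalctld(failing_output, "broadcom")
    result = "passed"
except AssertionError as err:
    result = str(err)

print(result)
```

With an empty `stdout_lines` the check fails for any ASIC type, so the threshold itself is not the problem here; the daemon is simply not found.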

Describe the results you expected: The test case should pass.

Additional information you deem important:

**Output of `show version`:**

SONiC Software Version: SONiC.202211.269499-59c7d39ef
SONiC OS Version: 11
Distribution: Debian 11.6
Kernel: 5.10.0-18-2-amd64
Build commit: 59c7d39ef
Build date: Tue May 9 17:58:15 UTC 2023
Built by: AzDevOps@vmss-soni00118K

Platform: x86_64-accton_as7716_32x-r0
HwSKU: Accton-AS7716-32X
ASIC: broadcom
ASIC Count: 1
Serial Number: N/A
Model Number: N/A
Hardware Revision: N/A
Uptime: 20:09:00 up 7 min, 1 user, load average: 0.85, 0.48, 0.22
Date: Sun 07 Aug 2022 20:09:00

bingwang-ms commented 1 year ago

The stdout is empty, which indicates that thermalctld is not running. We need to figure out why thermalctld exited.

output = {'stderr_lines': [], u'cmd': u"docker exec -i pmon bash -c 'pgrep -f thermalct...tdin_add_newline': True, u'stdin': None}}, 'stdout_lines': [], 'failed': False}
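
One way to start that investigation, sketched under assumptions: since the helper's docstring says supervisord restarts thermalctld, the state of each pmon process can likely be read from `docker exec pmon supervisorctl status` on the DUT. The snippet below only parses such output; the sample text is hypothetical (not captured from this DUT) and `parse_supervisor_status` is an illustrative name.

```python
def parse_supervisor_status(text):
    """Map each program name to its state from `supervisorctl status` output."""
    states = {}
    for line in text.splitlines():
        fields = line.split()
        # Each line starts with "<name> <STATE>"; the rest is pid/uptime or a timestamp.
        if len(fields) >= 2:
            states[fields[0]] = fields[1]
    return states


# Hypothetical sample output, for illustration only.
sample = """\
ledd        RUNNING   pid 30, uptime 0:05:12
thermalctld EXITED    Aug 07 08:03 PM
xcvrd       RUNNING   pid 35, uptime 0:05:10
"""

states = parse_supervisor_status(sample)
not_running = sorted(name for name, state in states.items() if state != "RUNNING")
print(not_running)
```

If thermalctld shows up in a non-running state, its log inside the pmon container would be the next place to look for the exit reason.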
yxieca commented 1 year ago

@Vickyni2 has Accton completed the platform API development?

@prgeor can you advise whether this should be tracked as a buildimage repo issue?

prgeor commented 1 year ago

@yxieca this looks to be a platform-specific issue, not a generic one.

prgeor commented 1 year ago

@Vickyni2 can you work with Accton to implement the platform API?