sonic-net / sonic-mgmt


Fix dualtor/test_switchover_failure.py #15642

Open vivekverma-arista opened 1 week ago

vivekverma-arista commented 1 week ago

Description of PR

Summary: Fixes #328

Type of change

Back port request

Approach

What is the motivation for this PR?

dualtor/test_switchover_failure.py fails on active-active dualtor with the following error:

        if (res.is_failed or 'exception' in res) and not module_ignore_errors:
>           raise RunAnsibleModuleFail("run module {} failed".format(self.module_name), res)
E           tests.common.errors.RunAnsibleModuleFail: run module shell failed, Ansible Results =>
E           failed = True
E           changed = True
E           rc = 1
E           cmd = config mux mode standby Ethernet0
E           start = 2024-11-16 00:39:44.410652
E           end = 2024-11-16 00:39:45.124954
E           delta = 0:00:00.714302
E           msg = non-zero return code
E           invocation = {'module_args': {'_raw_params': 'config mux mode standby Ethernet0', '_uses_shell': True, 'warn': False, 'stdin_add_newline': True, 'strip_empty_ends': True, 'argv': None, 'chdir': None, 'executable': None, 'creates': None, 'removes': None, 'stdin': None}}
E           _ansible_no_log = None
E           stdout =
E           this is not a valid port present on mux_cable
E           stderr =

complex_args = {}
filename   = '/data/sonic-mgmt/tests/common/devices/multi_asic.py'

We root-caused it to the active-standby test case, which gets skipped. Even though the test is skipped, this condition is still hit in teardown: https://github.com/sonic-net/sonic-mgmt/blob/master/tests/dualtor/test_switchover_failure.py#L140-L146

Due to the resulting swss/mux restart, the following test case fails with the above signature.

How did you do it?

The proposed fix is to skip the restart of swss and mux if the test case was skipped.
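
To illustrate the idea, here is a minimal sketch of the decision logic, not the actual diff in this PR. It assumes the per-phase results are available as objects with an `outcome` field ("passed", "failed" or "skipped"), which is how pytest's `TestReport` exposes it. `PhaseReport`, `classify_test` and `should_restart_services` are hypothetical names used only for this example, and treating a missing report as a failure is an assumption made here to stay conservative.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class PhaseReport:
    """Hypothetical stand-in for a pytest TestReport of one phase."""
    outcome: str  # "passed", "failed" or "skipped"


def classify_test(reports: Mapping[str, PhaseReport]) -> str:
    """Collapse the setup/call reports into 'passed', 'skipped' or 'failed'."""
    phases = [reports.get("setup"), reports.get("call")]
    # A skip can show up in setup (skip marker or a skip raised in a fixture)
    # or in the test body itself, so check both phases first.
    if any(rep is not None and rep.outcome == "skipped" for rep in phases):
        return "skipped"
    if any(rep is not None and rep.outcome == "failed" for rep in phases):
        return "failed"
    call = reports.get("call")
    if call is not None and call.outcome == "passed":
        return "passed"
    # Assumption: with no usable report, be conservative and report a failure.
    return "failed"


def should_restart_services(reports: Mapping[str, PhaseReport]) -> bool:
    """Restart swss/mux only after a real failure, never after a skip."""
    return classify_test(reports) == "failed"


if __name__ == "__main__":
    skipped = {"setup": PhaseReport("skipped")}
    failed = {"setup": PhaseReport("passed"), "call": PhaseReport("failed")}
    print(should_restart_services(skipped))
    print(should_restart_services(failed))
```

In a real test suite the reports would typically be attached to the test item from a `pytest_runtest_makereport` hook and read back in the fixture teardown; that wiring is left out here so the sketch runs without pytest.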

How did you verify/test it?

Stress-tested dualtor/test_switchover_failure.py with the dualtor-aa-56 topology on Arista-7260CX3.

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

StormLiangMS commented 5 days ago

hi @vivekverma-arista could we use condition mark to skip this test case?

vivekverma-arista commented 5 days ago

> hi @vivekverma-arista could we use condition mark to skip this test case?

This PR does not skip the test, and its changes are needed regardless of how we choose to skip the test. The setup of the current test needs to know whether the previous test passed, was skipped, or failed. Without these changes, the test setup treats a skip as a failure, which leads to a restart of swss and breaks the next test. We don't want swss to restart if the previous test was skipped.