sonic-net / sonic-mgmt

Configuration management examples for SONiC

[Logrotate] test case test_orchagent_logrotate could be affected by the config reload in the teardown of the previous test #12612

Closed congh-nvidia closed 2 months ago

congh-nvidia commented 5 months ago

Description

Steps to reproduce the issue:

  1. Run the test tests/syslog/test_logrotate.py

Describe the results you received: The first run of test test_orchagent_logrotate failed:

Failed: Found pending entries in APPL_DB: ['_NEIGH_TABLE:Ethernet64:10.20.30.40']
orch_logrotate_setup = (['_ROUTE_TABLE:192.194.184.128/25', '_ROUTE_TABLE:193.58.224.128/25', '_ROUTE_TABLE:192.171.132.0/25', '_ROUTE_TABLE:193.34.108.128/25', '_ROUTE_TABLE:192.240.148.128/25', '_ROUTE_TABLE:192.180.138.0/25', ...], 'Ethernet64')
rand_selected_dut = <MultiAsicSonicHost r-leopard-79>

    @pytest.mark.repeat(5)
    def test_orchagent_logrotate(orch_logrotate_setup, rand_selected_dut):
        """
        Tests for the issue where an orchagent logrotate can cause a missed APPL_DB notification
        """
        ignore_entries, target_port = orch_logrotate_setup
        rand_selected_dut.control_process('orchagent', pause=True)
        rand_selected_dut.control_process('orchagent', signal='SIGHUP')
        rand_selected_dut.shell('sudo ip neigh add {} lladdr {} dev {}'.format(FAKE_IP, FAKE_MAC, target_port))
        rand_selected_dut.control_process('orchagent', pause=False)
        pending_entries = get_pending_entries(rand_selected_dut, ignore_list=ignore_entries)
>       pytest_assert(not pending_entries, "Found pending entries in APPL_DB: {}".format(pending_entries))
E       Failed: Found pending entries in APPL_DB: ['_NEIGH_TABLE:Ethernet64:10.20.30.40']

ignore_entries = ['_ROUTE_TABLE:192.194.184.128/25', '_ROUTE_TABLE:193.58.224.128/25', '_ROUTE_TABLE:192.171.132.0/25', '_ROUTE_TABLE:193.34.108.128/25', '_ROUTE_TABLE:192.240.148.128/25', '_ROUTE_TABLE:192.180.138.0/25', ...]
orch_logrotate_setup = (['_ROUTE_TABLE:192.194.184.128/25', '_ROUTE_TABLE:193.58.224.128/25', '_ROUTE_TABLE:192.171.132.0/25', '_ROUTE_TABLE:193.34.108.128/25', '_ROUTE_TABLE:192.240.148.128/25', '_ROUTE_TABLE:192.180.138.0/25', ...], 'Ethernet64')
pending_entries = ['_NEIGH_TABLE:Ethernet64:10.20.30.40']
rand_selected_dut = <MultiAsicSonicHost r-leopard-79>
target_port = 'Ethernet64'

syslog/test_logrotate.py:285: Failed

The root cause is a config reload in the teardown of the previous test case, test_logrotate_small_size: https://github.com/sonic-net/sonic-mgmt/blob/92ea888a8fa3dc6a80f2867d3b79e587a8b6e667/tests/syslog/test_logrotate.py#L64 When test_orchagent_logrotate starts, the system is still processing thousands of routes from that reload. Because these updates are processed sequentially, the neighbor update is stuck behind the routes and is still pending when the test checks APPL_DB. The test should start only after the system has reached a stable state.
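
One way to express "start only after the system is stable" is to poll until nothing is pending before the test body runs. The following is only a minimal sketch of that idea, not the fix from the existing PR: `wait_for_no_pending` and its parameters are hypothetical names, and the callable passed in stands in for whatever returns the current pending APPL_DB entries (in this test, presumably a wrapper around `get_pending_entries`).

```python
import time


def wait_for_no_pending(get_pending, timeout=300, interval=5,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll get_pending() until it returns an empty collection.

    Hypothetical helper, not part of sonic-mgmt. Returns True as soon as
    nothing is pending and False if the timeout expires first.
    """
    deadline = clock() + timeout
    while True:
        pending = get_pending()
        if not pending:
            # Nothing left in the queue: the system looks stable.
            return True
        if clock() >= deadline:
            # Still busy after the allowed time; let the caller decide.
            return False
        sleep(interval)
```

A setup fixture could call this with something like `lambda: get_pending_entries(rand_selected_dut)` before capturing the ignore list, and fail or skip the test if it returns False. The timeout and interval values here are placeholders and would need tuning for the route scale on the DUT.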

Describe the results you expected: Test should pass.

congh-nvidia commented 3 months ago

Hi @theasianpianist, could you please check this issue? Thanks.

theasianpianist commented 2 months ago

Hi @congh-nvidia, sorry for the delay. We had an existing PR to address this issue that slipped; we will merge it soon.