sonic-net / sonic-mgmt

Configuration management examples for SONiC
Other
194 stars 712 forks source link

[Bug]: [syslog.test_logrotate][test_logrotate_small_size][20231110] failing with "AssertionError: Unexpected logrotate happens, there should be no logrotate executed" #13451

Closed vvolam closed 2 months ago

vvolam commented 3 months ago

Issue Description

test_logrotate_small_size testcase is failing with error "AssertionError: Unexpected logrotate happens, there should be no logrotate executed"

Results you see

31/05/2024 03:15:18 init._log_sep_line L0170 INFO | ==================== syslog/test_logrotate.py::test_logrotate_small_size call ==================== 31/05/2024 03:15:18 base._run L0071 DEBUG | /var/src/sonic-mgmt_vms7-t1-4700-5_663c03f52c964d8bf34904c3/tests/common/devices/multi_asic.py::_run_on_asics#128: [str-msn4700-01] AnsibleModule::shell, args=["df -k /var/log | sed -n 2p"], kwargs={} 31/05/2024 03:15:19 base._run L0108 DEBUG | /var/src/sonic-mgmt_vms7-t1-4700-5_663c03f52c964d8bf34904c3/tests/common/devices/multi_asic.py::_run_on_asics#128: [str-msn4700-01] AnsibleModule::shell Result => {"changed": true, "stdout": "/dev/loop2 93518 16504 69846 20% /var/log", "stderr": "", "rc": 0, "cmd": "df -k /var/log | sed -n 2p", "start": "2024-05-31 03:15:13.404511", "end": "2024-05-31 03:15:13.409673", "delta": "0:00:00.005162", "msg": "", "invocation": {"module_args": {"_raw_params": "df -k /var/log | sed -n 2p", "_uses_shell": true, "warn": false, "stdin_add_newline": true, "strip_empty_ends": true, "argv": null, "chdir": null, "executable": null, "creates": null, "removes": null, "stdin": null}}, "stdout_lines": ["/dev/loop2 93518 16504 69846 20% /var/log"], "stderr_lines": [], "_ansible_no_log": null, "failed": false} 31/05/2024 03:15:19 test_logrotate.run_logrotate L0121 DEBUG | Make sure there is no big /var/log/syslog exist by forcing execute logrotate 31/05/2024 03:15:19 test_logrotate.run_logrotate L0125 INFO | Run logrotate command: sudo /usr/sbin/logrotate -f /etc/logrotate.conf > /dev/null 2>&1 31/05/2024 03:15:19 base._run L0071 DEBUG | /var/src/sonic-mgmt_vms7-t1-4700-5_663c03f52c964d8bf34904c3/tests/common/devices/multi_asic.py::_run_on_asics#128: [str-msn4700-01] AnsibleModule::shell, args=["sudo /usr/sbin/logrotate -f /etc/logrotate.conf > /dev/null 2>&1"], kwargs={} 31/05/2024 03:15:20 base._run L0108 DEBUG | /var/src/sonic-mgmt_vms7-t1-4700-5_663c03f52c964d8bf34904c3/tests/common/devices/multi_asic.py::_run_on_asics#128: [str-msn4700-01] AnsibleModule::shell Result => {"changed": true, "stdout": "", "stderr": "", "rc": 0, "cmd": "sudo /usr/sbin/logrotate -f /etc/logrotate.conf > /dev/null 2>&1", "start": "2024-05-31 03:15:13.897693", "end": "2024-05-31 03:15:14.708520", "delta": "0:00:00.810827", "msg": "", "invocation": {"module_args": {"_raw_params": "sudo /usr/sbin/logrotate -f /etc/logrotate.conf > /dev/null 2>&1", "_uses_shell": true, "warn": false, "stdin_add_newline": true, "strip_empty_ends": true, "argv": null, "chdir": null, "executable": null, "creates": null, "removes": null, "stdin": null}}, "stdout_lines": [], "stderr_lines": [], "_ansible_no_log": null, "failed": false} 31/05/2024 03:15:20 test_logrotate.get_syslog_file_count L0098 INFO | Check rotated syslog file number 31/05/2024 03:15:20 base._run L0071 DEBUG | /var/src/sonic-mgmt_vms7-t1-4700-5_663c03f52c964d8bf34904c3/tests/common/devices/multi_asic.py::_run_on_asics#128: [str-msn4700-01] AnsibleModule::shell, args=["sudo ls -l /var/log | grep -Ec \"syslog\.[0-9]{1,4}[\.gz]{0,1}\""], kwargs={} 31/05/2024 03:15:21 base._run L0108 DEBUG | /var/src/sonic-mgmt_vms7-t1-4700-5_663c03f52c964d8bf34904c3/tests/common/devices/multi_asic.py::_run_on_asics#128: [str-msn4700-01] AnsibleModule::shell Result => {"changed": true, "stdout": "1", "stderr": "", "rc": 0, "cmd": "sudo ls -l /var/log | grep -Ec \"syslog\.[0-9]{1,4}[\.gz]{0,1}\"", "start": "2024-05-31 03:15:15.197635", "end": "2024-05-31 03:15:15.221191", "delta": "0:00:00.023556", "msg": "", "invocation": {"module_args": {"_raw_params": "sudo ls -l /var/log | grep -Ec \"syslog\.[0-9]{1,4}[\.gz]{0,1}\"", "_uses_shell": true, "warn": false, "stdin_add_newline": true, "strip_empty_ends": true, "argv": null, "chdir": null, "executable": null, "creates": null, "removes": null, "stdin": null}}, "stdout_lines": ["1"], "stderr_lines": [], "_ansible_no_log": null, "failed": false} 31/05/2024 03:15:21 test_logrotate.get_syslog_file_count L0100 DEBUG | There are 1 rotated syslog files 31/05/2024 03:15:21 test_logrotate.validate_logrotate_functi L0152 INFO | There are 1 syslog gz files 31/05/2024 03:15:21 test_logrotate.create_temp_syslog_file L0110 INFO | Create a temp syslog file as 921.6K 31/05/2024 03:15:21 base._run L0071 DEBUG | /var/src/sonic-mgmt_vms7-t1-4700-5_663c03f52c964d8bf34904c3/tests/common/devices/multi_asic.py::_run_on_asics#128: [str-msn4700-01] AnsibleModule::shell, args=["sudo fallocate -l 921.6K /var/log/syslog"], kwargs={} 31/05/2024 03:15:21 base._run L0108 DEBUG | /var/src/sonic-mgmt_vms7-t1-4700-5_663c03f52c964d8bf34904c3/tests/common/devices/multi_asic.py::_run_on_asics#128: [str-msn4700-01] AnsibleModule::shell Result => {"changed": true, "stdout": "", "stderr": "", "rc": 0, "cmd": "sudo fallocate -l 921.6K /var/log/syslog", "start": "2024-05-31 03:15:15.729336", "end": "2024-05-31 03:15:15.759673", "delta": "0:00:00.030337", "msg": "", "invocation": {"module_args": {"_raw_params": "sudo fallocate -l 921.6K /var/log/syslog", "_uses_shell": true, "warn": false, "stdin_add_newline": true, "strip_empty_ends": true, "argv": null, "chdir": null, "executable": null, "creates": null, "removes": null, "stdin": null}}, "stdout_lines": [], "stderr_lines": [], "_ansible_no_log": null, "failed": false} 31/05/2024 03:15:21 test_logrotate.run_logrotate L0125 INFO | Run logrotate command: sudo /usr/sbin/logrotate /etc/logrotate.conf > /dev/null 2>&1 31/05/2024 03:15:21 base._run L0071 DEBUG | /var/src/sonic-mgmt_vms7-t1-4700-5_663c03f52c964d8bf34904c3/tests/common/devices/multi_asic.py::_run_on_asics#128: [str-msn4700-01] AnsibleModule::shell, args=["sudo /usr/sbin/logrotate /etc/logrotate.conf > /dev/null 2>&1"], kwargs={} 31/05/2024 03:15:22 base._run L0108 DEBUG | /var/src/sonic-mgmt_vms7-t1-4700-5_663c03f52c964d8bf34904c3/tests/common/devices/multi_asic.py::_run_on_asics#128: [str-msn4700-01] AnsibleModule::shell Result => {"changed": true, "stdout": "", "stderr": "", "rc": 0, "cmd": "sudo /usr/sbin/logrotate /etc/logrotate.conf > /dev/null 2>&1", "start": "2024-05-31 03:15:16.263986", "end": "2024-05-31 03:15:16.369334", "delta": "0:00:00.105348", "msg": "", "invocation": {"module_args": {"_raw_params": "sudo /usr/sbin/logrotate /etc/logrotate.conf > /dev/null 2>&1", "_uses_shell": true, "warn": false, "stdin_add_newline": true, "strip_empty_ends": true, "argv": null, "chdir": null, "executable": null, "creates": null, "removes": null, "stdin": null}}, "stdout_lines": [], "stderr_lines": [], "_ansible_no_log": null, "failed": false} 31/05/2024 03:15:22 test_logrotate.get_syslog_file_count L0098 INFO | Check rotated syslog file number 31/05/2024 03:15:22 base._run L0071 DEBUG | /var/src/sonic-mgmt_vms7-t1-4700-5_663c03f52c964d8bf34904c3/tests/common/devices/multi_asic.py::_run_on_asics#128: [str-msn4700-01] AnsibleModule::shell, args=["sudo ls -l /var/log | grep -Ec \"syslog\.[0-9]{1,4}[\.gz]{0,1}\""], kwargs={} 31/05/2024 03:15:22 base._run L0108 DEBUG | /var/src/sonic-mgmt_vms7-t1-4700-5_663c03f52c964d8bf34904c3/tests/common/devices/multi_asic.py::_run_on_asics#128: [str-msn4700-01] AnsibleModule::shell Result => {"changed": true, "stdout": "2", "stderr": "", "rc": 0, "cmd": "sudo ls -l /var/log | grep -Ec \"syslog\.[0-9]{1,4}[\.gz]{0,1}\"", "start": "2024-05-31 03:15:16.876812", "end": "2024-05-31 03:15:16.902779", "delta": "0:00:00.025967", "msg": "", "invocation": {"module_args": {"_raw_params": "sudo ls -l /var/log | grep -Ec \"syslog\.[0-9]{1,4}[\.gz]{0,1}\"", "_uses_shell": true, "warn": false, "stdin_add_newline": true, "strip_empty_ends": true, "argv": null, "chdir": null, "executable": null, "creates": null, "removes": null, "stdin": null}}, "stdout_lines": ["2"], "stderr_lines": [], "_ansible_no_log": null, "failed": false} 31/05/2024 03:15:22 test_logrotate.get_syslog_file_count L0100 DEBUG | There are 2 rotated syslog files 31/05/2024 03:15:22 test_logrotate.validate_logrotate_functi L0156 INFO | There are 2 syslog gz files after running logrotate 31/05/2024 03:15:22 init.pytest_runtest_call L0040 ERROR | Traceback (most recent call last): File "/usr/local/lib/python3.8/dist-packages/_pytest/python.py", line 1788, in runtest self.ihook.pytest_pyfunc_call(pyfuncitem=self) File "/usr/local/lib/python3.8/dist-packages/pluggy/_hooks.py", line 501, in call return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult) File "/usr/local/lib/python3.8/dist-packages/pluggy/_manager.py", line 119, in _hookexec return self._inner_hookexec(hook_name, methods, kwargs, firstresult) File "/usr/local/lib/python3.8/dist-packages/pluggy/_callers.py", line 138, in _multicall raise exception.with_traceback(exception.traceback) File "/usr/local/lib/python3.8/dist-packages/pluggy/_callers.py", line 102, in _multicall res = hook_impl.function(*args) File "/usr/local/lib/python3.8/dist-packages/_pytest/python.py", line 194, in pytest_pyfunc_call result = testfunction(**testargs) File "/var/src/sonic-mgmt_vms7-t1-4700-5_663c03f52c964d8bf34904c3/tests/syslog/test_logrotate.py", line 230, in test_logrotate_small_size validate_logrotate_function(duthost, rotate_small_threshold) File "/var/src/sonic-mgmt_vms7-t1-4700-5_663c03f52c964d8bf34904c3/tests/syslog/test_logrotate.py", line 157, in validate_logrotate_function assert syslog_number_origin == syslog_number_no_rotate, \ AssertionError: Unexpected logrotate happens, there should be no logrotate executed

31/05/2024 03:15:22 init._log_sep_line L0170 INFO | ==================== syslog/test_logrotate.py::test_logrotate_small_size teardown ====================

Results you expected to see

25/06/2024 00:01:45 init._log_sep_line L0170 INFO | ==================== syslog/test_logrotate.py::test_logrotate_small_size call ==================== 25/06/2024 00:01:45 test_logrotate.run_logrotate L0129 INFO | Run logrotate command: sudo /usr/sbin/logrotate -f /etc/logrotate.conf > /dev/null 2>&1 25/06/2024 00:01:46 test_logrotate.get_syslog_file_count L0102 INFO | Check rotated syslog file number 25/06/2024 00:01:47 test_logrotate.validate_logrotate_functi L0156 INFO | There are 1 syslog gz files 25/06/2024 00:01:47 test_logrotate.create_temp_syslog_file L0114 INFO | Create a temp syslog file as 921.6K 25/06/2024 00:01:47 test_logrotate.run_logrotate L0129 INFO | Run logrotate command: sudo /usr/sbin/logrotate /etc/logrotate.conf > /dev/null 2>&1 25/06/2024 00:01:48 test_logrotate.get_syslog_file_count L0102 INFO | Check rotated syslog file number 25/06/2024 00:01:48 test_logrotate.validate_logrotate_functi L0160 INFO | There are 1 syslog gz files after running logrotate 25/06/2024 00:01:48 test_logrotate.create_temp_syslog_file L0114 INFO | Create a temp syslog file as 1126.4K 25/06/2024 00:01:49 test_logrotate.run_logrotate L0129 INFO | Run logrotate command: sudo /usr/sbin/logrotate /etc/logrotate.conf > /dev/null 2>&1 25/06/2024 00:01:49 test_logrotate.get_syslog_file_count L0102 INFO | Check rotated syslog file number 25/06/2024 00:01:50 test_logrotate.validate_logrotate_functi L0169 INFO | There are 2 syslog gz files after running logrotate 25/06/2024 00:01:50 init._log_sep_line L0170 INFO | ==================== syslog/test_logrotate.py::test_logrotate_small_size teardown ====================

Is it platform specific

generic

Relevant log output

No response

Output of show version

No response

Attach files (if any)

No response

bingwang-ms commented 3 months ago

@echuawu Can you help take a look?

echuawu commented 3 months ago

Could you provide the image version?

vvolam commented 3 months ago

Test is failing on both 20230531 and 20231110 OS versions randomly.

FangyiChenNV commented 3 months ago

Hi, could you please provide detailed hardware device information, disk usage, and test parameters so that we can reproduce this bug?

vvolam commented 3 months ago

@FangyiChenNV, Here is the information of testbed for a recent failure and no additional parameters passed to the test script:

admin@<>$ show ver

SONiC Software Version: SONiC.20231110.14 SONiC OS Version: 11 Distribution: Debian 11.9 Kernel: 5.10.0-23-2-amd64 Build commit: bf5875a969 Build date: Wed Jun 26 19:07:16 UTC 2024 Built by: cloudtest@3061438cc000004

Platform: x86_64-mlnx_msn4600c-r0 HwSKU: Mellanox-SN4600C-C64 ASIC: mellanox ASIC Count: 1 Serial Number: MT2115X26247 Model Number: MSN4600-CS2RO Hardware Revision: A1 Uptime: 18:03:39 up 10:13, 1 user, load average: 0.67, 0.61, 0.72 Date: Fri 28 Jun 2024 18:03:39

Docker images: REPOSITORY TAG IMAGE ID SIZE docker-platform-monitor 20231110.14 85067293ac8d 807MB docker-platform-monitor latest 85067293ac8d 807MB docker-syncd-mlnx 20231110.14 57d0643c71ec 819MB docker-syncd-mlnx latest 57d0643c71ec 819MB docker-orchagent 20231110.14 761d46c552cc 339MB docker-orchagent latest 761d46c552cc 339MB docker-teamd 20231110.14 f965eca45bc3 327MB docker-teamd latest f965eca45bc3 327MB docker-fpm-frr 20231110.14 6f779ef70aa0 352MB docker-fpm-frr latest 6f779ef70aa0 352MB docker-dhcp-relay latest 87759dee19cf 311MB docker-macsec latest e05870d4555f 329MB docker-sonic-telemetry 20231110.14 909ff76406f6 369MB docker-sonic-telemetry latest 909ff76406f6 369MB docker-snmp 20231110.14 3e95f4cf5380 340MB docker-snmp latest 3e95f4cf5380 340MB docker-eventd 20231110.14 dc8cefd9df04 301MB docker-eventd latest dc8cefd9df04 301MB docker-router-advertiser 20231110.14 5669970e72d9 301MB docker-router-advertiser latest 5669970e72d9 301MB docker-sonic-restapi 20231110.14 aa724d19edd7 319MB docker-sonic-restapi latest aa724d19edd7 319MB docker-mux 20231110.14 3b0145b468eb 349MB docker-mux latest 3b0145b468eb 349MB docker-lldp 20231110.14 165f1e46fea4 343MB docker-lldp latest 165f1e46fea4 343MB docker-sonic-gnmi 20231110.14 6c94f6e90b36 369MB docker-sonic-gnmi latest 6c94f6e90b36 369MB docker-database 20231110.14 d570530e847a 301MB docker-database latest d570530e847a 301MB docker-acms 20231110.14 9c5a0cdbfcf5 348MB docker-acms latest 9c5a0cdbfcf5 348MB docker-vnet-monitor 20231110.14 07c8eaa0e6cb 324MB docker-vnet-monitor latest 07c8eaa0e6cb 324MB k8s.gcr.io/pause 3.5 ed210e3e4a5b 683kB

admin@<>$ show system-memory total used free shared buff/cache available Mem: 7836 4091 1312 188 2431 3275 Swap: 0 0 0 admin@<>$ df -h Filesystem Size Used Avail Use% Mounted on udev 3.9G 0 3.9G 0% /dev tmpfs 784M 47M 737M 6% /run root-overlay 28G 2.8G 25G 11% / /dev/sda3 28G 2.8G 25G 11% /host /dev/loop1 3.9G 212M 3.5G 6% /var/log tmpfs 3.9G 149M 3.7G 4% /dev/shm tmpfs 5.0M 0 5.0M 0% /run/lock tmpfs 4.0M 0 4.0M 0% /sys/fs/cgroup

Thanks, Vasundhara

FangyiChenNV commented 3 months ago

Hi, could you please provide the result of running the following command on your DUT : df -k /var/log | sed -n 2p

vvolam commented 3 months ago

@FangyiChenNV Please find the command output on the DUT here:

admin@<>:/var/log$ df -k /var/log | sed -n 2p /dev/loop1 4046560 177132 3643332 5% /var/log

FangyiChenNV commented 3 months ago

Could you please show me the content of /etc/logrotate.conf on the DUT?

vvolam commented 3 months ago

@FangyiChenNV

admin@<>$ cat /etc/logrotate.conf

# see "man logrotate" for details

# global options do not affect preceding include directives

# rotate log files weekly
weekly

# keep 4 weeks worth of backlogs
rotate 4

# create new (empty) log files after rotating old ones
create

# use date as a suffix of the rotated file
#dateext

# uncomment this if you want your log files compressed
#compress

# packages drop log rotation information into this directory
include /etc/logrotate.d

# system-specific logs may also be configured here
FangyiChenNV commented 2 months ago

Hi, could you please provide the dump file of the failing test? This may be an issue with the image or stems from some environmental factor, hence requiring detailed dump files for further analysis.

vvolam commented 2 months ago

@FangyiChenNV Based on my investigation, test where the failure is reported have less available memory and are very chatty sometimes. As the test is creating a threshold of 0.9, which is 921.6K file. Remaining 102.4K size is quickly crossed by the time log rotation happens. Can we reduce the threshold to 0.1?

diff --git a/tests/syslog/test_logrotate.py b/tests/syslog/test_logrotate.py
index d3b9f5f79..3fcd6eb40 100644
--- a/tests/syslog/test_logrotate.py
+++ b/tests/syslog/test_logrotate.py
@@ -154,7 +154,7 @@ def validate_logrotate_function(duthost, logrotate_threshold):
             logrotate_threshold)):
         syslog_number_origin = get_syslog_file_count(duthost)
         logger.info('There are {} syslog gz files'.format(syslog_number_origin))
-        create_temp_syslog_file(duthost, multiply_with_unit(logrotate_threshold, 0.9))
+        create_temp_syslog_file(duthost, multiply_with_unit(logrotate_threshold, 0.1))
         run_logrotate(duthost)
         syslog_number_no_rotate = get_syslog_file_count(duthost)
         logger.info('There are {} syslog gz files after running logrotate'.format(syslog_number_no_rotate))
bingwang-ms commented 2 months ago

Fixed by https://github.com/sonic-net/sonic-mgmt/pull/13795