Closed wenyiz2021 closed 6 months ago
Did you see this with a manual reboot or during sonic-mgmt testing?
Which sonic-mgmt test was this and do you know what the expected and actual reboot causes were?
Hi @kenneth-arista @patrickmacarthur, I see this during sonic-mgmt testing. With a manual reboot I was able to see the reboot cause.
Is the cold reboot cause not being added?
admin@str2-7804-sup-1:~$ show reboot-cause history
Name                 Cause                                                                                         Time                             User    Comment
-------------------  --------------------------------------------------------------------------------------------  -------------------------------  ------  ---------
2023_07_12_00_23_34  Unknown                                                                                       N/A                              N/A     N/A
2023_07_12_00_16_01  Watchdog (watchdog, description: gpi 6 detailed fault - watchdog, time: 2023-07-12 00:14:03)  N/A                              N/A     N/A
2023_07_11_23_16_40  reboot                                                                                        Tue 11 Jul 2023 11:14:36 PM UTC  admin   N/A
2023_07_11_18_29_41  Unknown                                                                                       N/A                              N/A     N/A
2023_07_10_17_31_55  reboot                                                                                        Mon Jul 10 17:30:03 UTC 2023     admin   N/A
2023_07_10_16_24_40  reboot                                                                                        Mon Jul 10 16:22:48 UTC 2023     admin   N/A
2023_07_10_14_12_57  Unknown                                                                                       N/A                              N/A     N/A
2023_07_10_14_02_32  Watchdog (watchdog, description: gpi 6 detailed fault - watchdog, time: 2023-07-10 14:00:38)  N/A                              N/A     N/A
2023_07_10_09_16_11  Unknown                                                                                       N/A                              N/A     N/A
2023_07_10_09_09_25  Unknown
Which sonic-mgmt test was this and do you know what the expected and actual reboot causes were?
test_continuous_reboot[str2-7804-sup-1] AssertionError: got reboot-cause failed after rebooted by cold
A manual reboot on the sup does show the reboot cause as reboot and the user as admin:
admin@str2-7804-sup-1:~$ show reboot-cause history
Name                 Cause                                                                                         Time                             User    Comment
-------------------  --------------------------------------------------------------------------------------------  -------------------------------  ------  ---------
2023_07_12_19_33_57  reboot                                                                                        Wed Jul 12 19:32:01 UTC 2023     admin   N/A
2023_07_12_19_11_22  reboot                                                                                        Wed 12 Jul 2023 07:09:20 PM UTC  admin   N/A
But after the pipeline's cold reboot, the sup shows 'Unknown' as the cause and N/A as the user:
2023_07_12_00_23_34 Unknown N/A N/A N/A
The pipeline was running test_cold_reboot on the sup around that time (00:13:51).
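For anyone comparing the two cases above, a minimal sketch for flagging the 'Unknown' entries in saved `show reboot-cause history` output follows. The helper name `unknown_cause_entries` and the regular expression are hypothetical and not part of sonic-mgmt; the sketch only assumes that each entry line starts with a timestamp-style name such as 2023_07_12_00_23_34 followed by the cause.

```python
import re

# Hypothetical helper, not part of sonic-mgmt.
# Assumption: every history entry starts with a name like 2023_07_12_00_23_34,
# followed by whitespace and then the cause text.
ENTRY_RE = re.compile(r"^(?P<name>\d{4}(?:_\d{2}){5})\s+(?P<rest>.+)$")


def unknown_cause_entries(history_output):
    """Return the names of history entries whose cause starts with 'Unknown'."""
    names = []
    for line in history_output.splitlines():
        match = ENTRY_RE.match(line.strip())
        # Header and separator lines do not match the entry pattern and are skipped.
        if match and match.group("rest").startswith("Unknown"):
            names.append(match.group("name"))
    return names


# Sample rows copied from the output posted in this thread.
sample = """\
2023_07_12_00_23_34 Unknown N/A N/A N/A
2023_07_12_00_16_01 Watchdog (watchdog, description: gpi 6 detailed fault - watchdog, time: 2023-07-12 00:14:03) N/A N/A N/A
2023_07_11_23_16_40 reboot Tue 11 Jul 2023 11:14:36 PM UTC admin N/A
"""

print(unknown_cause_entries(sample))
```

Running this against the history captured right after a pipeline run may make it easier to line up each 'Unknown' entry with the test that triggered the reboot.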
I am unable to reproduce this issue with platform_tests/test_reboot.py::test_continuous_reboot
I'm still seeing this as of today:
    if reboot_type is not None:
        logging.info("Check reboot cause")
        assert wait_until(MAX_WAIT_TIME_FOR_REBOOT_CAUSE, 20, 30, check_reboot_cause, dut, reboot_type), \
            "got reboot-cause failed after rebooted by %s" % reboot_type
E   AssertionError: got reboot-cause failed after rebooted by cold

dut = MultiAsicSonicHost str2-7804-sup-1
interfaces = {}
interfaces_wait_time = 800
reboot_type = 'cold'
xcvr_skip_list = {'str2-7804-lc3-1': [], 'str2-7804-lc5-1': [], 'str2-7804-lc7-1': [], 'str2-7804-sup-1': []}

platform_tests/test_reboot.py:143: AssertionError
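To make the failing assertion easier to reason about, here is a rough sketch of the polling pattern it relies on. It assumes the positional arguments in the `wait_until` call above are timeout, polling interval, and initial delay, in that order. `wait_until_sketch` and `fake_check_reboot_cause` are hypothetical stand-ins written for illustration, not the sonic-mgmt implementation.

```python
import time


def wait_until_sketch(timeout, interval, delay, condition, *args, **kwargs):
    """Poll condition(*args, **kwargs) until it is truthy or the timeout expires.

    Illustrative stand-in only; the real sonic-mgmt helper may behave differently.
    """
    if delay > 0:
        time.sleep(delay)  # initial wait before the first check
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition(*args, **kwargs):
            return True
        time.sleep(interval)
    return False


def fake_check_reboot_cause(observed, expected):
    """Hypothetical check: take the next observed cause and compare it with the expected one."""
    cause = observed.pop(0) if observed else "Unknown"
    return cause == expected


# The cause becomes 'reboot' on the third poll, so the wait succeeds.
recovers = wait_until_sketch(1, 0.01, 0, fake_check_reboot_cause,
                             ["Unknown", "Unknown", "reboot"], "reboot")

# The cause stays 'Unknown' for the whole window, so the wait fails,
# which is the situation the assertion above reports.
stuck = wait_until_sketch(0.05, 0.01, 0, fake_check_reboot_cause, [], "reboot")
```

Under these assumptions, the assertion fails only when the expected cause never appears within the whole polling window, which would be consistent with the sup reporting 'Unknown' for the entire wait.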
Could you send the contents of /var/log/arista*.log on the DUT when you see this failure (you can e-mail to pmacarthur@arista.com)?
I will send the logs by EOD. Since this is a sup reboot failure, reproducing it requires taking the whole chassis.
This is not seen on the sup of another chassis with the same SKU:
22:22:37 reboot.check_reboot_cause_history L0419 INFO | index: 0, reboot cause: 'reboot'|Non-Hardware (reboot|^reboot, reboot cause from DUT: reboot PASSED
@wenyiz2021 , @kenneth-arista - is this issue still there?
@rlhui This seems to be an issue only with vms26, the str2 chassis in our lab. Some reboot tests fail on this chassis, but the failures are not seen on the str3 chassis. I'm closing this issue.
AssertionError: got reboot-cause failed after rebooted by cold
@Staphylo, @patrickmacarthur @kenneth-arista