Closed lxwinspur closed 1 year ago
@mzipse @geissonator @ojayanth FYI
Auto-reboot is governed by the reboot policy. AutoReboot must be true for an automatic reboot to be initiated during the boot window.
root@xxxx:~# busctl get-property $(mapper get-service /xyz/openbmc_project/control/host0/auto_reboot) /xyz/openbmc_project/control/host0/auto_reboot xyz.openbmc_project.Control.Boot.RebootPolicy AutoReboot
b true
You also need to look at the host reboot counter value; by default it is three. @geissonator can comment on how this behaves. Upstream has support for updating this value via the Redfish API in case it is not being set correctly.
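Both checks can be scripted. The following is a minimal sketch: `busctl` and `mapper` are the real tools used above, but the helper names are hypothetical. It also assumes the reboot counter is exposed as `AttemptsLeft` on the `xyz.openbmc_project.Control.Boot.RebootAttempts` interface at `/xyz/openbmc_project/state/host0`, which should be verified on your build.

```shell
#!/bin/sh
# Hypothetical helpers; only busctl and mapper are real tools here.

# Strip the type prefix from busctl get-property output, e.g. "b true" -> "true".
busctl_value() {
    printf '%s\n' "$1" | awk '{ print $2 }'
}

# Read one D-Bus property, resolving the owning service with mapper.
get_prop() {
    path=$1
    iface=$2
    prop=$3
    service=$(mapper get-service "$path") || return 1
    out=$(busctl get-property "$service" "$path" "$iface" "$prop") || return 1
    busctl_value "$out"
}

# Run on the BMC (the second path/interface is an assumption, see above):
#   get_prop /xyz/openbmc_project/control/host0/auto_reboot \
#       xyz.openbmc_project.Control.Boot.RebootPolicy AutoReboot
#   get_prop /xyz/openbmc_project/state/host0 \
#       xyz.openbmc_project.Control.Boot.RebootAttempts AttemptsLeft
```

If the first call does not print true, the retry logic is not expected to start at all.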
Yes, I enabled auto_reboot and this problem still exists.
Please provide a bmc dump, or at least a journal so we can see what's going on. Reboot policy is only utilized if we get far enough into the boot.
Related logs and dump files are at https://github.com/ibm-openbmc/openbmc/issues/263
@lxwinspur I took a look at the logs, and it appears you aren't testing with the latest 1030.ips code? I put up a fix for the "why do we not switch to sbe side 1" issue via https://github.com/ibm-openbmc/phosphor-state-manager/commit/39d5673d6e8bedd12ac34e5b034d7abd2b939e03 and I verified that the bump is in the latest version of meta-phosphor/recipes-phosphor/state/phosphor-state-manager_git.bb in 1030.ips, but I don't see the new traces I added for that in the journal data from #263?
@geissonator
"it appears you aren't testing with the latest 1030.ips code?"
No. For this issue, I am testing on the latest 1030.ips branch (9a5e35fe9c1dbe8f278e728819abbf8e9c1f82ef).
Hmm, I'm not sure what's going on then @lxwinspur, if you look at my commit in https://github.com/ibm-openbmc/phosphor-state-manager/commit/39d5673d6e8bedd12ac34e5b034d7abd2b939e03 you can see the change I made to the log when that script does a quiesce. Your journal showed the older log (without the "and host crashed"). Please double check your level of firmware and maybe look at that script, host-reboot, on your system to ensure it has the new logic.
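One quick way to tell which version of the script produced a given journal is to search it for the new trace text. This is a minimal sketch, assuming the phrase "and host crashed" appears only in the updated log line as described above. The function name is hypothetical.

```shell
# Hypothetical helper: succeeds if the journal text on stdin contains
# the wording added by the fix ("and host crashed").
has_new_quiesce_trace() {
    grep -q "and host crashed"
}

# On the BMC, for example:
#   journalctl | has_new_quiesce_trace && echo "new host-reboot logic present"
```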
@sampmisr FYI
After updating and applying the following fix, the problem is solved:
https://github.com/ibm-openbmc/openbmc/commit/2a0c1837053f01c748d838b72185073dd75baf07
The current logic is: if the BMC reboot fails three times, it automatically switches to SBE 1 (this logic assumes that SBE 0 is broken).
However, we encountered the following behaviour. When the BMC executes host power on and SBE 0 turns out to be broken, the expected behaviour is that the BMC automatically restarts and retries up to three times, and switches to SBE 1 if all attempts fail. Instead, when the first power on fails, the BMC gets stuck after the SBE 0 startup failure and does not restart automatically. The BMC reboot is therefore never executed, so the switch to SBE 1 never happens. Is this a problem?
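To make the expected behaviour concrete, here is a toy model of the logic described above. It is only a sketch of the description in this comment, not the actual phosphor-state-manager implementation. The function name and the output strings are illustrative assumptions.

```shell
MAX_ATTEMPTS=3   # default retry count mentioned in this thread

# next_action FAILED_BOOTS
# Prints what the BMC is expected to do after FAILED_BOOTS failed attempts.
next_action() {
    failed=$1
    if [ "$failed" -lt "$MAX_ATTEMPTS" ]; then
        echo "retry on SBE 0"
    else
        echo "switch to SBE 1"
    fi
}

next_action 1   # expected: another attempt on SBE 0
next_action 3   # expected: switch sides
```

In the failure reported here, the sequence stops after the first failed power on: no retry is triggered, so the failure count never reaches three and the switch to SBE 1 is never taken.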