ibm-openbmc / openbmc


1030.ips: Host failed to power on #264

Closed lxwinspur closed 1 year ago

lxwinspur commented 1 year ago

We have a Rainier machine that we use to develop eBMC, and we have run into a problem. After reflashing the BMC firmware several times, the Host fails to power on and there is no output on the Host console. According to the log, the SBE has been executed, but the BMC is still running and has not received the getTID command that the Host sends to establish a heartbeat. I'm not sure whether this is a communication problem or whether the SBE failed to start. Since then I have tried switching to the IBM release image, a factory reset, an AC cycle, and other attempts, but the problem persists. (I am not sure whether some residual state is left behind after reflashing the firmware. Is there a way, similar to eflash, to test for this?)

A few days later, without us doing anything to the machine, the Host console strangely started producing output again. Do you know why? Attached are the log of a normal boot and the log of a failed boot. Please help take a look, thank you.

lxwinspur commented 1 year ago

p10_poweron.tar.gz

lxwinspur commented 1 year ago

@mzipse @anoo1 @gtmills FYI

ojayanth commented 1 year ago

@lxwinspur please share a BMC user-initiated dump.

lxwinspur commented 1 year ago

phosphor-debug-collector.tar.gz

ojayanth commented 1 year ago

Please include a BMC dump. Run "dreport -v" in a BMC terminal to capture this data.
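For reference, the request above can be wrapped in a small shell function. This is only a sketch: `dreport -v` is the command named in this thread, while the function name `collect_bmc_dump`, the availability check, and the error message are illustrative assumptions rather than part of any OpenBMC tooling.

```shell
#!/bin/sh
# Sketch only: wraps the "dreport -v" command mentioned in this thread.
# The function name and the availability check are hypothetical additions.
collect_bmc_dump() {
    # dreport ships on the BMC image; fail early if it is not on PATH.
    if ! command -v dreport >/dev/null 2>&1; then
        echo "dreport not found; run this on the BMC" >&2
        return 1
    fi
    # Run the collection exactly as requested in the thread.
    dreport -v
}
```

Calling `collect_bmc_dump` from a BMC shell runs the collection; on any other machine it prints the error and returns 1. The resulting archive can then be attached to the issue.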

lxwinspur commented 1 year ago

> Please include a BMC dump. Run "dreport -v" in a BMC terminal to capture this data.

This is an intermittent issue, and now that the machine has recovered, we do not have much log information left. We will capture more information the next time it occurs.

ojayanth commented 1 year ago

In the future, it is good to include a BMC user-initiated dump as part of the GitHub issue. That will help narrow the issue down to the subsystem level (BMC/SBE/Host).

lxwinspur commented 1 year ago

After updating and applying the following fix, the problem is solved:

https://github.com/ibm-openbmc/openbmc/commit/2a0c1837053f01c748d838b72185073dd75baf07