Closed: madscientist159 closed this issue 6 years ago
Do you happen to have a Cronus debug connection? If not, we'll have to keep our eyes open for a fail here. Unfortunately this kind of fail is one of the most annoying to debug.
@dcrowell77 Yes, we have Cronus. How should we proceed with debug?
Downgrading hostboot to 1e784c03824d66dd76ee5effe16b55782c703599 appears to bypass this issue.
Which commit did you first notice the fail on?
Using our debug tools on top of Cronus, run all these from inside your Cronus session:
export PROJECT_ROOT=
I suspect that you'll see some kind of exception in the Printk output.
Failure is seen on GIT hash 739ec89c67cde105301ab9aa11adf2c420efa6eb
Thanks for the instructions, will see if I can get time to try it out shortly.
git bisect start
# good: [1e784c03824d66dd76ee5effe16b55782c703599] Handle early life PNOR fails in HBRT instead of hanging
git bisect good 1e784c03824d66dd76ee5effe16b55782c703599
# bad: [739ec89c67cde105301ab9aa11adf2c420efa6eb] When FSI initialized by SP only use enable reg for detection
git bisect bad 739ec89c67cde105301ab9aa11adf2c420efa6eb
# good: [744277d9a5c546340a011ea36a18471bd3cdcb85] Enhance p9_extract_sbe_rc
git bisect good 744277d9a5c546340a011ea36a18471bd3cdcb85
# bad: [18dba5172c7d022d5b5b119d758fe167868cb00d] PRD: getConnectedDimm support for MBA/MCA
git bisect bad 18dba5172c7d022d5b5b119d758fe167868cb00d
# good: [e84f5604125d704d098efbea74f8368060be593d] Ensure runtime lib is loaded for IPC_POPULATE_TPM_INFO_BY_NODE
git bisect good e84f5604125d704d098efbea74f8368060be593d
# bad: [cde4990515a7a190fca7a3eb9f722f74c12acdb2] Cleanup the fix for "zero length dump on single node systems".
git bisect bad cde4990515a7a190fca7a3eb9f722f74c12acdb2
# bad: [f5cd23d6c3be17356e0851ec5d5bb65cee48f15f] Mark Read-Only Partitions as Such
git bisect bad f5cd23d6c3be17356e0851ec5d5bb65cee48f15f
# first bad commit: [f5cd23d6c3be17356e0851ec5d5bb65cee48f15f] Mark Read-Only Partitions as Such
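For anyone re-checking this later, here is a minimal sketch for pulling the result back out of a saved copy of the bisect log above. It assumes the log was saved to a file in the format `git bisect log` prints; the helper names `first_bad_commit` and `bisect_steps` are made up for illustration and are not part of git or hostboot.

```shell
#!/bin/sh
# Hypothetical helpers for reading a saved `git bisect log` file.
# They only parse text; they do not touch any git repository.

# Print the 40-character hash from the "# first bad commit: [...]" line.
# Prints nothing if the bisect had not finished when the log was saved.
first_bad_commit() {
    sed -n 's/^# first bad commit: \[\([0-9a-f]\{40\}\)\].*/\1/p' "$1"
}

# Count how many good/bad verdicts were recorded in the log.
bisect_steps() {
    grep -Ec '^git bisect (good|bad) ' "$1"
}

# Example (assumes the log above was saved as bisect.log):
#   first_bad_commit bisect.log
#   bisect_steps bisect.log
```

The same saved file can be fed to `git bisect replay` inside a hostboot checkout to restore the bisect state. One way to confirm the culprit is then `git revert` of the printed hash on top of the failing tip.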
Just confirming: is that indeed the commit that, if reverted, fixes things?
Yep, it's f5cd23d6c3be17356e0851ec5d5bb65cee48f15f. If I revert that one commit, I can boot. Disable the revert and I cannot.
I posted something in the internal Gerrit that backs out that commit... hopefully some Hostboot folk can point out how I'm incredibly wrong somehow. It looks like I'll have to bring this in as a patch in op-build for a bit though, as otherwise it breaks booting my Boston DD2.2 system.
We are in the process of updating our Talos PNOR to use the latest upstream components. The built PNOR is not updating the SBE as expected, and is failing to IPL in hostboot. This is with the latest hostboot GIT master.
Still investigating cause.