3mdeb / openpower-coreboot-docs

Documentation related to POWER9 coreboot porting effort
Other
12 stars 3 forks source link

Weird behavior with prebuilt-scat : no logs in output file created by redirection #81

Open tlaurion opened 1 year ago

tlaurion commented 1 year ago

Hello there @SergiiDmytruk @krystian-hebel

Trying to get scat working to collect logs when ssh connection is closing when power states are changing.

First things first, I came to realize that scat was killed by SIGHUP, so nohup is still required.

The instructions at: https://github.com/3mdeb/openpower-coreboot-docs/edit/main/devnotes/scat/README.md

Are not working, was going to propose a PR, but even with nohup, and scat being there between ssh disconnections, there are no logs collected?

Here are my modifications to your notes:

#Download scat-prebuilt from github and copy it over talos's BMC through SSH
$ wget https://github.com/3mdeb/openpower-coreboot-docs/raw/main/devnotes/scat/prebuilt-scat
$ chmod +x prebuilt-scat
$ scp prebuilt-scat root@talos:/tmp/scat
#Otherwise build and scp scat to /tmp/scat
$ ssh root@talos
#Run scat through nohup so that it is detached from SSH's TTY and can resist disconnects
#Uncompressed output for normal log trace without a debug build
$ nohup /tmp/scat > /tmp/boot-log &
#Compressed output for builds with debugging information when requested
$ nohup /tmp/scat | gzip > /tmp/boot-log.gz &

But neither scat compressed nor uncompressed boot-log output is being produced, neither by hostboot (flash vpnor) or coreboot+heads. The log file stays at file size of 0 bytes.

Of course, obmc-util-client works and outputs on console. But the whole goal of scat was to capture logs when obmc-util-client could not.

Can you replicate with prebuilt-scat ? Thanks.

SergiiDmytruk commented 1 year ago

Worked without nohup for me. Does running /tmp/scat print any logs to the terminal? It should work as obmc-console-client (maybe cache output a bit differently), just not require a tty. I wonder if we have the same BMC firmware version.

Firmware Version
    2.7.0-dev-571-g67efd9872
tlaurion commented 1 year ago

@SergiiDmytruk /tmp/scatprints nothing at launch.

obmcutil poweron on another terminal prints nothing in scat either. obmcutil poweroff disconnects ssh terminal but doesn't update scat output. Basically, prebuilt-scat is blind on my openbmc

Trying to figure out how to get openbmc version remotely from bmc

SergiiDmytruk commented 1 year ago

Trying to figure out how to get openbmc version remotely from bmc

As usual:

cat /etc/os-release

/tmp/scatprints nothing at launch.

That's expected, it should only print something after poweron.

tlaurion commented 1 year ago

@SergiiDmytruk

root@talos:~# cat /etc/os-release  
ID="openbmc-phosphor"
NAME="Phosphor OpenBMC (Phosphor OpenBMC Project Reference Distro)"
VERSION="2.7.0-dev"
VERSION_ID="2.7.0-dev-571-g67efd9872"
PRETTY_NAME="Phosphor OpenBMC (Phosphor OpenBMC Project Reference Distro) 2.7.0-dev"
BUILD_ID="2.7.0-dev"
OPENBMC_TARGET_MACHINE="talos"
tlaurion commented 1 year ago

@SergiiDmytruk So same version different behavior. Can you try the prebuilt-scat?

tlaurion commented 1 year ago

@SergiiDmytruk

root@talos:~# ps faux | grep obmc
root      1123  0.0  0.5   3120  2204 ttyS5    Ss+  Sep13   0:18 obmc-console-server --config /etc/obmc-console.conf ttyVUART0
root      1165  0.8  1.3  20744  5944 ?        Ssl  Sep13 198:04 /usr/bin/obmc-ikvm -v /dev/video0 -k /dev/hidg0 -p /dev/hidg1
root      1236  0.0  0.4   2976  1996 ?        Ss   Sep13   0:04 /usr/sbin/obmc-uart-render-controller /sys/devices/platform/ahb/ahb:apb/1e6e2000.syscon/1e6e2000.syscon:scratch/vga0 /sys/devices/platform/ahb/ahb:apb/1e6e2000.syscon/1e6e2000.syscon:scratch/dac_mux fbterm.service
root     11780  0.0  0.2   2944  1284 pts/1    S+   18:16   0:00      \_ grep obmc
tlaurion commented 1 year ago

@SergiiDmytruk

root@talos:~# cat /etc/obmc-console.conf
lpc-address = 0x3f8
sirq = 4
local-tty = ttyS0
local-tty-baud = 115200
SergiiDmytruk commented 1 year ago

It works here:

root@talos:~# wget https://github.com/3mdeb/openpower-coreboot-docs/raw/main/devnotes/scat/prebuilt-scat               
Connecting to github.com (140.82.121.4:443)                                                                            
wget: note: TLS certificate validation not implemented                                                                 
Connecting to raw.githubusercontent.com (185.199.109.133:443)                                                          
prebuilt-scat        100% |***********************************************************************************************************************************************************************************************|  8192  0:00:00 ETA
root@talos:~# chmod +x prebuilt-scat                                                                                   
root@talos:~# ./prebuilt-scat                                                                                          

coreboot-raptor-cs_talos-2_v0.6.0 Fri Aug 19 15:00:24 UTC 2022 bootblock starting (log level: 7)...                    
FFS header at 0x8006000ffff7000                                                                                        
PNOR base at 0x80060300fc000000                                                                                        
HBI partition has ECC                                                                                                  
HBI is in 0x00426200 through 0x0175f037              

And /etc/obmc-console.conf is the same.

tlaurion commented 1 year ago

killing obmc-console-server made /tmp/scat work without redirection or putting in background, outputing coreboot+heads logs on console until those lines:

[   16.425424317,7] LC: Routing irq 4, policy: 0 (r=1)
[   16.425425388,7] LPC: SerIRQ 4 using route 1 targetted at OPAL

Then obmcutil poweroff and /tmp/scat until

[   15.226606635,7] LC: Routing irq 4, policy: 0 (r=1)
[   15.226607571,7] LPC: SerIRQ 4 using route 1 targetted at OPAL
Timeout, server server.onion not responding.

So socket on which scat is binding for input is ok without killing obmc-console-server.

Trying to redirect output and putting in background without compression: First console: /tmp/scat > /tmp/boot-log & Second console: tail -f /tmp/boot-log Third console:....obmcutil poweron

Meanshile and prior launching poweron, first console lost ssh. So ps faux | grep scat on third console confirms that scat died.

Redoing.... First console: /tmp/scat Second console: obmcutil poweron Output on first console up until:

[   15.228202173,5] STB: trusted boot not supported
[   15.228243613,7] LC: Routing irq 4, policy: 0 (r=1)
[   15.228244613,7] LPC: SerIRQ 4 using route 1 targetted at OPAL

So we still have socket.


Redoing with redirection First console: rm /tmp/boot-log && /tmp/scat > /tmp/boot-log & Second console: tail -f /tmp/boot-log Third console: obmcutil poweron

Return to second console: Shows output up until ssh disconnection:

Loading segment from ROM address 0xf8380000
  code (compression=1)
  New segment dstaddr 0x00000000 memize 0x173c00 srcaddr 0xf8380054 filesize 0x632a5
Loading Segment: addr: 0x00000000 memsz: 0x00000000Timeout, server

Unfortunately, scat got killed with ssh session.


Redoing with redirection and nohup (no compression) (issuing obmcutil poweroff on first console and waiting for ssh to return is implied between each test)

First console: rm /tmp/boot-log && nohup /tmp/scat > /tmp/boot-log & Second console: tail -f /tmp/boot-log Third console: obmcutil poweron

Return to second console: Shows output up until ssh disconnection:

Loading segment from ROM address 0xf8380000
  code (compression=1)
  New segment dstaddr 0x00000000 memize 0x173c00 srcaddr 0xf8380054 filesize 0x632a5
Loading Segment: addr: 0x00000000 memsz: 0x0000000Timeout, server 

Automatic reconnection on First console shows:

root@talos:~# ps faux | grep scat
root     14450  0.0  0.2   1720   952 ?        S    19:16   0:00 /tmp/scat
root     14753  0.0  0.2   2944  1284 pts/2    S+   19:18   0:00      \_ grep scat

Good. nohup made scat survive

Automatic reconnection on Second console shows:

root@talos:~# cat /tmp/boot-log
st type 0x

coreboot--TIMELESS--LESSTIME--Heads-v5.0.1-212-g585e581d Thu Jan  1 00:00:00 UTC 1970 bootblock starting (log leel: 7)...
FFS header at 0x80060300ffff7000
PNOR base at 0x80060300fc000000
HBI partition has ECC
HBI is in 0x00426200 throgh 0x0175f037
FMAP: Found "FLASH" version 1.1 at 0x20000.
FMAP: base = 0x0 size = 0x200000 #areas = 4
FMAP: area COREBOOT fund @ 20200 (1965568 bytes)
CBFS: mcache @0xf8231000 built for 10 files, used 0x1f0 of 0x2000 bytes
CBFS: Found 'fallback/rostage' @0x80 size 0x124a1 in mcache @0f823102c
BS: bootblock times (exec / console): total (unknown) / 2 ms

coreboot--TIMELESS--LESSTIME--Heads-v5.0.1-212-g58e581d Thu Jan  1 00:00:00 UTC 1970 romstage starting (log level: 7)...
IPMI: romstage PNP BT 0xe4
Get BMC self test result..Function Not Implemented
Initializing IPMI BMC watchog timer
IPMI BMC watchdog initialized and started.
Initializing FSI...
Initialized FSI (chips mask: 0x01)
BuildingMVPDs...
starting istep 8.1
starting istep 8.2
starting istep 8.3
starting istep 8.4
starting istep 8.9
Base epsilon vales read from table:
 R_T[0] = 22
 R_T[1] = 22
 R_T[2] = 103
 W_T[0] = 5
 W_T[1] = 33
Scaled epsilon values based on +20 ercent guardband:
 R_T[0] = 27
 R_T[1] = 27
 R_T[2] = 124
 W_T[0] = 6
 W_T[1] = 40
starting istep 8.10
starting istep 811
starting istep 9.2
starting istep 9.4
starting istep 9.6
starting istep 9.7
starting istep 10.1
starting istep 10.6
tarting isep 10.10
starting istep 10.12
starting istep 10.13
FFS header at 0x80060300ffff7000
PNOR base at 0x80060300fc000000
MEMD artition has ECC
MEMD is in 0x03cef200 through 003cfb917
I2C transfer failed to complete (0x04011f0104000000)
No memory DIMM at address 51
I2C transfer failed to complete 0x04011f0104000000)
No memory DIMM at address 52
I2C transfer failed to complete (0x04011f0104000000)
No memory DIMM at addess 53
I2C transfer failed to complete (0x04011f0104000000)
No memory DIMM at address D4
I2C transfer failed to complete (004011f0104000000)
No memory DIMM at address D5
I2C transfer failed to complete (0x04011f0104000000)
No memory DIMM at addres D6
I2C transfer failed to complete (0x04011f0104000000)
No memory DIMM at address D7
SPD @ 0x50
SPD: module type is DDR4
SPD: module part number is M393A1K43BB0-CRC    
SPD: banks 16, ranks 1, rows 16, columns 10, density 8192 Mb
SPD: device with 8 bits, bus width 64 bits
SPD: module size is 8192 MB (per channel)
starting istep 13.1
starting itep 13.2
starting istep 13.3
starting itep 13.4
starting istep 13.5
starting istep 13.6
starting istep 13.7
starting istep 13.8
starting istep 13.9
starting isep 13.10
CCS took 2 us (3 us timeout), 1 instrction(s)
CCS took 2 us (2 us timeout), 14 instrction(s)
RCD dump for I2C address 0x58:
0xf820fa30: 80 b3 40 42 30 00 00 00 02 01 00 03 cb e3 c0 0d  ..@B0...........
0xf82fa40: 00 00 39 00 00 00 00 00 00 00 07 00 00 00 00 00  ..9.............
starting istep 13.11
CCS took 2 us (7 us timeout), 2instruction(s)
Write Leveling starting
CCS took 12 us (92 us timeout), 5 instruction(s)
Write Leveling done
Initial Patter Write starting
CCS took 5 us (38 us timeout), 5 instruction(s)
Initial Pattern Write done
DQS alignment starting
CCS took10 us (44 us timeout), 1 instruction(s)
DQS alignment done
Read Clock Alignment starting
CCS took 8 us (82 us timeout), 1 istruction(s)
Read Clock Alignment done
Read Centering starting
CCS took 36 us (120 us timeout), 1 instruction(s)
Read Centring done
Write Centeringstarting
CCS took 8404 us (11314 us timeout), 7 instruction(s)
Write Centering done
Coarse write/read starting
CCS took 5 s (24 us timeout), 1 instruction(s)
Coarse write/read done
starting istep 13.12
starting istep 13.13
starting itep 14.1
starting istep 14.2
starting istep 14.3
Initializing PEC0...
Initializing PEC1...
Initializing PEC2...
Initialiing PHB0...
Initializing PHB1...
Initializing PHB2...
Initializing PHB3...
Initializing PHB4...
Initializing PHB5...
stating istep 14.4
starting istep 14.5
0xF000F = 221d104900008040
CBMEM:
IMD: root @ 0xffeff000 254 entries.
IMD: root @ 0xfefec00 62 entries.
FMAP: area COREBOOT found @ 20200 (1965568 bytes)
FFS header at 0x80060300ffff7000
PNOR base at 0x8006030fc000000
HBI partition has ECC
HBI is in 0x00426200 through 0x0175f037
CBFS: Found 'fallback/ramstage' @0x12580 size 0xcca in mcache @0f8231080
BS: romstage times (exec / console): total (unknown) / 17 ms

coreboot--TIMELESS--LESSTIME--Heads-v5.0.1-212-g58e581d Thu Jan  1 00:00:00 UTC 1970 ramstage starting (log level: 7)...
Enumerating buses...
Root Device scanning...
DD21, botcore: 18
FFS header at 0x80060300ffff7000
PNOR base at 0x80060300fc000000
HCODE partition has ECC
HCODE is in 0x01a82200 trough 001b821ff
FFS header at 0x80060300ffff7000
PNOR base at 0x80060300fc000000
OCC partition has ECC
OCC is in 0x03822200 throuh 0039221ff
Base epsilon values read from table:
 R_T[0] = 22
 R_T[1] = 22
 R_T[2] = 103
 W_T[0] = 5
 W_T[1] = 33
Scaled esilon values based on +20 percent guardband:
 R_T[0] = 27
 R_T[1] = 27
 R_T[2] = 124
 W_T[0] = 6
 W_T[1] = 40
FFS headerat 0x80060300ffff7000
PNOR base at 0x80060300fc000000
WOFDATA partition has ECC
WOFDATA is in 0x039ea200 through 0x03c94ca7
Matching WOF tables section not found, disabling WOF
Safe mode freq = 1833412 kHZ, voltage = 711 mv
starting istep 15.2
strting istep 15.3
starting istep 15.4
Starting PM cmplex...
Attempting PGPE activation...
PGPE was activated successfully
Done starting PM complex
Activatig OCC...
Done activating OCC
starting istep 16.1
XIVE configured, entering deadman loop
ASSERTION ERROR: file '(filenames not available on timeless builds)',line 404
starting istep 18.11
starting istep 18.12
CPU_CLUSTER: 0 enabled
PNP: 00e4.0 enabled
scan_bus: bus Root Device fnished in 3071 msecs
done
BS: BS_DEV_ENUMERATE run times (exec / console): 3065 / 7 ms
Allocating resources...
Reading resurces...
CPU_CLUSTER: 0 missing read_resources
Done reading resources.
CPU_CLUSTER: 0 missing set_resources
PNP: 00e4.0 00<- [0x00000000e4 - 0x00000000e6] size 0x00000003 gran 0x00 io
Done setting resources.
Done allocating resources.
BS: BS_DEVRESOURCES run times (exec / console): 0 / 1 ms
Enabling resources...
done.
Initializing devices...
PNP: 00e4.0 init
IPMI:PNP BT 0xe4
Get BMC self test result...Function Not Imlemented
IPMI: Found man_id 0x6cb0000, prod_id 0x1000000
IPMI: Version 2.0
PNP: 00e4.0 init finished in 22 msecs
Devices iitialized
BS: BS_DEV_INIT run times (exec / console): 22 / 1 ms
Finalize devices...
Devices finalized
Writing coreboot tabe at 0xffed9000
 0. 0000000000000000-00000000f8207fff: RAM
 1. 00000000f8208000-00000000f820ffff: RAMSTAGE
 2. 00000000f821000-00000000f8ffffff: RAM
 3. 00000000f9000000-00000000f929bfff: RAMSTAGE
 4. 00000000f929c000-00000000ffed8fff: RAM
 5. 0000000ffed9000-00000000ffefffff: CONFIGURATION TABLES
 6. 00000000fff00000-00000001ff3fffff: RAM
 7. 00000001ff400000-0000000ffffffff: RESERVED
FMAP: area COREBOOT found @ 20200 (1965568 bytes)
FFS header at 0x80060300ffff7000
PNOR base at 0x800603fc000000
HBI partition has ECC
HBI is in 0x00426200 through 0x0175f037
Wrote coreboot table at: 0xffed9000, 0x214 bytes, chcksum c9ae
coreboot table: 556 bytes.
IMD ROOT    0. 0xffeff000 0x00001000
IMD SMALL   1. 0xffefe000 0x00001000
CONSOLE    2. 0xffede000 0x00020000
MEM INFO    3. 0xffedb000 0x000022b0
COREBOOT    4. 0xffed9000 0x00002000
IMD small region:
  IM ROOT    0. 0xffefec00 0x00000400
  RO MCACHE   1. 0xffefea00 0x000001f0
  FMAP        2. 0xffefe920 0x000000e0
BS: BS_WRIT_TABLES run times (exec / console): 1 / 5 ms
CBFS: Found 'fallback/payload' @0x257c0 size 0x632f9 in mcache @0ffefeb84
Checking segment from ROM address 0xf8380000
Checking segment from ROM address 0xf838001c
Checking segment from RO address 0xf8380038
Loading segment from ROM address 0xf8380000
  code (compression=1)
  New segment dstaddr 0x00000000 memize 0x173c00 srcaddr 0xf8380054 filesize 0x632a5
Loading Segment: addr: 0x00000000 memsz: 0x0000000000173c00 filesz: 0x0000000000632a5
uing LZMA
Loading segment from ROM address 0xf838001c
  BSS 0x00300000 (113776 byte)
Loading Segment: addr: 0x00300000 memsz 0x000000000001bc70 filesz: 0x0000000000000000
it's not compressed!
Clearing Segment: addr: 0x0000000000300000 memsz: 0x00000000001bc70
Loading segment from ROM address 0xf8380038
  Entry Point 0x00002570
BS: BS_PAYLOAD_LOAD run times (exec / consle): 301 / 4 ms
Jumping to boot code at 0x00002570(0xffed9000)
CBFS: Found '1-cpu.dtb' @0x1f700 size 0x24e7 in mcache @0[   11.303765993,5] OPAL v6.3-rc2-32-gfa060c2c starting...
[   11.303771495,6] CPU: P9 generation processor (max 4 threads/core)
[   11.303773282,7] CPU: Boot CPU PIR is 0x0048 PVR is x004e1201
[   11.303775782,7] OPAL table: 0x30108230 .. 0x30108810, branch table: 0x30002000
[   11.303778871,7] Assigning pysical memory map table for nimbus
[   11.303781301,7] FDT: Parsing fdt @0xf90a55f8
[   11.304483067,6] CHIP: Initialised chp 0 from xscom@603fc00000000
[   11.304507066,6] P9 DD2.10 detected
[   11.304510618,7] XSCOM: Base address: 0x603fc00000000
[   11.304519370,7] XSTOP: XSCOM addr = 0x5012000, FIR bit = 31
[  11.304521472,6] MFSI 0:0: Initialized
[   11.304522881,6] MFSI 0:2: Initialized
[   11.304524288,6] MFSI 0:1: Initialized
[  11.304578142,6] LPC: LPC[000]: Initialized
[   11.304579551,7] LPC: access via MMIO @0x6030000000000
[   11.304592464,7] LC: Default bus on chip 0x0
[   11.304599342,7] CPU: New max PIR set to 0x57
[   11.304629709,6] MEM: parsing reserved memoryfrom node /reserved-memory
[   11.304634034,7] HOMER: Init chip 0
[   11.304636412,7]   PBA BAR0 : 0x00000001ff400000
[   1.304638152,7]   PBA MASK0: 0x0000000000300000
[   11.304639854,7]   HOMER Image at 0x1ff400000 size 4MB
[   11.304642741,7]  PBA BAR2 : 0x00000001ff800000
[   11.304644421,7]   PBA MASK2: 0x0000000000700000
[   11.304646051,7]   OCC Common Area at x1ff800000 size 8MB
[   11.304648383,7] CPU: decrementer bits 56
[   11.304652894,6] CPU: CPU from DT PIR=0x0048 Server#=0x4 State=3
[   11.304658200,6] CPU:  4 secondary threads
[   11.304666828,6] CPU:  4 secondary threads4c Server#=0x4c State=3
[   11.304669816,6] CPU: CPU from DT PIR=0x0050 Server#=0x50 State=3
[   11.30675296,6] CPU:  4 secondary threads
[   11.304678298,6] CPU: CPU from DT PIR=0x0054 Server#=0x54 State=3
[   11.304683802,6]CPU:  4 secondary threads
[   11.304731310,6] PLAT: AST SIO unavailable!
[   11.304743805,7] UART: Using LPC IRQ 4
[   11.36811831,5] PLAT: Detected Talos platform
[   11.414788292,7] LPC: Routing irq 10, policy: 0 (r=1)openbmc
[   12.317685306,5] HIOMAP: Negotiated hiomap protocol v2d at OPAL
[   12.317744661,5] HIOMAP: Block size is 4KiB
[   12.317774109,5]HIOMAP: BMC suggested flash timeout of 8s
[   12.317816900,5] HIOMAP: Flash size is 64MiB
[   15.016010140,5] STB: secure boot not supportednamed) id:0
[   15.016066780,5] STB: trusted boot not supported
[   15.016119840,7] LC: Routing irq 4, policy: 0 (r=1)
[   15.016120765,7] LPC: SerIRQ 4 using route 1 targetted at OPAL
[   18.214036870,3] PCI: PHB  (80000) not foundining links...1 secureboot not yet initialized
[   18.214100054,3] PCI: PHB  (80001) not found
[   18.214141972,3] PCI: PHB (80003) not found
[   18.214181249,5root@talos:~#

Ok.... So seems like socket is dead at this point. But scat is still alive. So Second console: tail -f /tmp/boot-log Third console: obmcutil poweroff

On second console:

root@talos:~# tail -f  /tmp/boot-log
[   12.317774109,5]HIOMAP: BMC suggested flash timeout of 8s
[   12.317816900,5] HIOMAP: Flash size is 64MiB
[   15.016010140,5] STB: secure boot not supportednamed) id:0
[   15.016066780,5] STB: trusted boot not supported
[   15.016119840,7] LC: Routing irq 4, policy: 0 (r=1)
[   15.016120765,7] LPC: SerIRQ 4 using route 1 targetted at OPAL
[   18.214036870,3] PCI: PHB  (80000) not foundining links...1 secureboot not yet initialized
[   18.214100054,3] PCI: PHB  (80001) not found
[   18.214141972,3] PCI: PHB (80003) not found
[   18.214181249,5Timeout, server 

So socket listened from scat can be confirmed dead.

So: 1- without nohup, scat dies with the socket. 2- with nohup, scat survives the socket dying but is useless at this point.

Any insight? scat doesn't resolve the problem.

Console 1: obmcutil poweron && obmc-console-client

root@talos:~# obmcutil poweron && obmc-console-client 
st type 0x

coreboot--TIMELESS--LESSTIME--Heads-v5.0.1-212-g585e581d Thu Jan  1 00:00:00 UTC 1970 bootblock starting (log level: 7)...
FFS header at 0x80060300ffff7000
PNOR base at 0x80060300fc000000
HBI partition has ECC
HBI is in 0x00426200 through 0x0175f037
FMAP: Found "FLASH" version 1.1 at 0x20000.
FMAP: base = 0x0 size = 0x200000 #areas = 4
FMAP: area COREBOOT found @ 20200 (1965568 bytes)
CBFS: mcache @0xf8231000 built for 10 files, used 0x1f0 of 0x2000 bytes
CBFS: Found 'fallback/romstage' @0x80 size 0x124a1 in mcache @0xf823102c
BS: bootblock times (exec / console): total (unknown) / 2 ms

coreboot--TIMELESS--LESSTIME--Heads-v5.0.1-212-g585e581d Thu Jan  1 00:00:00 UTC 1970 romstage starting (log level: 7)...
IPMI: romstage PNP BT 0xe4
Get BMC self test result...Function Not Implemented
Initializing IPMI BMC watchdog timer
IPMI BMC watchdog initialized and started.
Initializing FSI...
Initialized FSI (chips mask: 0x01)
Building MVPDs...
starting istep 8.1
starting istep 8.2
starting istep 8.3
starting istep 8.4
starting istep 8.9
Base epsilon values read from table:
 R_T[0] = 22
 R_T[1] = 22
 R_T[2] = 103
 W_T[0] = 5
 W_T[1] = 33
Scaled epsilon values based on +20 percent guardband:
 R_T[0] = 27
 R_T[1] = 27
 R_T[2] = 124
 W_T[0] = 6
 W_T[1] = 40
starting istep 8.10
starting istep 8.11
starting istep 9.2
starting istep 9.4
starting istep 9.6
starting istep 9.7
starting istep 10.1
starting istep 10.6
starting istep 10.10
starting istep 10.12
starting istep 10.13
FFS header at 0x80060300ffff7000
PNOR base at 0x80060300fc000000
MEMD partition has ECC
MEMD is in 0x03cef200 through 0x03cfb917
I2C transfer failed to complete (0x04011f0104000000)
No memory DIMM at address 51
I2C transfer failed to complete (0x04011f0104000000)
No memory DIMM at address 52
I2C transfer failed to complete (0x04011f0104000000)
No memory DIMM at address 53
I2C transfer failed to complete (0x04011f0104000000)
No memory DIMM at address D4
I2C transfer failed to complete (0x04011f0104000000)
No memory DIMM at address D5
I2C transfer failed to complete (0x04011f0104000000)
No memory DIMM at address D6
I2C transfer failed to complete (0x04011f0104000000)
No memory DIMM at address D7
SPD @ 0x50
SPD: module type is DDR4
SPD: module part number is M393A1K43BB0-CRC    
SPD: banks 16, ranks 1, rows 16, columns 10, density 8192 Mb
SPD: device width 8 bits, bus width 64 bits
SPD: module size is 8192 MB (per channel)
starting istep 13.1
starting istep 13.2
starting istep 13.3
starting istep 13.4
starting istep 13.5
starting istep 13.6
starting istep 13.7
starting istep 13.8
starting istep 13.9
starting istep 13.10
CCS took 2 us (3 us timeout), 1 instruction(s)
CCS took 2 us (2 us timeout), 14 instruction(s)
RCD dump for I2C address 0x58:
0xf820fa30: 80 b3 40 42 30 00 00 00 02 01 00 03 cb e3 c0 0d  ..@B0...........
0xf820fa40: 00 00 39 00 00 00 00 00 00 00 07 00 00 00 00 00  ..9.............
starting istep 13.11
CCS took 2 us (7 us timeout), 2 instruction(s)
Write Leveling starting
CCS took 12 us (92 us timeout), 5 instruction(s)
Write Leveling done
Initial Pattern Write starting
CCS took 5 us (38 us timeout), 5 instruction(s)
Initial Pattern Write done
DQS alignment starting
CCS took 10 us (44 us timeout), 1 instruction(s)
DQS alignment done
Read Clock Alignment starting
CCS took 8 us (82 us timeout), 1 instruction(s)
Read Clock Alignment done
Read Centering starting
CCS took 37 us (120 us timeout), 1 instruction(s)
Read Centering done
Write Centering starting
CCS took 7646 us (11314 us timeout), 7 instruction(s)
Write Centering done
Coarse write/read starting
CCS took 5 us (24 us timeout), 1 instruction(s)
Coarse write/read done
starting istep 13.12
starting istep 13.13
starting istep 14.1
starting istep 14.2
starting istep 14.3
Initializing PEC0...
Initializing PEC1...
Initializing PEC2...
Initializing PHB0...
Initializing PHB1...
Initializing PHB2...
Initializing PHB3...
Initializing PHB4...
Initializing PHB5...
starting istep 14.4
starting istep 14.5
0xF000F = 221d104900008040
CBMEM:
IMD: root @ 0xffeff000 254 entries.
IMD: root @ 0xffefec00 62 entries.
FMAP: area COREBOOT found @ 20200 (1965568 bytes)
FFS header at 0x80060300ffff7000
PNOR base at 0x80060300fc000000
HBI partition has ECC
HBI is in 0x00426200 through 0x0175f037
CBFS: Found 'fallback/ramstage' @0x12580 size 0xcca6 in mcache @0xf8231080
BS: romstage times (exec / console): total (unknown) / 17 ms

coreboot--TIMELESS--LESSTIME--Heads-v5.0.1-212-g585e581d Thu Jan  1 00:00:00 UTC 1970 ramstage starting (log level: 7)...
Enumerating buses...
Root Device scanning...
DD21, boot core: 18
FFS header at 0x80060300ffff7000
PNOR base at 0x80060300fc000000
HCODE partition has ECC
HCODE is in 0x01a82200 through 0x01b821ff
FFS header at 0x80060300ffff7000
PNOR base at 0x80060300fc000000
OCC partition has ECC
OCC is in 0x03822200 through 0x039221ff
Base epsilon values read from table:
 R_T[0] = 22
 R_T[1] = 22
 R_T[2] = 103
 W_T[0] = 5
 W_T[1] = 33
Scaled epsilon values based on +20 percent guardband:
 R_T[0] = 27
 R_T[1] = 27
 R_T[2] = 124
 W_T[0] = 6
 W_T[1] = 40
FFS header at 0x80060300ffff7000
PNOR base at 0x80060300fc000000
WOFDATA partition has ECC
WOFDATA is in 0x039ea200 through 0x03c94ca7
Matching WOF tables section not found, disabling WOF
Safe mode freq = 1833412 kHZ, voltage = 711 mv
starting istep 15.2
starting istep 15.3
starting istep 15.4
Starting PM complex...
Attempting PGPE activation...
PGPE was activated successfully
Done starting PM complex
Activating OCC...
Done activating OCC
starting istep 16.1
XIVE configured, entering dead man loop
ASSERTION ERROR: file '(filenames not available on timeless builds)', line 404
starting istep 18.11
starting istep 18.12
CPU_CLUSTER: 0 enabled
PNP: 00e4.0 enabled
scan_bus: bus Root Device finished in 3073 msecs
done
BS: BS_DEV_ENUMERATE run times (exec / console): 3068 / 6 ms
Allocating resources...
Reading resources...
CPU_CLUSTER: 0 missing read_resources
Done reading resources.
CPU_CLUSTER: 0 missing set_resources
PNP: 00e4.0 00 <- [0x00000000e4 - 0x00000000e6] size 0x00000003 gran 0x00 io
Done setting resources.
Done allocating resources.
BS: BS_DEV_RESOURCES run times (exec / console): 0 / 1 ms
Enabling resources...
done.
Initializing devices...
PNP: 00e4.0 init
IPMI: PNP BT 0xe4
Get BMC self test result...Function Not Implemented
IPMI: Found man_id 0x6cb0000, prod_id 0x1000000
IPMI: Version 2.0
PNP: 00e4.0 init finished in 36 msecs
Devices initialized
BS: BS_DEV_INIT run times (exec / console): 36 / 2 ms
Finalize devices...
Devices finalized
Writing coreboot table at 0xffed9000
 0. 0000000000000000-00000000f8207fff: RAM
 1. 00000000f8208000-00000000f820ffff: RAMSTAGE
 2. 00000000f8210000-00000000f8ffffff: RAM
 3. 00000000f9000000-00000000f929bfff: RAMSTAGE
 4. 00000000f929c000-00000000ffed8fff: RAM
 5. 00000000ffed9000-00000000ffefffff: CONFIGURATION TABLES
 6. 00000000fff00000-00000001ff3fffff: RAM
 7. 00000001ff400000-00000001ffffffff: RESERVED
FMAP: area COREBOOT found @ 20200 (1965568 bytes)
FFS header at 0x80060300ffff7000
PNOR base at 0x80060300fc000000
HBI partition has ECC
HBI is in 0x00426200 through 0x0175f037
Wrote coreboot table at: 0xffed9000, 0x214 bytes, checksum c9ae
coreboot table: 556 bytes.
IMD ROOT    0. 0xffeff000 0x00001000
IMD SMALL   1. 0xffefe000 0x00001000
CONSOLE     2. 0xffede000 0x00020000
MEM INFO    3. 0xffedb000 0x000022b0
COREBOOT    4. 0xffed9000 0x00002000
IMD small region:
  IMD ROOT    0. 0xffefec00 0x00000400
  RO MCACHE   1. 0xffefea00 0x000001f0
  FMAP        2. 0xffefe920 0x000000e0
BS: BS_WRITE_TABLES run times (exec / console): 1 / 5 ms
CBFS: Found 'fallback/payload' @0x257c0 size 0x632f9 in mcache @0xffefeb84
Checking segment from ROM address 0xf8380000
Checking segment from ROM address 0xf838001c
Checking segment from ROM address 0xf8380038
Loading segment from ROM address 0xf8380000
  code (compression=1)
  New segment dstaddr 0x00000000 memsize 0x173c00 srcaddr 0xf8380054 filesize 0x632a5
Loading Segment: addr: 0x00000000 memsz: 0x0000000000173c00 filesz: 0x00000000000632a5
using LZMA
Loading segment from ROM address 0xf838001c
  BSS 0x00300000 (113776 byte)
Loading Segment: addr: 0x00300000 memsz: 0x000000000001bc70 filesz: 0x0000000000000000
it's not compressed!
Clearing Segment: addr: 0x0000000000300000 memsz: 0x000000000001bc70
Loading segment from ROM address 0xf8380038
  Entry Point 0x00002570
BS: BS_PAYLOAD_LOAD run times (exec / console): 301 / 4 ms
Jumping to boot code at 0x00002570(0xffed9000)
CBFS: Found '1-cpu.dtb' @0x1f700 size 0x24e7 in mcache @0x[   11.316087395,5] OPAL v6.3-rc2-32-gfa060c2c starting...
[   11.316090956,7] initial console log level: memory 7, driver 5
[   11.316092992,6] CPU: P9 generation processor (max 4 threads/core)
[   11.316094803,7] CPU: Boot CPU PIR is 0x0048 PVR is 0x004e1201
[   11.316097339,7] OPAL table: 0x30108230 .. 0x30108810, branch table: 0x30002000
[   11.316100347,7] Assigning physical memory map table for nimbus
[   11.316102736,7] FDT: Parsing fdt @0xf90a55f8
[   11.316800876,6] CHIP: Initialised chip 0 from xscom@603fc00000000
[   11.316824541,6] P9 DD2.10 detected
[   11.316826117,5] CHIP: Chip ID 0000 type: P9N DD2.10
[   11.316828148,7] XSCOM: Base address: 0x603fc00000000
[   11.316836786,7] XSTOP: XSCOM addr = 0x5012000, FIR bit = 31
[   11.316838932,6] MFSI 0:0: Initialized
[   11.316840345,6] MFSI 0:2: Initialized
[   11.316841739,6] MFSI 0:1: Initialized
[   11.316895086,6] LPC: LPC[000]: Initialized
[   11.316896594,7] LPC: access via MMIO @0x6030000000000
[   11.316909317,7] LPC: Default bus on chip 0x0
[   11.316916169,7] CPU: New max PIR set to 0x57
[   11.316946454,6] MEM: parsing reserved memory from node /reserved-memory
[   11.316950682,7] HOMER: Init chip 0
[   11.316952881,7]   PBA BAR0 : 0x00000001ff400000
[   11.316954582,7]   PBA MASK0: 0x0000000000300000
[   11.316956252,7]   HOMER Image at 0x1ff400000 size 4MB
[   11.316959172,7]   PBA BAR2 : 0x00000001ff800000
[   11.316960824,7]   PBA MASK2: 0x0000000000700000
[   11.316962423,7]   OCC Common Area at 0x1ff800000 size 8MB
[   11.316964798,7] CPU: decrementer bits 56
[   11.316969431,6] CPU: CPU from DT PIR=0x0048 Server#=0x48 State=3
[   11.316974784,6] CPU:  4 secondary threads
[   11.316977861,6] CPU: CPU from DT PIR=0x004c Server#=0x4c State=3
[   11.316983347,6] CPU:  4 secondary threads
[   11.316986412,6] CPU: CPU from DT PIR=0x0050 Server#=0x50 State=3
[   11.316991947,6] CPU:  4 secondary threads
[   11.316994932,6] CPU: CPU from DT PIR=0x0054 Server#=0x54 State=3
[   11.317000364,6] CPU:  4 secondary threads
[   11.317047524,6] PLAT: AST SIO unavailable!
[   11.317059862,7] UART: Using LPC IRQ 4
[   11.319105444,5] PLAT: Detected Talos platform
[   11.319151454,5] PLAT: Detected BMC platform ast2500:openbmc
[   11.334124617,5] CPU: All 16 processors called in...
[   11.427019479,7] LPC: Routing irq 10, policy: 0 (r=1)
[   11.427020555,7] LPC: SerIRQ 10 using route 0 targetted at OPAL
[   12.432331912,5] HIOMAP: Negotiated hiomap protocol v2
[   12.432387547,5] HIOMAP: Block size is 4KiB
[   12.432424576,5] HIOMAP: BMC suggested flash timeout of 8s
[   12.432474961,5] HIOMAP: Flash size is 64MiB
[   12.432502644,5] HIOMAP: Erase granule size is 4KiB
[   14.442312421,5] FLASH: Found system flash: (unnamed) id:0
[   15.233039828,5] STB: secure boot not supported
[   15.233096459,5] STB: trusted boot not supported
[   15.233136113,7] LPC: Routing irq 4, policy: 0 (r=1)
[   15.233137052,7] LPC: SerIRQ 4 using route 1 targetted at OPAL
[   15.233262962,5] OCC: All Chip Rdy after 0 ms
[   16.030488466,3] STB: container NOT VERIFIED, resource_id=4 secureboot not yet initialized
[   17.148258312,3] STB: container NOT VERIFIED, resource_id=3 secureboot not yet initialized
[   17.339780076,3] CAPP: Error loading ucode lid. index=201d1
Timeout, server

Something kills sockets including ssh connection and scat's. Not sure how to obtain useful debug reports at this point


Hostboot

Console 1:

root@talos:~# mboxctl --backend vpnor
Failed to post message: Connection timed out
root@talos:~# obmcutil status
CurrentBMCState     : xyz.openbmc_project.State.BMC.BMCState.Ready
CurrentPowerState   : xyz.openbmc_project.State.Chassis.PowerState.Off
CurrentHostState    : xyz.openbmc_project.State.Host.HostState.Off
BootProgress        : xyz.openbmc_project.State.Boot.Progress.ProgressStages.Unspecified
OperatingSystemState: xyz.openbmc_project.State.OperatingSystem.Status.OSStatus.Inactive
root@talos:~# mboxctl --lpc-state
LPC Bus Maps: Flash Device
root@talos:~# obmcutil poweron && obmc-console-client 

--== Welcome to Hostboot hostboot-a2ddbf3/hbicore.bin ==--

  3.07364|secure|SecureROM valid - enabling functionality
  5.45330|Booting from SBE side 0 on master proc=00050000
  5.48989|ISTEP  6. 5 - host_init_fsi
  5.66576|ISTEP  6. 6 - host_set_ipl_parms
  5.97213|ISTEP  6. 7 - host_discover_targets
  8.88173|HWAS|PRESENT> DIMM[03]=8000000000000000
  8.88174|HWAS|PRESENT> Proc[05]=8000000000000000
  8.88175|HWAS|PRESENT> Core[07]=00003C0000000000
  8.96734|ISTEP  6. 8 - host_update_master_tpm
  9.47789|SECURE|Security Access Bit> 0x0000000000000000
  9.47789|SECURE|Secure Mode Disable (via Jumper)> 0x8000000000000000
  9.47804|ISTEP  6. 9 - host_gard
  9.78122|HWAS|Blocking Speculative Deconfig
  9.78466|HWAS|FUNCTIONAL> DIMM[03]=8000000000000000
  9.78467|HWAS|FUNCTIONAL> Proc[05]=8000000000000000
  9.78468|HWAS|FUNCTIONAL> Core[07]=00003C0000000000
  9.78753|ISTEP  6.11 - host_start_occ_xstop_handler
 10.72636|ISTEP  6.12 - host_voltage_config
 10.85024|ISTEP  7. 1 - mss_attr_cleanup
 11.52064|ISTEP  7. 2 - mss_volt
 11.71774|ISTEP  7. 3 - mss_freq
 12.03147|ISTEP  7. 4 - mss_eff_config
 12.71145|ISTEP  7. 5 - mss_attr_update
 12.72369|ISTEP  8. 1 - host_slave_sbe_config
 12.92026|ISTEP  8. 2 - host_setup_sbe
 12.92147|ISTEP  8. 3 - host_cbs_start
 12.92278|ISTEP  8. 4 - proc_check_slave_sbe_seeprom_complete
 12.94392|ISTEP  8. 5 - host_attnlisten_proc
 12.97281|ISTEP  8. 6 - host_p9_fbc_eff_config
 12.97709|ISTEP  8. 7 - host_p9_eff_config_links
 13.02023|ISTEP  8. 8 - proc_attr_update
 13.02148|ISTEP  8. 9 - proc_chiplet_fabric_scominit
 13.04029|ISTEP  8.10 - proc_xbus_scominit
 13.05617|ISTEP  8.11 - proc_xbus_enable_ridi
 13.06652|ISTEP  8.12 - host_set_voltages
 13.08159|ISTEP  9. 1 - fabric_erepair
 13.17225|ISTEP  9. 2 - fabric_io_dccal
 13.17476|ISTEP  9. 3 - fabric_pre_trainadv
 13.18056|ISTEP  9. 4 - fabric_io_run_training
 13.18279|ISTEP  9. 5 - fabric_post_trainadv
 13.18490|ISTEP  9. 6 - proc_smp_link_layer
 13.18884|ISTEP  9. 7 - proc_fab_iovalid
 13.34018|ISTEP  9. 8 - host_fbc_eff_config_aggregate
 13.35707|ISTEP 10. 1 - proc_build_smp
 13.45260|ISTEP 10. 2 - host_slave_sbe_update
 14.42518|ISTEP 10. 4 - proc_cen_ref_clk_enable
 14.45122|ISTEP 10. 5 - proc_enable_osclite
 14.45240|ISTEP 10. 6 - proc_chiplet_scominit
 14.50113|ISTEP 10. 7 - proc_abus_scominit
 14.50253|ISTEP 10. 8 - proc_obus_scominit
 14.50391|ISTEP 10. 9 - proc_npu_scominit
 14.52143|ISTEP 10.10 - proc_pcie_scominit
 14.56398|ISTEP 10.11 - proc_scomoverride_chiplets
 14.56679|ISTEP 10.12 - proc_chiplet_enable_ridi
 14.56919|ISTEP 10.13 - host_rng_bist
 14.57204|ISTEP 10.14 - host_update_redundant_tpm
 14.57507|ISTEP 11. 1 - host_prd_hwreconfig
 14.60615|ISTEP 11. 2 - cen_tp_chiplet_init1
 14.60884|ISTEP 11. 3 - cen_pll_initf
 14.61212|ISTEP 11. 4 - cen_pll_setup
 14.61514|ISTEP 11. 5 - cen_tp_chiplet_init2
 14.61914|ISTEP 11. 6 - cen_tp_arrayinit
 14.62178|ISTEP 11. 7 - cen_tp_chiplet_init3
 14.62445|ISTEP 11. 8 - cen_chiplet_init
 14.62902|ISTEP 11. 9 - cen_arrayinit
 14.63220|ISTEP 11.10 - cen_initf
 14.63619|ISTEP 11.11 - cen_do_manual_inits
 14.63882|ISTEP 11.12 - cen_startclocks
 14.64145|ISTEP 11.13 - cen_scominits
 14.64429|ISTEP 12. 1 - mss_getecid
 14.90357|ISTEP 12. 2 - dmi_attr_update
 14.90622|ISTEP 12. 3 - proc_dmi_scominit
 14.95891|ISTEP 12. 4 - cen_dmi_scominit
 14.96148|ISTEP 12. 5 - dmi_erepair
 15.01996|ISTEP 12. 6 - dmi_io_dccal
 15.02203|ISTEP 12. 7 - dmi_pre_trainadv
 15.02533|ISTEP 12. 8 - dmi_io_run_training
 15.03893|ISTEP 12. 9 - dmi_post_trainadv
 15.04160|ISTEP 12.10 - proc_cen_framelock
 15.04924|ISTEP 12.11 - host_startprd_dmi
 15.05199|ISTEP 12.12 - host_attnlisten_memb
 15.05499|ISTEP 12.13 - cen_set_inband_addr
 15.05800|ISTEP 13. 1 - host_disable_memvolt
 15.20711|ISTEP 13. 2 - mem_pll_reset
 15.23835|ISTEP 13. 3 - mem_pll_initf
 15.26155|ISTEP 13. 4 - mem_pll_setup
 15.28433|ISTEP 13. 6 - mem_startclocks
 15.29444|ISTEP 13. 7 - host_enable_memvolt
 15.30526|ISTEP 13. 8 - mss_scominit
 15.60299|ISTEP 13. 9 - mss_ddr_phy_reset
 15.64624|ISTEP 13.10 - mss_draminit
 15.72113|ISTEP 13.11 - mss_draminit_training
 16.07293|ISTEP 13.12 - mss_draminit_trainadv
 16.11629|ISTEP 13.13 - mss_draminit_mc
 16.14663|ISTEP 14. 1 - mss_memdiag
 18.15379|ISTEP 14. 2 - mss_thermal_init
 18.18424|ISTEP 14. 3 - proc_pcie_config
 18.23177|ISTEP 14. 4 - mss_power_cleanup
 18.23571|ISTEP 14. 5 - proc_setup_bars
 18.26319|ISTEP 14. 6 - proc_htm_setup
 18.27305|ISTEP 14. 7 - proc_exit_cache_contained
 18.34568|ISTEP 15. 1 - host_build_stop_image
 20.11136|================================================
 20.20401|Error reported by fapi2 (0x3300) EID 0x9000067F
 20.20527|  No WOF table match found
 20.20527|  ModuleId   0x10 fapi2::MOD_FAPI2_PLAT_PARSE_WOF_TABLES
 20.20528|  ReasonCode 0x332d fapi2::RC_WOF_TABLE_NOT_FOUND
 20.20529|  UserData1  Number of cores : 0x0004000200000064
 20.20530|  UserData2  WOF Power Mode (1=Nominal, 2=Turbo) : 0x00000bb800000012
 20.20531|------------------------------------------------
 20.20531|  Callout type             : Procedure Callout
 20.20532|  Procedure                : EPUB_PRC_HB_CODE
 20.20533|  Priority                 : SRCI_PRIORITY_HIGH
 20.20533|------------------------------------------------
 20.20534|  Callout type             : Hardware Callout
 20.20535|  Target                   : Physical:/Sys0/Node0/Proc0
 20.20536|  Deconfig State           : NO_DECONFIG
 20.20537|  GARD Error Type          : GARD_NULL
 20.20537|  Priority                 : SRCI_PRIORITY_MED
 20.20538|------------------------------------------------
 20.20539|  Hostboot Build ID: hostboot-a2ddbf3/hbicore.bin
 20.20539|================================================
 20.43398|ISTEP 15. 2 - proc_set_pba_homer_bar
 20.51313|ISTEP 15. 3 - host_establish_ex_chiplet
 20.53945|ISTEP 15. 4 - host_start_stop_engine
 20.58165|ISTEP 16. 1 - host_activate_master
 21.76023|ISTEP 16. 2 - host_activate_slave_cores
 21.85551|ISTEP 16. 3 - host_secure_rng
 21.82968|ISTEP 16. 4 - mss_scrub
 21.89671|ISTEP 16. 5 - host_load_io_ppe
 21.90297|ISTEP 16. 6 - host_ipl_complete
 22.31002|ISTEP 18.11 - proc_tod_setup
 22.48599|ISTEP 18.12 - proc_tod_init
 22.48042|ISTEP 20. 1 - host_load_payload
 23.13836|ISTEP 20. 2 - host_load_hdat
 23.93272|ISTEP 21. 1 - host_runtime_setup
 25.85049|================================================
 25.85050|Error reported by fapi2 (0x3300) EID 0x900006AE
 25.85051|  No WOF table match found
 25.85051|  ModuleId   0x10 fapi2::MOD_FAPI2_PLAT_PARSE_WOF_TABLES
 25.85052|  ReasonCode 0x332d fapi2::RC_WOF_TABLE_NOT_FOUND
 25.85053|  UserData1  Number of cores : 0x0004000200000064
 25.85054|  UserData2  WOF Power Mode (1=Nominal, 2=Turbo) : 0x00000bb800000012
 25.85055|------------------------------------------------
 25.85056|  Callout type             : Procedure Callout
 25.85057|  Procedure                : EPUB_PRC_HB_CODE
 25.85057|  Priority                 : SRCI_PRIORITY_HIGH
 25.85058|------------------------------------------------
 25.85059|  Callout type             : Hardware Callout
 25.85060|  Target                   : Physical:/Sys0/Node0/Proc0
 25.85061|  Deconfig State           : NO_DECONFIG
 25.85062|  GARD Error Type          : GARD_NULL
 25.85062|  Priority                 : SRCI_PRIORITY_MED
 25.85063|------------------------------------------------
 25.85064|  Hostboot Build ID: hostboot-a2ddbf3/hbicore.bin
 25.85065|================================================
 29.33381|htmgt|OCCs are now running in ACTIVE state
 33.53957|ISTEP 21. 2 - host_verify_hdat
 33.57305|ISTEP 21. 3 - host_start_payload
[   34.314115459,5] OPAL skiboot-9858186 starting...
[   34.314118548,7] initial console log level: memory 7, driver 5
[   34.314120607,6] CPU: P9 generation processor (max 4 threads/core)
[   34.314122431,7] CPU: Boot CPU PIR is 0x004c PVR is 0x004e1201
[   34.314125156,7] OPAL table: 0x30103830 .. 0x30103e10, branch table: 0x30002000
[   34.314128287,7] Assigning physical memory map table for nimbus
[   34.314130825,7] Parsing HDAT...
[   34.314132060,7] SPIRA-S found.
[   34.314134429,6] BMC #0: HW version 3, SW version 2, chip DD1.0
[   34.314207313,6] SP Family is ibm,ast2500,openbmc
[   34.314213316,7] LPC: IOPATH chip id = 0
[   34.314214657,7] LPC: FW BAR       = f0000000
[   34.314216138,7] LPC: MEM BAR      = e0000000
[   34.314217618,7] LPC: IO BAR       = d0010000
[   34.314219073,7] LPC: Internal BAR = c0012000
[   34.314231626,7] LPC UART: base addr = 3f8 (3f8) size = 1 clk = 1843200, baud = 115200
[   34.314234283,7] LPC: BT [0, 0] sms_int: 0, bmc_int: 0
[   34.315419356,5] HDAT I2C: found e3p1 - unknown@1c dp:ff (ff:)
[   34.315507577,5] HDAT I2C: found e3p1 - unknown@1d dp:ff (ff:)
[   34.315551263,5] HDAT I2C: found e3p0 - unknown@19 dp:ff (ff:)
[   34.315590873,5] HDAT I2C: found e3p1 - unknown@1e dp:ff (ff:)
[   34.315633018,5] HDAT I2C: found e3p0 - unknown@1b dp:ff (ff:)
[   34.315684512,5] HDAT I2C: found e3p1 - unknown@1f dp:ff (ff:)
[   34.315740140,5] HDAT I2C: found e3p0 - unknown@1a dp:ff (ff:)
[   34.315789325,5] HDAT I2C: found e3p0 - unknown@18 dp:ff (ff:)
[   34.316498545,5] CHIP: Chip ID 0000 type: P9N DD2.10
[   34.316859747,5] PLAT: Detected Talos platform
[   34.316920122,5] PLAT: Detected BMC platform ast2500:openbmc
[   34.333688982,5] CPU: All 16 processors called in...
[   34.405637741,7] LPC: Routing irq 10, policy: 0 (r=1)
[   34.405638837,7] LPC: SerIRQ 10 using route 0 targetted at OPAL
[   36.001359755,5] HIOMAP: Negotiated hiomap protocol v2
[   36.001427914,5] HIOMAP: Block size is 64KiB
[   36.001462085,5] HIOMAP: BMC suggested flash timeout of 8s
[   36.001501325,5] HIOMAP: Flash size is 64MiB
[   36.001535504,5] HIOMAP: Erase granule size is 64KiB
[   38.113777347,5] FLASH: Found system flash: (unnamed) id:0
[   38.416563228,7] LPC: Routing irq 4, policy: 0 (r=1)
[   38.416564269,7] LPC: SerIRQ 4 using route 1 targetted at OPAL
[   38.416725892,5] OCC: All Chip Rdy after 0 ms
[   39.213932482,3] STB: VERSION verification FAILED. log=0xffffffffffff8160
[   40.331682088,3] STB: IMA_CATALOG verification FAILED. log=0xffffffffffff8160
[   41.011215283,3] CAPP: Error loading ucode lid. index=201d1
Timeout, server 

The whole reason i'm trying to get logs is to understand what happens around those times.

Then trying to poweroff hostboot requires something like the following, since reliable connection is impossible. Something is off either in configuration of Talos II (configured in Petitboot to obtain IP from DHCP which is obtained through static leases from router, which should not be a problem). So my only solution as of now is to combat this race condition to force power off of machine: while true; do ssh -o "ServerAliveInterval 5" root@talos.onion -p 2300 "obmcutil poweroff" && echo "poweroff was successfull at $(date)" && break;done

Ideally, logs would be kept on BMC, but neither hostboot or coreboot are able to get the logs from bmc perspective. I really do not understand how you behavior is different. Will be in same room of my test platform next week, but would love to have advice that would help any other tester: how to get usefull logs? Same behavior on my side on local lan connection.

Seems to be BMC fault killing sockets (empiric observation). But not confirmed on your side. Which is weird to me.