[ 2687.794321] opal: Hardware platform error: Unrecoverable HMI exception

open-power / skiboot

OPAL boot and runtime firmware for POWER

Apache License 2.0

99 stars 134 forks source link

By running TOD error recovery stress test, hitting a platform error followed by a system reboot.

[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:12 T:1: TFMR(2a12000980a84000) Timer Facility Error
[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:12 T:2: TFMR(2a12000980a84000) Timer Facility Error
[ 2853.104254431,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000
[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:22 T:1: TFMR(2a12000980a94000) Timer Facility Error
[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:7 T:0: TFMR(2a12000980a85000) Timer Facility Error
[ 2853.104254431,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000
[ 2853.104254431,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000
[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:15 T:1: TFMR(2a12000980a94000) Timer Facility Error
[ 2853.104254431,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000
[ 2853.104254431,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000
[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:23 T:3: TFMR(2a12000980a84000) Timer Facility Error
[ 2853.104254431,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000
[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:15 T:2: TFMR(2a12000980a94000) Timer Facility Error
[ 2853.104254431,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000
[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:23 T:2: TFMR(2a12000980a84000) Timer Facility Error
[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:23 T:1: TFMR(2a12000980a85000) Timer Facility Error
[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:23 T:0: TFMR(2a12000980a84000) Timer Facility Error
[ 2853.104254431,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000
[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:20 T:0: TFMR(2a12000980a85000) Timer Facility Error
[    1.039373791,3] CHIPTOD: TB "Not Set" TOD in error state
[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:13 T:2: TFMR(2a12000980a84000) Timer Facility Error
[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:20 T:3: TFMR(2a12000980a85000) Timer Facility Error
[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:20 T:1: TFMR(2a12000980a84000) Timer Facility Error
[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:20 T:2: TF2000980a84000) Timer Facility Error
[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:13 T:1: TFMR(2a12000980a84000) Timer Facility Error
[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:14 T:2: TFMR(2a12000980a84000) Timer Facility Error
[ 2853.104254431,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000
[    1.039373791,4] HMI: Failed to get TB in running state! CPU=85f, TFMR=2a12800980a04000
[ 2853.104254431,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000
[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:14 T:3: TFMR(2a12000980a84000) Timer Facility Error
[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:14 T:0: TFMR(2a12000980a84000) Timer Facility Error
[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:14 T:1: TFMR(2a12000980a84000) Timer Facility Error
[    1.039373791,4] HMI: Failed to get TB in running state! CPU=85d, TFMR=2a12800980a04000
[    1.039373791,4] HMI: Failed to get TB in running state! CPU=85c, TFMR=2a12800980a04000
[    1.039373791,4] HMI: Failed to get TB in running state! CPU=85e, TFMR=2a12800980a04000
[ 2853.104254431,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000
[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:6 T:1: TFMR(2a12000980a84000) Timer Facility Error
[ 2853.104254431,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000
[ 2853.104254431,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000
[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:6 T:0: TFMR(2a12000980a84000) Timer Facility Error
[ 2853.104254431,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000
[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:6 T:2: TFMR(2a12000980a84000) Timer Facility Error
[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:7 T:2: TFMR(2a12000980a84000) Timer Facility Error
[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:7 T:1: TFMR(2a12000980a84000) Timer Facility Error
[ 2853.104254431,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000
[ 2853.104254431,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc1]: P:8 C:7 T:3: TFMR(2a12000980a84000) Timer Facility Error
[ 2687.794321] opal: Hardware platform error: Unrecoverable HMI exception
[ 2688.909240] WARNING: CPU: 118 PID: 904 at /build/linux-GKZ1fU/linux-4.15.0/kernel/sched/core.c:1189 set_task_cpu+0x240/0x250
[ 2688.909365] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables devlink iptable_filter input_leds joydev idt_89hpesx mac_hid ofpart cmdlinepart powernv_flash at24 ipmi_powernv ipmi_devintf uio_pdrv_genirq ipmi_msghandler mtd uio opal_prd vmx_crypto ibmpowernv kvm_hv kvm sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ses enclosure ast i2c_algo_bit hid_generic
[ 2688.910069]  ttm mpt3sas drm_kms_helper syscopyarea sysfillrect sysimgblt usbhid fb_sys_fops nvme hid crct10dif_vpmsum raid_class crc32c_vpmsum drm nvme_core i40e aacraid scsi_transport_sas
[ 2688.910249] CPU: 118 PID: 904 Comm: kworker/118:1 Not tainted 4.15.0-18-generic #19-Ubuntu
[ 2688.910327] Workqueue: events hmi_event_handler
[ 2688.910377] NIP:  c00000000014d6e0 LR: c00000000014e30c CTR: c00000000015a240
[ 2688.910450] REGS: c000203974b26f50 TRAP: 0700   Not tainted  (4.15.0-18-generic)
[ 2688.910536] MSR:  900000000282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 28008444  XER: 00000000
[ 2688.910634] CFAR: c00000000014d54c SOFTE: 0 
[ 2688.910634] GPR00: c00000000014e30c c000203974b271d0 c0000000016eae00 c000003e982d5900 
[ 2688.910634] GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
[ 2688.910634] GPR08: c000000001721ee0 0000000000000000 0000000000000000 0000000000005d04 
[ 2688.910634] GPR12: 0000000028008244 c00000000fad1200 c00000000013c788 c0002039759510c0 
[ 2688.910634] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000203993860000 
[ 2688.910634] GPR20: c000203994a235c0 0000000000000000 0000000000000000 c000203974b27350 
[ 2688.910634] GPR24: c000003e982d5d28 c00000000171dd78 c0000000011d8580 0000000000000000 
[ 2688.910634] GPR28: 0000000000000004 0000000000000000 0000000000000000 c000003e982d5900 
[ 2688.911257] NIP [c00000000014d6e0] set_task_cpu+0x240/0x250
[ 2688.911307] LR [c00000000014e30c] try_to_wake_up+0x1bc/0x660
[ 2688.911367] Call Trace:
[ 2688.911397] [c000203974b271d0] [c0000000011d8580] runqueues+0x0/0xc00 (unreliable)
[ 2688.911472] [c000203974b27210] [c00000000014e30c] try_to_wake_up+0x1bc/0x660
[ 2688.911548] [c000203974b27290] [c0000000001725d8] autoremove_wake_function+0x28/0x70
[ 2688.911622] [c000203974b272c0] [c000000000171b60] __wake_up_common+0xd0/0x200
[ 2688.911697] [c000203974b27330] [c000000000171d4c] __wake_up_common_lock+0xbc/0x110
[ 2688.911771] [c000203974b273c0] [c00000000018ea40] wake_up_klogd_work_func+0x60/0xc0
[ 2688.911846] [c000203974b273f0] [c000000000295d10] irq_work_run_list+0xb0/0x100
[ 2688.911937] [c000203974b27430] [c0000000001b7220] update_process_times+0x60/0x90
[ 2688.912011] [c000203974b27460] [c0000000001cef54] tick_sched_handle.isra.5+0x34/0xd0
[ 2688.912084] [c000203974b27490] [c0000000001cf050] tick_sched_timer+0x60/0xe0
[ 2688.912158] [c000203974b274d0] [c0000000001b7db4] __hrtimer_run_queues+0x144/0x370
[ 2688.912232] [c000203974b27550] [c0000000001b8d0c] hrtimer_interrupt+0xfc/0x350
[ 2688.912307] [c000203974b27620] [c0000000000248f0] __timer_interrupt+0x90/0x260
[ 2688.912382] [c000203974b27670] [c000000000024d08] timer_interrupt+0x98/0xe0
[ 2688.912446] [c000203974b276a0] [c000000000009014] decrementer_common+0x114/0x120
[ 2688.912522] --- interrupt: 901 at replay_interrupt_return+0x0/0x4
[ 2688.912522]     LR = arch_local_irq_restore+0x74/0x90
[ 2688.912619] [c000203974b27990] [c000203974b279d0] 0xc000203974b279d0 (unreliable)
[ 2688.912695] [c000203974b279b0] [c000000000cff180] _raw_spin_unlock_irqrestore+0x40/0xa0
[ 2688.912770] [c000203974b279d0] [c00000000057d5fc] pstore_dump+0x31c/0x3e0
[ 2688.918818] [c000203974b27b10] [c00000000018ec64] kmsg_dump+0x134/0x1a0
[ 2688.925755] [c000203974b27b70] [c0000000000a2f14] pnv_platform_error_reboot+0x94/0x110
[ 2688.934081] [c000203974b27be0] [c0000000000a842c] hmi_event_handler+0x1bc/0x1c0
[ 2688.941023] [c000203974b27c90] [c000000000133958] process_one_work+0x298/0x5a0
[ 2688.947953] [c000203974b27d20] [c000000000133cf8] worker_thread+0x98/0x630
[ 2688.954876] [c000203974b27dc0] [c00000000013c928] kthread+0x1a8/0x1b0
[ 2688.961814] [c000203974b27e30] [c00000000000b528] ret_from_kernel_thread+0x5c/0xb4
[ 2688.968751] Instruction dump:
[ 2688.971525] 7faa3670 7d4a0194 57a706be 7d4a07b4 794a1f24 7d28502a 7d293c36 71290001 
[ 2688.979834] 4082fe80 60000000 60000000 60420000 <0fe00000> 4bfffe6c 60000000 60420000 
[ 2688.988160] ---[ end trace c3fbb130bac0f441 ]---
[ 2855.374688141,0] OPAL: Reboot requested due to Platform error.
[ 2855.377232501,3] OPAL: Reboot requested due to Platform error.[ 2855.380086941,5] Software initiated checkstop disabled.
[ 2689.3640[ 2855.383112058,5] OPAL: Reboot request...
36] opal: Reboot type 1 not supported for Unrecoverable HMI exception

--== Welcome to Hostboot  ==--

  2.77291|secure|SecureROM valid - enabling functionality
  5.37968|Ignoring boot flags, incorrect version 0x0
  5.38625|Booting from SBE side 0 on master proc=00050000
  5.58026|ISTEP  6. 5 - host_init_fsi
  5.68847|ISTEP  6. 6 - host_set_ipl_parms
  5.94385|ISTEP  6. 7 - host_discover_targets
  7.74862|HWAS|PRESENT> DIMM[03]=F0F0000000000000
  7.74863|HWAS|PRESENT> Proc[05]=8800000000000000
  7.74864|HWAS|PRESENT> Core[07]=03FFCFCFCF0F0000
  7.94847|ISTEP  6. 8 - host_update_master_tpm
 15.48415|SECURE|Security Access Bit> 0xC000000000000000

OPAL level:

[   70.104446560,5] OPAL skiboot-v5.11-70-g5307c0ec7899-pc34e21f starting...

[console-pexpect]# [console-pexpect]#[ 1.374114245,3] CHIPTOD: TB "Not Set" TOD in error state [18014398511.108890053,4] HMI: Failed to get TB in running state! CPU=41, TFMR=2a12800980a44000 [18014398511.108890053,4] HMI: Failed to get TB in running state! CPU=43, TFMR=2a12800980a44000 [18014398511.108890053,4] HMI: Failed to get TB in running state! CPU=40, TFMR=2a12800980a44000 [18014398511.108890053,4] HMI: Failed to get TB in running state! CPU=42, TFMR=2a12800980a44000 [66901.841558] opal: Hardware platform error: Unrecoverable HMI exception [66903.108438] WARNING: CPU: 42 PID: 826 at /build/linux-GKZ1fU/linux-4.15.0/kernel/sched/core.c:1189 set_task_cpu+0x240/0x250 [66903.108541] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables ip6table_filter devlink ip6_tables iptable_filter joydev input_leds mac_hid ofpart idt_89hpesx cmdlinepart powernv_flash mtd ipmi_powernv ipmi_devintf vmx_crypto at24 uio_pdrv_genirq uio ipmi_msghandler ibmpowernv opal_prd kvm_hv kvm sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ses enclosure ast i2c_algo_bit ttm hid_generic [66903.109196] drm_kms_helper mpt3sas syscopyarea sysfillrect sysimgblt fb_sys_fops nvme usbhid crct10dif_vpmsum raid_class hid crc32c_vpmsum drm nvme_core i40e aacraid scsi_transport_sas [66903.109352] CPU: 42 PID: 826 Comm: kworker/42:1 Not tainted 4.15.0-18-generic #19-Ubuntu [66903.109424] Workqueue: events hmi_event_handler [66903.109471] NIP: c00000000014d6e0 LR: c00000000014e30c CTR: c00000000015a240 [66903.109538] REGS: c0000000fe66af50 TRAP: 0700 Not tainted (4.15.0-18-generic) [66903.109603] MSR: 9000000002823033 <SF,HV,VEC,VSX,FP,ME,IR,DR,RI,LE> CR: 28008424 XER: 00000000 [66903.109707] CFAR: c00000000014d54c SOFTE: 0 [66903.109707] GPR00: c00000000014e30c c0000000fe66b1d0 c0000000016eae00 c00020395bcac900 [66903.109707] GPR04: 0000000000000040 0000000000000040 0000000000000000 0000000000000000 [66903.109707] GPR08: c000000001721ee0 0000000000000000 0000000000000008 00000000000748d0 [66903.109707] GPR12: 0000000028008224 c00000000fa9ce00 c00000000013c788 c000003fe0b0ca80 [66903.109707] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000003fee840000 [66903.109707] GPR20: c000003fefa035c0 0000000000000000 0000000000000000 c0000000fe66b350 [66903.109707] GPR24: c00020395bcacd28 c00000000171dd78 c0000000011d8580 0000000000000000 [66903.109707] GPR28: 0000000000000004 0000000000000040 0000000000000040 c00020395bcac900 [66903.110286] NIP [c00000000014d6e0] set_task_cpu+0x240/0x250 [66903.110332] LR [c00000000014e30c] try_to_wake_up+0x1bc/0x660 [66903.110387] Call Trace: [66903.110415] [c0000000fe66b1d0] [c0000000011d8580] runqueues+0x0/0xc00 (unreliable) [66903.110484] [c0000000fe66b210] [c00000000014e30c] try_to_wake_up+0x1bc/0x660 [66903.110554] [c0000000fe66b290] [c0000000001725d8] autoremove_wake_function+0x28/0x70 [66903.110622] [c0000000fe66b2c0] [c000000000171b60] __wake_up_common+0xd0/0x200 [66903.110691] [c0000000fe66b330] [c000000000171d4c] __wake_up_common_lock+0xbc/0x110 [66903.110760] [c0000000fe66b3c0] [c00000000018ea40] wake_up_klogd_work_func+0x60/0xc0 [66903.110830] [c0000000fe66b3f0] [c000000000295d10] irq_work_run_list+0xb0/0x100 [66903.110901] [c0000000fe66b430] [c0000000001b7220] update_process_times+0x60/0x90 [66903.110970] [c0000000fe66b460] [c0000000001cef54] tick_sched_handle.isra.5+0x34/0xd0 [66903.111037] [c0000000fe66b490] [c0000000001cf050] tick_sched_timer+0x60/0xe0 [66903.111120] [c0000000fe66b4d0] [c0000000001b7db4] __hrtimer_run_queues+0x144/0x370 [66903.111188] [c0000000fe66b550] [c0000000001b8d0c] hrtimer_interrupt+0xfc/0x350 [66903.111258] [c0000000fe66b620] [c0000000000248f0] __timer_interrupt+0x90/0x260 [66903.111327] [c0000000fe66b670] [c000000000024d08] timer_interrupt+0x98/0xe0 [66903.111387] [c0000000fe66b6a0] [c000000000009014] decrementer_common+0x114/0x120 [66903.111460] --- interrupt: 901 at replay_interrupt_return+0x0/0x4 [66903.111460] LR = arch_local_irq_restore+0x74/0x90 [66903.111551] [c0000000fe66b990] [c0000000fe66b9d0] 0xc0000000fe66b9d0 (unreliable) [66903.111622] [c0000000fe66b9b0] [c000000000cff180] _raw_spin_unlock_irqrestore+0x40/0xa0 [66903.111691] [c0000000fe66b9d0] [c00000000057d5fc] pstore_dump+0x31c/0x3e0 [66903.118016] [c0000000fe66bb10] [c00000000018ec64] kmsg_dump+0x134/0x1a0 [66903.124952] [c0000000fe66bb70] [c0000000000a2f14] pnv_platform_error_reboot+0x94/0x110 [66903.131898] [c0000000fe66bbe0] [c0000000000a842c] hmi_event_handler+0x1bc/0x1c0 [66903.140209] [c0000000fe66bc90] [c000000000133958] process_one_work+0x298/0x5a0 [66903.147152] [c0000000fe66bd20] [c000000000133cf8] worker_thread+0x98/0x630 [66903.154074] [c0000000fe66bdc0] [c00000000013c928] kthread+0x1a8/0x1b0 [66903.159636] [c0000000fe66be30] [c00000000000b528] ret_from_kernel_thread+0x5c/0xb4 [66903.167942] Instruction dump: [66903.170729] 7faa3670 7d4a0194 57a706be 7d4a07b4 794a1f24 7d28502a 7d293c36 71290001 [66903.177660] 4082fe80 60000000 60000000 60420000 <0fe00000> 4bfffe6c 60000000 60420000 [66903.185965] ---[ end trace 739c2a2d69ccdc2b ]--- [67070.201983351,0] OPAL: Reboot requested due to Platform error. [67070.205232792,3] OPAL: Reboot requested due to Platform error.[67070.208087613,5] Software initiated checkstop disabled. [6[67070.210438942,5] OPAL: Reboot request... 6903.561762] opal: Reboot type 1 not supported for Unrecoverable HMI exception

[ 90.410845] Severe Hypervisor Maintenance interrupt [Recovered] [ 90.416395] ErroLOCK ERROR: Unlocked non-owned lock @0x30303698 (state: 0x0000004000000001) [18014398766.211691469,7] HMI: Received HMI interrupt: HMER = 0x0840000000000000 [18014398766.211691469,7] HMI: [Loc: UOPWR.1302LDA-Node0-Proc0]: P:0 C:16 T:3: TFMR(2a12000980a44000) Timer Facility Error [ 256.481751932,3] OPAL exiting with locks held, token=145 retval=0 [ 256.481756229,3] core/lock.c:216 [ 256.481758491,3] core/lock.c:216 [18014398766.211691469,0] Aborting! r detail: TiCPU 0040 Backtrace: S: 0000000031e07910 R: 000000003001a4f8 E ._abort+0x4c S: 0000000031e07990 R: 0000000030017e88 E .lock_error+0x64 S: 0000000031e07a10 R: 0000000030017988 E .unlock+0x60 S: 0000000031e07a80 R: 0000000030017b40 E .lock_caller+0xf4 S: 0000000031e07b30 R: 0000000030039b60 E .__uart_do_poll+0x40 S: 0000000031e07c20 R: 000000003001ba70 E .opal_run_pollers+0x168 S: 0000000031e07ca0 R: 000000003001bafc E .opal_poll_events+0x74 S: 0000000031e07d20 R: 00000000300051e4 E opal_entry+0x134 --- OPAL call token: 0xa caller R1: 0xc000003fe6a77a30 --- mer facility experienced an error [ 90.422007] HMER: 0840000000000000 [ 90.426216] TFMR: 2a12000980a44000 [ 90.428871] Severe Hypervisor Maintenance interrupt [Recovered] [ 90.435800] Error detail: Timer facility experienced an error [ 90.441469] HMER: 0840000000000000 [ 90.444120] TFMR: 2a12000980a54000 [ 90.448281] Severe Hypervisor Maintenance interrupt [Recovered] [ 90.453833] Error detail: Timer facility experienced an error [ 90.459507] HMER: 0840000000000000 [ 90.463537] TFMR: 2a12000980a54000 [ 90.466319] Severe Hypervisor Maintenance interrupt [Recovered] [ 90.473242] Error detail: Timer facility experienced an error [ 90.478941] HMER: 0840000000000000 [ 90.481570] TFMR: 2a12000980a44000 [ 90.485722] Severe Hypervisor Maintenance interrupt [Recovered] [ 90.491279] Error detail: Timer facility experienced an error [ 90.496955] HMER: 0840000000000000 [ 9[ 256.476915661,0] Assert fail: core/mem_region.c:444:lock_held_by_me(&region->free_list_lock) 0.500979] TFMR: 2a12000980a44000 [ 90.503752] Severe Hypervisor Maintenance interrupt [Recovered] [ 90.509306] Error detail: Timer facility experienced an error [ 90.516383] HMER: 0840000000000000 [ 90.519016] TFMR: 2a12000980a44000 [ 90.523163] Severe Hypervisor Maintenance interrupt [Recovered] [ 90.528719] Error detail: Timer facility experienced an error [ 90.534420] HMER: 0840000000000000 [ 90.538418] TFMR: 2a12000980a54000 [ 90.541197] Severe Hypervisor Maintenance interrupt [Recovered] [ 90.546747] Error detail: Timer facility experienced an error [ 90.553675] HMER: 0840000000000000 [ 90.556448] TFMR: 2a12000980a44000 [ 90.560606] Severe Hypervisor Maintenance interrupt [Recovered] [ 90.566279] Error detail: Timer facility experienced an error [ 90.571708] HMER: 0840000000000000 [ 90.574484] TFMR: 2a12000980a54000 [ 90.578635] Severe Hypervisor Maintenance interrupt [Recovered] [ 90.584245] Error detail: Timer facility experienced an error [ 90.589738] HMER: 0840000000000000 [ 90.593888] TFMR: 2a12000980a54000 [ 90.596664] Severe Hypervisor Maintenance interrupt [Recovered] [ 90.603734] Error detail: Timer facility experienced an error [ 90.609148] HMER: 0840000000000000 [ 90.611916] TFMR: 2a12000980a44000 [ 90.616103] Severe Hypervisor Maintenance interrupt [Recovered] [ 90.621632] Error detail: Timer facility experienced an error [ 90.627178] HMER: 0840000000000000 [ 90.631331] TFMR: 2a12000980a44000 [ 90.634145] Severe Hypervisor Maintenance interrupt [Recovered] [ 90.641035] Error detail: Timer facility experienced an error [ 90.646593] HMER: 0840000000000000 [ 90.649362] TFMR: 2a12000980a54000 [ 90.653689] Severe Hypervisor Maintenance interrupt [Recovered] [ 90.659068] Error detail: Timer facility experienced an error [ 90.664621] HMER: 0840000000000000 [ 90.668769] TFMR: 2a12000980a44000 [ 90.671584] Severe Hypervisor Maintenance interrupt [Recovered] [ 90.677099] Error detail: Timer facility experienced an error [ 90.684024] HMER: 0840000000000000 [ 90.686806] TFMR: 2a12000980a44000 [ 90.691016] Severe Hypervisor Maintenance interrupt [Recovered] [ 90.696514] Error detail: Timer facility experienced an error [ 90.702060] HMER: 0840000000000000 [ 90.706207] TFMR: 2a12000980a54000 [ 90.708994] Severe Hypervisor Maintenance interrupt [Recovered] [ 90.714674] Error detail: Timer facility experienced an error [ 90.720088] HMER: 0840000000000000 [ 90.724239] TFMR: 2a12000980a44000 [ 90.727020] Severe Hypervisor Maintenance interrupt [Recovered] [ 90.733942] Error detail: Timer facility experienced an error [ 90.739503] HMER: 0840000000000000 [ 90.742275] TFMR: 2a12000980a44000 [ 90.746464] Severe Hypervisor Maintenance interrupt [Recovered] [ 90.751982] Error detail: Timer facility experienced an error [ 90.757531] HMER: 0840000000000000 [ 90.761680] TFMR: 2a12000980a44000 [ 90.764552] Severe Hypervisor Maintenance interrupt [Recovered] [ 90.771385] [ 256.476915661,3] HMI: Rendez-vous stage 1 timeout, CPU 0x43 waiting for thread 0 (sptr=00001ccc) [18014398766.211691469,3] HMI: Rendez-vous stage 1 timeout, CPU 0x43 waiting for thread 1 (sptr=00001ccc) [ 256.476915661,3] HMI: Rendez-vous stage 1 timeout, CPU 0x43 waiting for thread 2 (sptr=00001ccc) [18014398514.386672589,3] HMI: Rendez-vous stage 1 timeout, CPU 0x43 waiting for thread 0 (sptr=00002ccc) [18014398514.386672589,3] HMI: Rendez-vous stage 1 timeout, CPU 0x43 waiting for thread 1 (sptr=00002ccc) [ 5.139896781,3] HMI: Rendez-vous stage 1 timeout, CPU 0x43 waiting for thread 2 (sptr=00002ccc) [ 5.139896781,3] HMI: Rendez-vous stage 1 timeout, CPU 0x43 waiting for thread 0 (sptr=00003ccc) [ 5.139896781,3] HMI: Rendez-vous stage 1 timeout, CPU 0x43 waiting for thread 1 (sptr=00003ccc) [18014398514.386672589,3] HMI: Rendez-vous stage 1 timeout, CPU 0x43 waiting for thread 2 (sptr=00003ccc) [ 5.139896781,4] HMI: Failed to get TB in running state! CPU=43, TFMR=2a12000980a44000

open-power / skiboot

[ 2687.794321] opal: Hardware platform error: Unrecoverable HMI exception #174