Open srcshelton opened 2 months ago
Further, on every reboot there's a ~20 second delay whilst starting the kernel:
[ 1.917445][ T1] pci 0000:00:02.2: PCI bridge to [bus 01]
[ 1.923128][ T1] pci 0000:00:02.2: bridge window [io 0x1000-0x1fff]
[ 1.929928][ T1] pci 0000:00:02.2: bridge window [mem 0xd0000000-0xd00fffff]
[ 1.937430][ T1] pci 0000:00:02.3: PCI bridge to [bus 02]
[ 1.943107][ T1] pci 0000:00:02.3: bridge window [io 0x2000-0x2fff]
[ 1.949911][ T1] pci 0000:00:02.3: bridge window [mem 0xd0100000-0xd01fffff]
[ 1.957412][ T1] pci 0000:00:02.4: PCI bridge to [bus 03]
[ 1.963082][ T1] pci 0000:00:02.4: bridge window [io 0x3000-0x3fff]
[ 1.969888][ T1] pci 0000:00:02.4: bridge window [mem 0xd0200000-0xd02fffff]
[ 1.977394][ T1] pci_bus 0000:00: resource 4 [io 0x0000-0x0cf7 window]
[ 1.984277][ T1] pci_bus 0000:00: resource 5 [io 0x0d00-0xffff window]
[ 1.991168][ T1] pci_bus 0000:00: resource 6 [mem 0x000a0000-0x000dffff]
[ 1.998143][ T1] pci_bus 0000:00: resource 7 [mem 0xd0000000-0xffffffff]
[ 2.005119][ T1] pci_bus 0000:01: resource 0 [io 0x1000-0x1fff]
[ 2.011394][ T1] pci_bus 0000:01: resource 1 [mem 0xd0000000-0xd00fffff]
[ 2.018371][ T1] pci_bus 0000:02: resource 0 [io 0x2000-0x2fff]
[ 2.024653][ T1] pci_bus 0000:02: resource 1 [mem 0xd0100000-0xd01fffff]
[ 2.031629][ T1] pci_bus 0000:03: resource 0 [io 0x3000-0x3fff]
[ 2.037904][ T1] pci_bus 0000:03: resource 1 [mem 0xd0200000-0xd02fffff]
[ 2.045962][ T1] pci 0000:00:13.0: PME# does not work under D3, disabling it
[ 2.053392][ T1] PCI: CLS 64 bytes, default 64
[ 2.058159][ T1] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[ 2.065934][ T1] pci 0000:00:02.0: Adding to iommu group 0
[ 2.071747][ T1] pci 0000:00:02.2: Adding to iommu group 1
[ 2.077573][ T1] pci 0000:00:02.3: Adding to iommu group 2
[ 2.083384][ T1] pci 0000:00:02.4: Adding to iommu group 3
[ 2.089183][ T1] pci 0000:00:08.0: Adding to iommu group 4
[ 2.094982][ T1] pci 0000:00:10.0: Adding to iommu group 5
[ 2.100788][ T1] pci 0000:00:11.0: Adding to iommu group 6
[ 2.106595][ T1] pci 0000:00:13.0: Adding to iommu group 7
[ 2.112425][ T1] pci 0000:00:14.0: Adding to iommu group 8
[ 2.118221][ T1] pci 0000:00:14.3: Adding to iommu group 8
[ 2.124017][ T1] pci 0000:00:14.7: Adding to iommu group 8
[ 2.129889][ T1] pci 0000:00:18.0: Adding to iommu group 9
[ 2.135711][ T1] pci 0000:00:18.1: Adding to iommu group 9
[ 2.141507][ T1] pci 0000:00:18.2: Adding to iommu group 9
[ 2.147305][ T1] pci 0000:00:18.3: Adding to iommu group 9
[ 2.153108][ T1] pci 0000:00:18.4: Adding to iommu group 9
[ 2.158900][ T1] pci 0000:00:18.5: Adding to iommu group 9
[ 2.164717][ T1] pci 0000:01:00.0: Adding to iommu group 10
[ 2.170633][ T1] pci 0000:02:00.0: Adding to iommu group 11
[ 2.176522][ T1] pci 0000:03:00.0: Adding to iommu group 12
[ 2.182641][ T1] pci 0000:00:00.2: can't derive routing for PCI INT A
[ 2.189355][ T1] pci 0000:00:00.2: PCI INT A: no GSI
[ 2.194616][ T1] AMD-Vi: Extended features (0x800290ad2, 0x0): PPR GT IA GA PC GA_vAPIC
[ 2.202905][ T1] AMD-Vi: Interrupt remapping enabled
[ 23.540739][ T1] ------------[ cut here ]------------
[ 23.546065][ T1] WARNING: CPU: 0 PID: 1 at drivers/iommu/amd/init.c:933 amd_iommu_enable_interrupts+0x5cf/0x8e0
[ 23.556438][ T1] Modules linked in:
[ 23.560210][ T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.10.3-gentoo #1
[ 23.567454][ T1] Hardware name: PC Engines apu2/apu2, BIOS v4.19.0.1 01/31/2023
[ 23.575037][ T1] RIP: 0010:amd_iommu_enable_interrupts+0x5cf/0x8e0
[ 23.581504][ T1] Code: fd 70 88 12 ba 0f 85 b3 fe ff ff 48 c7 c7 a0 0e e5 b9 83 0d c2 52 a4 00 01 e8 3d 57 9f ff 48 8b 15 c6 5a ed 00 e9 d3 fb ff ff <0f> 0b 4f
[ 23.600963][ T1] RSP: 0018:ffffaea140023dd0 EFLAGS: 00010246
[ 23.606898][ T1] RAX: 0000000000000000 RBX: 00000000001e8480 RCX: 0000000000000000
[ 23.614741][ T1] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 23.622587][ T1] RBP: ffff88e340051800 R08: 0000000000000000 R09: 0000000000000000
[ 23.630429][ T1] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000080000000
[ 23.638272][ T1] R13: 000ffffffffffff8 R14: 0800000000000000 R15: 2000000000000000
[ 23.646113][ T1] FS: 0000000000000000(0000) GS:ffff88e36ac00000(0000) knlGS:0000000000000000
[ 23.654914][ T1] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 23.661365][ T1] CR2: ffff88e36efff000 CR3: 000000006100e000 CR4: 00000000000406f0
[ 23.669210][ T1] Call Trace:
[ 23.672378][ T1] <TASK>
[ 23.675192][ T1] ? __warn+0x6f/0x140
[ 23.679140][ T1] ? amd_iommu_enable_interrupts+0x5cf/0x8e0
[ 23.685000][ T1] ? report_bug+0x1cc/0x210
[ 23.689384][ T1] ? handle_bug+0x3f/0x70
[ 23.693586][ T1] ? exc_invalid_op+0x1b/0x1b0
[ 23.698224][ T1] ? asm_exc_invalid_op+0x1a/0x20
[ 23.703120][ T1] ? amd_iommu_enable_interrupts+0x5cf/0x8e0
[ 23.708971][ T1] ? iommu_setup+0x2b0/0x2b0
[ 23.713433][ T1] state_next+0x1103/0x23a0
[ 23.717812][ T1] ? iommu_setup+0x2b0/0x2b0
[ 23.722288][ T1] amd_iommu_init+0x1b/0x50
[ 23.726667][ T1] pci_iommu_init+0xa/0x40
[ 23.730957][ T1] do_one_initcall+0x6d/0x2d0
[ 23.735508][ T1] kernel_init_freeable+0x2cd/0x3d0
[ 23.740586][ T1] ? rest_init+0xa0/0xa0
[ 23.744702][ T1] kernel_init+0x17/0x330
[ 23.748905][ T1] ? rest_init+0xa0/0xa0
[ 23.753019][ T1] ret_from_fork+0x2c/0x40
[ 23.757307][ T1] ? rest_init+0xa0/0xa0
[ 23.761427][ T1] ret_from_fork_asm+0x11/0x20
[ 23.766061][ T1] </TASK>
[ 23.768956][ T1] ---[ end trace 0000000000000000 ]---
[ 23.774288][ T1] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[ 23.781440][ T1] software IO TLB: mapped [mem 0x00000000cbe94000-0x00000000cfe94000] (64MB)
(This may be a separate, unrelated issue but I thought I'd mention it here for the sake of completeness)
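For anyone wanting to pin down where the boot stalls in their own log, here is a minimal sketch (not part of the original report) that scans dmesg-style lines for the largest jump between consecutive timestamps. The helper name `largest_gap` and the three-line sample are illustrative assumptions; the sample lines are copied from the log above.

```python
import re

# Matches the "[ 23.540739]" style timestamp that starts each dmesg line.
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")

def largest_gap(lines):
    """Return (gap_seconds, previous_line, next_line) for the biggest jump.

    Hypothetical helper for illustration; lines without a leading
    timestamp are skipped.
    """
    best = (0.0, None, None)
    prev_time, prev_line = None, None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue
        now = float(match.group(1))
        if prev_time is not None and now - prev_time > best[0]:
            best = (now - prev_time, prev_line, line)
        prev_time, prev_line = now, line
    return best

# Three lines taken from the log above, around the stall.
sample = [
    "[ 2.194616][ T1] AMD-Vi: Extended features (0x800290ad2, 0x0): PPR GT IA GA PC GA_vAPIC",
    "[ 2.202905][ T1] AMD-Vi: Interrupt remapping enabled",
    "[ 23.540739][ T1] ------------[ cut here ]------------",
]
gap, before, after = largest_gap(sample)
print(f"{gap:.3f}s between:\n  {before}\n  {after}")
```

Run against the excerpt above, this puts the stall (roughly 21 seconds) between "AMD-Vi: Interrupt remapping enabled" and the WARNING raised in amd_iommu_enable_interrupts. To check a full log, the same function can be fed the lines of a saved dmesg dump.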
Component
Dasharo firmware
Device
PC Engines APU2
Dasharo version
PC Engines release v4.19.0.1
Dasharo Tools Suite version
No response
Test case ID
No response
Brief summary
"Hardware Error" reported by Linux (all versions from 5.0 to 6.10)
How reproducible
Every couple of hours or so
How to reproduce
Boot a Linux kernel
Expected behavior
No hardware errors reported
Actual behavior
Frequent error reports - albeit all "corrected"
Screenshots
The Error Addr changes; the remainder of the information is static.
Additional context
This may not be a firmware issue or firmware-correctable, but I thought I'd ask in case anything can be done in the new Dasharo firmware releases?
Solutions you've tried
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1575932.html
… seems to describe a very similar problem, but was resolved many years ago, and booting with nopti had no effect.