Dasharo / dasharo-issues

The Dasharo issue tracker
https://dasharo.com/

APU2: "L1 TLB multimatch" #994

Open srcshelton opened 2 months ago

srcshelton commented 2 months ago

Component

Dasharo firmware

Device

PC Engines APU2

Dasharo version

PC Engines release v4.19.0.1

Dasharo Tools Suite version

No response

Test case ID

No response

Brief summary

"Hardware Error" reported by Linux (all versions from 5.0 to 6.10)

How reproducible

Every couple of hours or so

How to reproduce

Boot a Linux kernel

Expected behavior

No hardware errors reported

Actual behavior

Frequent hardware error reports, although all are marked "corrected"

Screenshots

[301707.243573][T16741] [Hardware Error]: Corrected error, no action required.
[301707.250737][T16741] [Hardware Error]: CPU:0 (16:30:1) MC0_STATUS[Over|CE|-|AddrV|-|-|-]: 0xd400000000010015
[301707.260632][T16741] [Hardware Error]: Error Addr: 0x00007fc39a4d8000
[301707.267218][T16741] [Hardware Error]: MC0 Error: L1 TLB multimatch.
[301707.273701][T16741] [Hardware Error]: cache level: L1, tx: DATA

The Error Addr changes between reports; the remainder of the information is static.
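For anyone wanting to cross-check the kernel's decoding, here is a minimal Python sketch that unpacks the MC0_STATUS value from the log. It assumes the architectural x86 MCi_STATUS flag positions (Val=63, Over=62, UC=61, En=60, MiscV=59, AddrV=58, PCC=57) and the TLB error-code format `0000 0000 0001 TTLL`; the 5-bit width of the extended error code is also an assumption. The function name `decode_mc_status` is purely illustrative and not part of any tool.

```python
# Sketch only: flag positions follow the architectural MCi_STATUS layout;
# the extended-error-code width (5 bits) is an assumption for this CPU family.
STATUS_FLAGS = {
    "Val": 63, "Over": 62, "UC": 61, "En": 60,
    "MiscV": 59, "AddrV": 58, "PCC": 57,
}
TT_NAMES = ["INSN", "DATA", "GEN", "RESV"]    # transaction type
LL_NAMES = ["RESV", "L1", "L2", "L3/GEN"]     # cache level


def decode_mc_status(status: int) -> dict:
    """Split an MCi_STATUS value into flags and error-code fields."""
    flags = [name for name, bit in STATUS_FLAGS.items() if (status >> bit) & 1]
    code = status & 0xFFFF
    result = {
        "flags": flags,
        "corrected": "UC" not in flags,   # UC clear means the error was corrected
        "error_code": code,
        "ext_error_code": (status >> 16) & 0x1F,
    }
    # TLB errors have the form 0000 0000 0001 TTLL.
    if (code & 0xFFF0) == 0x0010:
        result["kind"] = "TLB"
        result["tx"] = TT_NAMES[(code >> 2) & 0x3]
        result["level"] = LL_NAMES[code & 0x3]
    return result


if __name__ == "__main__":
    for key, value in decode_mc_status(0xD400000000010015).items():
        print(f"{key}: {value}")
```

For the value above, this reports a TLB error at level L1 with a DATA transaction and the Over and AddrV flags set, matching what the kernel printed. The Over flag suggests that at least one earlier error was overwritten before it could be logged.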

Additional context

This may not be a firmware issue, or may not be correctable in firmware, but I thought I'd ask in case anything can be done in the new Dasharo firmware releases.

Solutions you've tried

https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1575932.html

The thread above seems to describe a very similar problem, but that one was resolved many years ago, and booting with nopti had no effect.

srcshelton commented 2 months ago

Further, on every reboot there is a delay of roughly 20 seconds whilst the kernel starts (note the jump in timestamps from about 2.2 s to 23.5 s in the log below):

[    1.917445][    T1] pci 0000:00:02.2: PCI bridge to [bus 01]
[    1.923128][    T1] pci 0000:00:02.2:   bridge window [io  0x1000-0x1fff]
[    1.929928][    T1] pci 0000:00:02.2:   bridge window [mem 0xd0000000-0xd00fffff]
[    1.937430][    T1] pci 0000:00:02.3: PCI bridge to [bus 02]
[    1.943107][    T1] pci 0000:00:02.3:   bridge window [io  0x2000-0x2fff]
[    1.949911][    T1] pci 0000:00:02.3:   bridge window [mem 0xd0100000-0xd01fffff]
[    1.957412][    T1] pci 0000:00:02.4: PCI bridge to [bus 03]
[    1.963082][    T1] pci 0000:00:02.4:   bridge window [io  0x3000-0x3fff]
[    1.969888][    T1] pci 0000:00:02.4:   bridge window [mem 0xd0200000-0xd02fffff]
[    1.977394][    T1] pci_bus 0000:00: resource 4 [io  0x0000-0x0cf7 window]
[    1.984277][    T1] pci_bus 0000:00: resource 5 [io  0x0d00-0xffff window]
[    1.991168][    T1] pci_bus 0000:00: resource 6 [mem 0x000a0000-0x000dffff]
[    1.998143][    T1] pci_bus 0000:00: resource 7 [mem 0xd0000000-0xffffffff]
[    2.005119][    T1] pci_bus 0000:01: resource 0 [io  0x1000-0x1fff]
[    2.011394][    T1] pci_bus 0000:01: resource 1 [mem 0xd0000000-0xd00fffff]
[    2.018371][    T1] pci_bus 0000:02: resource 0 [io  0x2000-0x2fff]
[    2.024653][    T1] pci_bus 0000:02: resource 1 [mem 0xd0100000-0xd01fffff]
[    2.031629][    T1] pci_bus 0000:03: resource 0 [io  0x3000-0x3fff]
[    2.037904][    T1] pci_bus 0000:03: resource 1 [mem 0xd0200000-0xd02fffff]
[    2.045962][    T1] pci 0000:00:13.0: PME# does not work under D3, disabling it
[    2.053392][    T1] PCI: CLS 64 bytes, default 64
[    2.058159][    T1] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    2.065934][    T1] pci 0000:00:02.0: Adding to iommu group 0
[    2.071747][    T1] pci 0000:00:02.2: Adding to iommu group 1
[    2.077573][    T1] pci 0000:00:02.3: Adding to iommu group 2
[    2.083384][    T1] pci 0000:00:02.4: Adding to iommu group 3
[    2.089183][    T1] pci 0000:00:08.0: Adding to iommu group 4
[    2.094982][    T1] pci 0000:00:10.0: Adding to iommu group 5
[    2.100788][    T1] pci 0000:00:11.0: Adding to iommu group 6
[    2.106595][    T1] pci 0000:00:13.0: Adding to iommu group 7
[    2.112425][    T1] pci 0000:00:14.0: Adding to iommu group 8
[    2.118221][    T1] pci 0000:00:14.3: Adding to iommu group 8
[    2.124017][    T1] pci 0000:00:14.7: Adding to iommu group 8
[    2.129889][    T1] pci 0000:00:18.0: Adding to iommu group 9
[    2.135711][    T1] pci 0000:00:18.1: Adding to iommu group 9
[    2.141507][    T1] pci 0000:00:18.2: Adding to iommu group 9
[    2.147305][    T1] pci 0000:00:18.3: Adding to iommu group 9
[    2.153108][    T1] pci 0000:00:18.4: Adding to iommu group 9
[    2.158900][    T1] pci 0000:00:18.5: Adding to iommu group 9
[    2.164717][    T1] pci 0000:01:00.0: Adding to iommu group 10
[    2.170633][    T1] pci 0000:02:00.0: Adding to iommu group 11
[    2.176522][    T1] pci 0000:03:00.0: Adding to iommu group 12
[    2.182641][    T1] pci 0000:00:00.2: can't derive routing for PCI INT A
[    2.189355][    T1] pci 0000:00:00.2: PCI INT A: no GSI
[    2.194616][    T1] AMD-Vi: Extended features (0x800290ad2, 0x0): PPR GT IA GA PC GA_vAPIC
[    2.202905][    T1] AMD-Vi: Interrupt remapping enabled
[   23.540739][    T1] ------------[ cut here ]------------
[   23.546065][    T1] WARNING: CPU: 0 PID: 1 at drivers/iommu/amd/init.c:933 amd_iommu_enable_interrupts+0x5cf/0x8e0
[   23.556438][    T1] Modules linked in:
[   23.560210][    T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.10.3-gentoo #1
[   23.567454][    T1] Hardware name: PC Engines apu2/apu2, BIOS v4.19.0.1 01/31/2023
[   23.575037][    T1] RIP: 0010:amd_iommu_enable_interrupts+0x5cf/0x8e0
[   23.581504][    T1] Code: fd 70 88 12 ba 0f 85 b3 fe ff ff 48 c7 c7 a0 0e e5 b9 83 0d c2 52 a4 00 01 e8 3d 57 9f ff 48 8b 15 c6 5a ed 00 e9 d3 fb ff ff <0f> 0b 4f
[   23.600963][    T1] RSP: 0018:ffffaea140023dd0 EFLAGS: 00010246
[   23.606898][    T1] RAX: 0000000000000000 RBX: 00000000001e8480 RCX: 0000000000000000
[   23.614741][    T1] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[   23.622587][    T1] RBP: ffff88e340051800 R08: 0000000000000000 R09: 0000000000000000
[   23.630429][    T1] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000080000000
[   23.638272][    T1] R13: 000ffffffffffff8 R14: 0800000000000000 R15: 2000000000000000
[   23.646113][    T1] FS:  0000000000000000(0000) GS:ffff88e36ac00000(0000) knlGS:0000000000000000
[   23.654914][    T1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   23.661365][    T1] CR2: ffff88e36efff000 CR3: 000000006100e000 CR4: 00000000000406f0
[   23.669210][    T1] Call Trace:
[   23.672378][    T1]  <TASK>
[   23.675192][    T1]  ? __warn+0x6f/0x140
[   23.679140][    T1]  ? amd_iommu_enable_interrupts+0x5cf/0x8e0
[   23.685000][    T1]  ? report_bug+0x1cc/0x210
[   23.689384][    T1]  ? handle_bug+0x3f/0x70
[   23.693586][    T1]  ? exc_invalid_op+0x1b/0x1b0
[   23.698224][    T1]  ? asm_exc_invalid_op+0x1a/0x20
[   23.703120][    T1]  ? amd_iommu_enable_interrupts+0x5cf/0x8e0
[   23.708971][    T1]  ? iommu_setup+0x2b0/0x2b0
[   23.713433][    T1]  state_next+0x1103/0x23a0
[   23.717812][    T1]  ? iommu_setup+0x2b0/0x2b0
[   23.722288][    T1]  amd_iommu_init+0x1b/0x50
[   23.726667][    T1]  pci_iommu_init+0xa/0x40
[   23.730957][    T1]  do_one_initcall+0x6d/0x2d0
[   23.735508][    T1]  kernel_init_freeable+0x2cd/0x3d0
[   23.740586][    T1]  ? rest_init+0xa0/0xa0
[   23.744702][    T1]  kernel_init+0x17/0x330
[   23.748905][    T1]  ? rest_init+0xa0/0xa0
[   23.753019][    T1]  ret_from_fork+0x2c/0x40
[   23.757307][    T1]  ? rest_init+0xa0/0xa0
[   23.761427][    T1]  ret_from_fork_asm+0x11/0x20
[   23.766061][    T1]  </TASK>
[   23.768956][    T1] ---[ end trace 0000000000000000 ]---
[   23.774288][    T1] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[   23.781440][    T1] software IO TLB: mapped [mem 0x00000000cbe94000-0x00000000cfe94000] (64MB)

(This may be a separate, unrelated issue, but I thought I'd mention it here for the sake of completeness.)
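To pin down where the stall sits, a small Python sketch like the following can scan saved `dmesg` output for the largest jump between consecutive timestamps. The function name `largest_gap` and the regular expression are illustrative assumptions; the sample lines are copied from the log above.

```python
import re

# Matches the leading "[   23.540739]" timestamp of a kernel log line.
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def largest_gap(lines):
    """Return (gap_seconds, line_before, line_after) for the biggest jump."""
    best = (0.0, None, None)
    prev = None  # (timestamp, line) of the previous timestamped line
    for line in lines:
        match = STAMP.match(line)
        if not match:
            continue  # skip lines without a timestamp
        stamp = float(match.group(1))
        if prev is not None and stamp - prev[0] > best[0]:
            best = (stamp - prev[0], prev[1], line)
        prev = (stamp, line)
    return best


sample = [
    "[    2.194616][    T1] AMD-Vi: Extended features (0x800290ad2, 0x0): PPR GT IA GA PC GA_vAPIC",
    "[    2.202905][    T1] AMD-Vi: Interrupt remapping enabled",
    "[   23.540739][    T1] ------------[ cut here ]------------",
    "[   23.546065][    T1] WARNING: CPU: 0 PID: 1 at drivers/iommu/amd/init.c:933 amd_iommu_enable_interrupts+0x5cf/0x8e0",
]

if __name__ == "__main__":
    gap, before, after = largest_gap(sample)
    print(f"{gap:.3f} s between:\n  {before}\n  {after}")
```

On the excerpt above, the largest gap falls between "AMD-Vi: Interrupt remapping enabled" and the start of the warning, so the delay appears to occur inside the AMD IOMMU initialisation path shown in the call trace.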