Open thillux opened 5 years ago
I fiddled around against coreboot master. A patch solving this issue may look like this: Patch against coreboot master
@thillux thank you for your contribution. Would you mind submitting a pull request or sending patches? Otherwise, we can commit on your behalf, if you don't mind.
I can send you a pull request. Which branch should I base my code on?
@miczyg1 please advise, but I assume recent develop would be fine.
Indeed, develop would be the best
Update: I've found some bugs in my code and I am still working on it in my spare time. The bug message in the kernel log already disappears if
/* Bus 0, Dev 0 - F15 Host Controller */
Package(){0x0000FFFF, 0, 0, 28 },
is introduced. The other code lines in my patch are still under my review, as some of them seem to be incorrect or incomplete.
Questions:
@thillux could You set up a pull request? Just mark it with [WIP] to indicate Your pending work on it. I would like to see the whole patch. It would also be great if You could provide the kernel log, version etc. AFAIU the 28 is the index number for the interrupt register as described in the IO 0xC00 register bits [6:0]. Unfortunately it does not map to anything, only to the reserved range 0x1A-0x1F (28 = 0x1C).
Description available in BKDG pages 680-683.
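To make the arithmetic above easy to check, here is a minimal sketch. It assumes only what is stated in this thread: the index occupies bits [6:0] of the value written to IO 0xC00, and 0x1A-0x1F is reserved. The helper names are hypothetical and not taken from coreboot.

```python
# Reserved index range as quoted in this thread (0x1A-0x1F inclusive).
RESERVED_INDEXES = range(0x1A, 0x20)


def intr_index(value):
    """Return bits [6:0] of a value written to IO 0xC00."""
    return value & 0x7F


def is_reserved(index):
    """True if the index falls into the reserved range quoted above."""
    return index in RESERVED_INDEXES


idx = intr_index(28)
print(hex(idx), is_reserved(idx))
```

With the value 28 from the patch, the index is 0x1C, which lands inside the reserved range.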
Regarding the values defined in mainboard.c: these are used with the IO 0xc00 and IO 0xc01 registers to program the interrupt router as defined in BKDG pages 680-683, and they are not used for early setup. mainboard.c is compiled into ramstage, which is a mid-to-late boot stage. The interrupt programming for PCI devices is executed after PCI enumeration and resource assignment.
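As a rough illustration of the index/data access pattern described above, the following sketch simulates the two ports in plain Python. It is not coreboot code: the `FakeIntrRouter` class, the `program_routing` helper, and the table values are all hypothetical, and the assumption that 0xC00 takes the index and 0xC01 the data is based on the description in this thread.

```python
PCI_INTR_INDEX = 0xC00  # index port (bits [6:0] select the register)
PCI_INTR_DATA = 0xC01   # data port for the selected register


class FakeIntrRouter:
    """In-memory stand-in for the interrupt router (hypothetical)."""

    def __init__(self):
        self._index = 0
        self.regs = {}

    def outb(self, value, port):
        if port == PCI_INTR_INDEX:
            self._index = value & 0x7F
        elif port == PCI_INTR_DATA:
            self.regs[self._index] = value & 0xFF
        else:
            raise ValueError("unexpected port 0x%X" % port)


def program_routing(router, table):
    """Write each (index, irq) pair: select the index, then write the data."""
    for index, irq in table.items():
        router.outb(index, PCI_INTR_INDEX)
        router.outb(irq, PCI_INTR_DATA)


# Made-up example values, not the real APU2 routing table.
example_table = {0x00: 0x10, 0x01: 0x11, 0x10: 0x12}
router = FakeIntrRouter()
program_routing(router, example_table)
print(router.regs)
```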
By DSDT parsing do You mean the operating system kernel that parses ACPI tables? What do You mean by "cleared out again"?
Started work on pcengines/coreboot#292
dmesg output with pcengines/coreboot#292
dmesg output with 4.9.0.4
[ 1.691535] pci 0000:00:00.2: can't derive routing for PCI INT A
[ 1.691542] pci 0000:00:00.2: PCI INT A: no GSI
Without interrupt routing over ACPI (booting the kernel with acpi=noirq as a parameter) there are many other routing issues besides pci 0000:00:00.2. dmesg_4.9.0.4_noirq.txt
Besides
[ 1.704636] pci 0000:00:00.2: can't find IRQ for PCI INT A; probably buggy MP table
there are also:
[ 1.229480] pci 0000:00:10.0: can't find IRQ for PCI INT A; probably buggy MP table
[ 2.551039] xhci_hcd 0000:00:10.0: can't find IRQ for PCI INT A; probably buggy MP table
[ 2.561307] sdhci-pci 0000:00:14.7: can't find IRQ for PCI INT A; probably buggy MP table
These other bugs should probably be squashed first, before working again on pcengines/coreboot#292.
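To get a quick overview of which devices are affected, the messages can be grouped by PCI address. This is a minimal sketch that assumes the log lines have exactly the shape quoted in this thread; the function name and regular expression are my own and only match the "can't find IRQ" message.

```python
import re

# Matches lines such as:
# [ 1.229480] pci 0000:00:10.0: can't find IRQ for PCI INT A; probably buggy MP table
PATTERN = re.compile(
    r"\[\s*\d+\.\d+\]\s+(?P<driver>\S+)\s+"
    r"(?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"can't find IRQ for PCI INT (?P<pin>[A-D])"
)


def devices_without_irq(log):
    """Map each PCI address to the set of drivers that reported a missing IRQ."""
    result = {}
    for match in PATTERN.finditer(log):
        result.setdefault(match.group("bdf"), set()).add(match.group("driver"))
    return result


sample = """\
[ 1.704636] pci 0000:00:00.2: can't find IRQ for PCI INT A; probably buggy MP table
[ 1.229480] pci 0000:00:10.0: can't find IRQ for PCI INT A; probably buggy MP table
[ 2.551039] xhci_hcd 0000:00:10.0: can't find IRQ for PCI INT A; probably buggy MP table
[ 2.561307] sdhci-pci 0000:00:14.7: can't find IRQ for PCI INT A; probably buggy MP table
"""

for bdf, drivers in sorted(devices_without_irq(sample).items()):
    print(bdf, sorted(drivers))
```

On a real system the input could come from a saved dmesg file such as the one attached above.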
@thillux thanks for Your effort. Regarding the other bugs in interrupt routing, I see an inconsistency in the MP table creation. I may set up a PR quickly with fixes, so You could test it. Is Your Arch Linux customized, or is it a generic installation? I would like to test it myself too.
On this test box, I use a generic Arch Linux without kernel modifications.
I have done some research on IOMMU PCI interrupts and I have the following conclusions:
InterruptLine. Read-write. Reset: 0. This field is read/write for software compatibility. It controls no hardware.
Other PCI devices use this field for interrupt routing.
I tried booting with acpi=noirq as a parameter, but I got more errors. The kernel detected only 1 IOAPIC instead of 2, which led, for example, to the following message:
ERROR: Unable to locate IOAPIC for GSI 31
GSIs higher than 23 are handled by the second IOAPIC, which was not detected. So I would advise against disabling the ACPI IRQ settings.
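The effect of the missing second IOAPIC can be sketched as a simple range lookup. The split at GSI 24 comes from the statement above; the number of inputs on the second IOAPIC (32 here) and the names are assumptions made only for illustration.

```python
# (name, GSI base, number of inputs)
IOAPICS = [
    ("IOAPIC0", 0, 24),   # GSIs 0-23, as stated in this thread
    ("IOAPIC1", 24, 32),  # base 24 per the thread; 32 inputs is an assumption
]


def locate_ioapic(gsi, ioapics):
    """Return the name of the IOAPIC whose GSI range contains gsi, else None."""
    for name, base, count in ioapics:
        if base <= gsi < base + count:
            return name
    return None


print(locate_ioapic(31, IOAPICS))      # both IOAPICs known
print(locate_ioapic(31, IOAPICS[:1]))  # only the first one detected
```

With only the first IOAPIC in the list, the lookup for GSI 31 finds nothing, which mirrors the "Unable to locate IOAPIC for GSI 31" error.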
Regarding these:
[ 1.229480] pci 0000:00:10.0: can't find IRQ for PCI INT A; probably buggy MP table
[ 2.551039] xhci_hcd 0000:00:10.0: can't find IRQ for PCI INT A; probably buggy MP table
[ 2.561307] sdhci-pci 0000:00:14.7: can't find IRQ for PCI INT A; probably buggy MP table
I see the problem and will apply a fix. However, these warnings should be ignored. Investigating the kernel source code confirms the information in the specs. The kernel should enable MSI, but because the IOMMU is a PCI device, the kernel's generic PCI init searches for an INTx configuration and prints an error that there is no INT for the device. Additionally, have a look at the lspci verbose output of the IOMMU device:
00:00.2 IOMMU [0806]: Advanced Micro Devices, Inc. [AMD] Device [1022:1567]
Subsystem: Advanced Micro Devices, Inc. [AMD] Device [1022:1567]
Control: I/O- Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 0
Capabilities: [40] Secure device <?>
Capabilities: [64] MSI: Enable+ Count=1/4 Maskable- 64bit+
Address: 00000000fee0f00c Data: 4161
Capabilities: [74] HyperTransport: MSI Mapping Enable+ Fixed+
MSI is enabled (Capabilities: [64] MSI: Enable+) and INTx is disabled (DisINTx+ in Control), with no INTx interrupt pending (INTx- in Status).
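The same check can be done mechanically on saved lspci -vv output. This is a minimal sketch that assumes the output format shown above; the function name and the returned keys are hypothetical.

```python
import re


def _flag(pattern, text):
    """Return True for '+', False for '-', None if the flag is not present."""
    match = re.search(pattern, text, re.MULTILINE)
    return (match.group(1) == "+") if match else None


def interrupt_mode(lspci_text):
    """Summarize the MSI and INTx flags from lspci -vv output for one device."""
    return {
        "msi_enabled": _flag(r"MSI: Enable([+-])", lspci_text),
        "intx_disabled": _flag(r"^\s*Control:.*\bDisINTx([+-])", lspci_text),
        "intx_pending": _flag(r"^\s*Status:.*\bINTx([+-])", lspci_text),
    }


sample = """\
00:00.2 IOMMU [0806]: Advanced Micro Devices, Inc. [AMD] Device [1022:1567]
Control: I/O- Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Capabilities: [64] MSI: Enable+ Count=1/4 Maskable- 64bit+
Capabilities: [74] HyperTransport: MSI Mapping Enable+ Fixed+
"""

print(interrupt_mode(sample))
```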
Ok, thanks a lot! :thumbsup:
@thillux I agree this warning is confusing, so we will try to send a patch to the Linux kernel. I think the kernel should not look for legacy interrupt routing, but use MSI instead.
A small update: ACPI tables, as well as PCI registers of IOMMU, should point to the MSI number which is used by IOMMU to signal interrupts. However, for an unknown reason, it is set to 0 by AGESA (AMD proprietary processor initialization code blob) which may result in such behavior. Unfortunately, it cannot be changed just like that, because the IOMMU PCI configuration is being locked by AGESA. I will try to tweak some bits to see whether I can do something about it.
Great, just let me know if I should test some changes.
@thillux thank you for the support. I may provide a binary for testing which might help.
BTW: Have you encountered issues similar to this one? https://github.com/pcengines/coreboot/issues/285
I answered on pcengines/apu2-documentation#240. If I remember correctly, this message originates from interrupt remapping not being possible with the legacy IRQs of the WLE200NX (no MSI kernel module parameter used). IRQs then trigger memory accesses on unmapped areas (from an IOMMU perspective).
The APU2 IOMMU device gets no GSI under Linux.
[ 1.694982] pci 0000:00:00.2: PCI INT A: no GSI
Tested on Arch Linux
Linux mbdf 5.0.7-arch1-1-ARCH #1 SMP PREEMPT Mon Apr 8 10:37:08 UTC 2019 x86_64 GNU/Linux