[ACPI] USB has major data corruption issues

markmi commented 5 months ago

This is more of a note about the FreeBSD status. [Later: Fedora 39 example too, where Windows 11 was pre-known.] I've no clue about assigning blame to FreeBSD vs. EDK2 vs. both at this point. (The testing context is UFS based.) But claiming FreeBSD is working well seems inappropriate, even though it does boot.

Doing the likes of, say, "pkg install -f -g 'FreeBSD-src*'" leads to a following "fsck_ffs /" (no write) reporting large numbers of inode checksum errors and such. (The pkg command populates /usr/src/sys/ and the rest of /usr/src/ with the FreeBSD source code for the FreeBSD vintage in question.) Using "fsck_ffs -y /" (writable) makes a mess, removing lots of files from the UFS filesystem and the like.

An interesting discovery was that avoiding the writable activity and doing "shutdown now" to single user mode, "mount -r /", "poweroff", then moving the media to an RPi4B and booting via the normal U-Boot UEFI/fdt way does not end up seeing errors. This indicates the error reports were against in-memory information that had not been written out.

Doing sufficient RPi5 activity will start discovering and reporting the issues during normal RPi5 operation and bad data is then written that later activity will notice on read back.

One historically unusual thing that I've noted is the 512 KiByte hw.busdma.zone0.alignment that results from using this EDK2 implementation. More normal under FreeBSD has been 4 KiBytes. hw.busdma.zone0.lowaddr being 0xffffffff indicates a 32-bit subrange of the address space for the busdma (starting at address 0x0). 4 GiBytes and 512 KiByte alignment means only 8192 aligned positions exist in that 32-bit space.

# sysctl hw.busdma
hw.busdma.zone0.total_deferred_time: 0 0
hw.busdma.zone0.domain: 0
hw.busdma.zone0.alignment: 524288
hw.busdma.zone0.lowaddr: 0xffffffff
hw.busdma.zone0.total_deferred: 0
hw.busdma.zone0.total_bounced: 12018773
hw.busdma.zone0.active_bpages: 12
hw.busdma.zone0.reserved_bpages: 0
hw.busdma.zone0.free_bpages: 1227
hw.busdma.zone0.total_bpages: 1239
hw.busdma.total_bpages: 1239

(That was actually from booting and letting the RPi5 sit idle for days. A rather large total_bounced resulted.)
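
The 8192 figure mentioned before the sysctl output is just the addressable range divided by the alignment. A minimal sketch of that arithmetic in C follows; `aligned_slots` is a hypothetical helper name used only for illustration (it is not a FreeBSD function), and the inputs are the two sysctl values shown above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of FreeBSD): number of whole
 * alignment-sized blocks that fit in the range [0, lowaddr].
 */
static uint64_t
aligned_slots(uint64_t lowaddr, uint64_t alignment)
{
	assert(alignment != 0);
	/* Addresses 0..lowaddr inclusive span lowaddr + 1 bytes. */
	return ((lowaddr + 1) / alignment);
}
```

With the values above, `aligned_slots(0xffffffff, 524288)` gives 8192, versus 1048576 for the more usual 4 KiByte alignment.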

As of yet, I've not tested USB2 port use. Nor have I tested stable/14 or the like. Only the RPi5 seems to be generating these problems. (I can move the same media from aarch64 system to aarch64 system and boot with it and test things.)

mariobalanica commented 5 months ago

Thanks for reporting this, I've actually seen and noted similar issues with Windows. FreeBSD having an open XHCI driver should make it easier to see what exactly is going on.

One historically unusual thing that I've noted is the 512 KiByte hw.busdma.zone0.alignment that results from using this EDK2 implementation. More normal under FreeBSD has been 4 KiBytes. hw.busdma.zone0.lowaddr being 0xffffffff indicates a 32-bit subrange of the address space for the busdma (starting at address 0x0). 4 GiBytes and 512 KiByte alignment means only 8192 aligned positions exist in that 32-bit space.

Where is it pulling this DMA configuration from?

But claiming FreeBSD is working well seems inappropriate, even though it does boot.

Most of my testing was from an SD card, and that seemed to work well enough.

Note that the list of "supported OSes" is just a list of briefly tested functionality in ACPI mode, along with known issues. It does not claim or guarantee reliable operation, nor any official support for the OSes in question.

markmi commented 5 months ago

I made a personal FreeBSD build panic for alignments bigger than a page size in order to get a backtrace (that does not show inlined routines):

sdhci_acpi0: <Intel Bay Trail/Braswell SDXC Controller> iomem 0x1000fff000-0x1000fff25f irq 3 on acpi0
panic: alloc_bounce_zone: large alignment

cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x38
vpanic() at vpanic+0x1a4
panic() at panic+0x48
alloc_bounce_zone() at alloc_bounce_zone+0x500
bounce_bus_dma_tag_create() at bounce_bus_dma_tag_create+0x124
sdhci_init_slot() at sdhci_init_slot+0x9e4
sdhci_acpi_attach() at sdhci_acpi_attach+0x184
device_attach() at device_attach+0x3fc
bus_generic_new_pass() at bus_generic_new_pass+0x12c
bus_generic_new_pass() at bus_generic_new_pass+0xb8
bus_generic_new_pass() at bus_generic_new_pass+0xb8
root_bus_configure() at root_bus_configure+0x48
mi_startup() at mi_startup+0x1e0
virtdone() at virtdone+0x68
KDB: enter: panic
[ thread pid 0 tid 100000 ]
Stopped at      kdb_enter+0x4c: str     xzr, [x19, #3968]

Then I looked up the FreeBSD code that calls bus_dma_tag_create. Note the SDHCI_BLKSZ_SDMA_BNDRY_512K case:

        if (!(slot->quirks & SDHCI_QUIRK_BROKEN_SDMA_BOUNDARY)) {
                if (maxphys <= 1024 * 4)
                        slot->sdma_boundary = SDHCI_BLKSZ_SDMA_BNDRY_4K;
                else if (maxphys <= 1024 * 8)
                        slot->sdma_boundary = SDHCI_BLKSZ_SDMA_BNDRY_8K;
                else if (maxphys <= 1024 * 16)
                        slot->sdma_boundary = SDHCI_BLKSZ_SDMA_BNDRY_16K;
                else if (maxphys <= 1024 * 32)
                        slot->sdma_boundary = SDHCI_BLKSZ_SDMA_BNDRY_32K;
                else if (maxphys <= 1024 * 64)
                        slot->sdma_boundary = SDHCI_BLKSZ_SDMA_BNDRY_64K;
                else if (maxphys <= 1024 * 128)
                        slot->sdma_boundary = SDHCI_BLKSZ_SDMA_BNDRY_128K;
                else if (maxphys <= 1024 * 256)
                        slot->sdma_boundary = SDHCI_BLKSZ_SDMA_BNDRY_256K;
                else
                        slot->sdma_boundary = SDHCI_BLKSZ_SDMA_BNDRY_512K;
        }
        slot->sdma_bbufsz = SDHCI_SDMA_BNDRY_TO_BBUFSZ(slot->sdma_boundary);

        /*
         * Allocate the DMA tag for an SDMA bounce buffer.
         * Note that the SDHCI specification doesn't state any alignment
         * constraint for the SDMA system address.  However, controllers
         * typically ignore the SDMA boundary bits in SDHCI_DMA_ADDRESS when
         * forming the actual address of data, requiring the SDMA buffer to
         * be aligned to the SDMA boundary.
         */
        err = bus_dma_tag_create(bus_get_dma_tag(slot->bus), slot->sdma_bbufsz,
            0, BUS_SPACE_MAXADDR_32BIT, BUS_SPACE_MAXADDR, NULL, NULL,
            slot->sdma_bbufsz, 1, slot->sdma_bbufsz, BUS_DMA_ALLOCNOW,
            NULL, NULL, &slot->dmatag);

alloc_bounce_zone only allocs when it does not already have a compatible alignment/lowaddr combination and 512K is compatible with the other alignments and happens to be the alignment for the first call. The 0xffffffffu lowaddr is also compatible. So no other allocations for other combinations were done.
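
A simplified model of that reuse rule, as a sketch only. The struct and function names here (`zone`, `tag`, `zone_is_compatible`, `find_zone`) are illustrative stand-ins and not the kernel's definitions, and the comparison is an assumption based on the description above; the real code may consider further details.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins, not the kernel's structures. */
struct zone { uint64_t alignment; uint64_t lowaddr; };
struct tag  { uint64_t alignment; uint64_t lowaddr; };

/*
 * Assumed rule: an existing zone can serve a tag when the zone's alignment
 * is at least as strict and the zone's lowaddr is at or below the tag's.
 */
static bool
zone_is_compatible(const struct zone *z, const struct tag *t)
{
	return (t->alignment <= z->alignment && t->lowaddr >= z->lowaddr);
}

/* Return the first compatible zone, or NULL if a new one would be needed. */
static const struct zone *
find_zone(const struct zone *zones, size_t nzones, const struct tag *t)
{
	assert(zones != NULL || nzones == 0);
	for (size_t i = 0; i < nzones; i++) {
		if (zone_is_compatible(&zones[i], t))
			return (&zones[i]);
	}
	return (NULL);
}
```

Under this model, once a 512K/0xffffffff zone exists, a later 4K tag with the same lowaddr matches it, so no second zone is created; only a tag with a larger alignment or a lower lowaddr would force a new allocation.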

I'm not so sure that this is contributing to the inode checksum failures and such. I forced the 512K to not be in use and still get oddities. So far it appears that "shutdown now" then "mount -r /" avoids actually creating a problem and allows fsck_ffs -y / to not find oddities.

I wonder if the "shutdown now" then "mount -r /" are forcing things to synchronize across cores or some such.

mariobalanica commented 5 months ago

Might be worth checking who's using those other bounce buffers, but yeah, it's super unlikely for this to be related.

I plan to look at it again in a few days, when I get the M.2 HAT and can also compare with an external USB3 card.

An obvious thing to try would be to properly initialize those DWC3 controllers with the appropriate quirk workarounds, since I don't really know in what state the VPU firmware is leaving them.

markmi commented 5 months ago

The most interesting discovery so far in my activity was that, after a non-writing "fsck_ffs /" that reports problems related to prior activity, the sequence "shutdown now" (to single user), "mount -r /", and exit (back to multi-user) ends up with no corruptions and "fsck_ffs /" having everything pass. The experiments with pkg base even allow doing "pkg check -s -a" spanning the system-packages and ports-packages for file checksum validation for the official build that pkg base installed. The package files also get checksum validation when they are used.

How that sequence makes everything start to pass again without evidence of corruptions is not clear.

I'll note the same USB3 media used to boot and operate other EDK2 based UEFI/ACPI systems (e.g., a HoneyComb [LX2160A] and a Windows DevKit 2023) do not show the problems so far. This comparison/contrast is why I expect that the RPi5 EDK2 is likely involved in setting up the odd behavior. (No U-Boot UEFI/fdt style checking of the RPi5 FreeBSD use yet.)

mariobalanica commented 5 months ago

EDK2 does not do anything special to the USB controllers yet. It's fully relying on the closed VPU firmware to set-up RP1 peripherals and leave them configured.

markmi commented 5 months ago

I'm not so sure that the "shutdown now" (to single user) and "mount -r /" and exit (back to multi-user) sequence that makes the problems disappear until later activity gets the type of problem again fits with a USB controller issue. But it might.

markmi commented 5 months ago

Looking at the content of a corrupted text file, at the end of the file where the corruption is seen: (Note: Extracted from how view displayed it.)

    , typename apply2<ForwardOp,State, typena\xff\xff\xff\xff\xff\xff\xff\xff [\xff repeats through the end of the displayed text]
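
For anyone wanting to locate where such a trailing run starts (for example, to see whether it lines up with some block boundary), a small sketch in C follows. `trailing_ff_start` is a hypothetical helper written for this note, operating on a buffer that already holds the file's contents.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the offset at which the trailing run of
 * 0xff bytes begins, or len if the buffer does not end in 0xff.
 */
static size_t
trailing_ff_start(const unsigned char *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	size_t start = len;

	/* Walk backwards while the preceding byte is still 0xff. */
	while (start > 0 && buf[start - 1] == 0xff)
		start--;
	return (start);
}
```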

mariobalanica commented 5 months ago

SD boot is totally fine, so the issue is limited to USB. I've had the exact same sort of corruption happen under Windows, but I haven't been able to consistently reproduce it so far.

I'm not so sure that the "shutdown now" (to single user) and "mount -r /" and exit (back to multi-user) sequence that makes the problems disappear until later activity gets the type of problem again fits with a USB controller issue. But it might.

Are you doing enough disk activity before "shutdown now" (i.e. giving it time to actually write back the corrupted data)?

Looking at the content of a corrupted text file, at the end of the file where the corruption is seen:

What is this file supposed to look like?

markmi commented 5 months ago

Other file corruptions look like random gibberish in the corrupted area.

A pattern is that the first file of several has the corruption at the end. Then the following files continue to have the corruption. Similarly, the last file of a block goes from gibberish back to normal text.

The files are from a "pkg update" that spans both base packages and port packages. "pkg check -s -a" reported mismatched checksums. I just looked at the content of some that should be textual, non-random.

mariobalanica commented 5 months ago

That's good to know, thanks.

markmi commented 5 months ago

Are you doing enough disk activity before "shutdown now" (i.e. giving it time to actually write back the corrupted data)?

As I described: I see corruption information from a non-write "fsck_ffs /" before the "shutdown now" sequence. I've varied the amount of I/O activity and, if it is large enough, there are some problems after the sequence as well.

But having "fsck_ffs /" (non-write) report anything that later goes away via the "shutdown now" related sequence, suggests that the fsck_ffs reports that disappear were not based on what was actually on disk. Possibly a read-time problem, rather than just a write-time one? Rereading ends up with a good copy of the data?

More experiments do indicate that the "pkg check -s -a" reports do move around without updating the files that get newer reports.

FYI: The file with the \xff was a c++ header file.

markmi commented 5 months ago

Well, I dd'd bookworm vintage (2023-Dec) RasPiOS64 (my abbr.) to USB3 media and tried to boot with it (no EDK2 involved). Despite using the official 5.1V 5.0A power supply, I got:

USB boot requires high current (5 volt 5 amp) power supply.
To disable this check set usb_max_current_enable=1 in config.txt
or press the power button to temporarily enable usb_max_current_enable
and continue booting.
See https://rptl.io/rpi5-power-supply-info for more information

Turns out that I'd added an on/off switch extension to the end of the USB3 cable and it messes up the power negotiation, something I did not previously know.

This leads me to expect that having usb_max_current_enable=1 enabled in config.txt by default is not a good idea. Up to this point I thought I'd been testing with the recommended power but I had not been. I'd have learned up front that something was odd and needed investigation. (My context is serial console based.)

(Using the recommended power did not avoid having corruptions, however.)

markmi commented 5 months ago

I see in the ubuntu linux code a comment in usb_init_common_2712:

+   /*
+    * The BDC controller will get occasional failures with
+    * the default "Read Transaction Size" of 6 (1024 bytes).
+    * Set it to 4 (256 bytes).
+    */

The logic is conditional on some context not mentioned in the comment.

The referenced comment (and related USB material) is via: https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux-raspi/+git/mantic/commit/?h=master-next&id=b847fd7a7d955dddf6a367b30544b4e1ffe689c2

markmi commented 5 months ago

Be forewarned that the following is based on a personal FreeBSD main [so: 15] build that used -mcpu=cortex-a76 and has my other personal patches.

I set up a microsd card boot media in order to avoid involving USB3 media. But the EtherNet is still based on the USB3 dongle that I've been using, plugged into the same USB3 port. I report this as it might be an indication of USB3 problems that affect the dongle use as well. (I've not done any USB2 port testing with anything at this point.) The dongle, Ethernet cabling, switch, and the like have a history that did not have problems, both recently and longer term.

Unlike all prior contexts using the dongle and EtherNet cable/switch-port combination, it is eventually ending up with the likes of:

ue0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=68009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        ether REDACTED
        inet 192.168.1.153 netmask 0xffffff00 broadcast 192.168.1.255
        media: Ethernet autoselect (none <half-duplex>)
        status: no carrier
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

It had no carrier after the boot/login but disconnecting and reconnecting the dongle from the USB3 port got the carrier back --for a time:

Jan 26 01:28:17 CA76-UFS kernel: ugen1.2: <Realtek USB 10/100/1000 LAN> at usbus1 (disconnected)
Jan 26 01:28:17 CA76-UFS kernel: ure0: at uhub1, port 3, addr 1 (disconnected)
Jan 26 01:28:17 CA76-UFS kernel: rgephy0: detached
Jan 26 01:28:17 CA76-UFS kernel: miibus0: detached
Jan 26 01:28:17 CA76-UFS kernel: ure0: detached
Jan 26 01:28:19 CA76-UFS kernel: ugen1.2: <Realtek USB 10/100/1000 LAN> at usbus1
Jan 26 01:28:19 CA76-UFS kernel: ure0 on uhub1
Jan 26 01:28:19 CA76-UFS kernel: ure0: <Realtek USB 10/100/1000 LAN, class 0/0, rev 3.00/30.00, addr 1> on usbus1
Jan 26 01:28:19 CA76-UFS kernel: miibus0: <MII bus> on ure0
Jan 26 01:28:19 CA76-UFS kernel: rgephy0: <RTL8251/8153 1000BASE-T media interface> PHY 0 on miibus0
Jan 26 01:28:19 CA76-UFS kernel: rgephy0:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT-FDX, 1000baseT-FDX-master, auto
Jan 26 01:28:19 CA76-UFS kernel: ue0: <USB Ethernet> on ure0
Jan 26 01:28:19 CA76-UFS kernel: ue0: Ethernet address: REDACTED
Jan 26 01:28:19 CA76-UFS kernel: ue0: link state changed to DOWN
Jan 26 01:28:22 CA76-UFS kernel: ue0: link state changed to UP
Jan 26 01:28:24 CA76-UFS dhclient[1017]: New IP Address (ue0): 192.168.1.153
Jan 26 01:28:24 CA76-UFS dhclient[1024]: New Subnet Mask (ue0): 255.255.255.0
Jan 26 01:28:24 CA76-UFS dhclient[1028]: New Broadcast Address (ue0): 192.168.1.255
Jan 26 01:28:24 CA76-UFS dhclient[1033]: New Routers (ue0): 192.168.1.1
Jan 26 01:55:40 CA76-UFS kernel: ue0: link state changed to DOWN
Jan 26 02:07:38 CA76-UFS ntpd[724]: error resolving pool 0.freebsd.pool.ntp.org: Name does not resolve (8)
Jan 26 02:08:45 CA76-UFS syslogd: last message repeated 1 times
Jan 26 02:09:52 CA76-UFS syslogd: last message repeated 1 times
Jan 26 02:10:56 CA76-UFS ntpd[724]: error resolving pool 2.freebsd.pool.ntp.org: Name does not resolve (8)
Jan 26 02:10:59 CA76-UFS ntpd[724]: error resolving pool 0.freebsd.pool.ntp.org: Name does not resolve (8)
Jan 26 02:11:39 CA76-UFS ntpd[724]: no peer for too long, server running free now
Jan 26 02:12:00 CA76-UFS ntpd[724]: error resolving pool 2.freebsd.pool.ntp.org: Name does not resolve (8)
Jan 26 02:12:04 CA76-UFS ntpd[724]: error resolving pool 0.freebsd.pool.ntp.org: Name does not resolve (8)
Jan 26 02:13:07 CA76-UFS ntpd[724]: error resolving pool 2.freebsd.pool.ntp.org: Name does not resolve (8)
Jan 26 02:13:09 CA76-UFS ntpd[724]: error resolving pool 0.freebsd.pool.ntp.org: Name does not resolve (8)
Jan 26 02:14:13 CA76-UFS ntpd[724]: error resolving pool 2.freebsd.pool.ntp.org: Name does not resolve (8)

(And so on.) "ifconfig ue0 down; ifconfig ue0 up" does not restore the carrier status of itself.

markmi commented 5 months ago

Well, after that note, I tried moving the dongle from the upper USB3 port to the lower one. I've not had EtherNet carrier problems since then. (It is the only external USB* port connection present when I'm using the microsd card media.)

Back when I was testing USB3 media use as well, the dongle was plugged into the top USB3 port and the media was plugged into the bottom USB3 port.

markmi commented 5 months ago

On a Fedora 39 xfs file system (boot media) I did a "cat file1 | cat > file2" where file1 was something like 27 GiBytes in size. Then I did a diff of the 2 files. It got corruptions. Rebooting finds corruptions. I'm probably going to have to start my Fedora context from scratch.
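
For repeating this kind of copy-and-compare test, it can help to know the offset of the first mismatch rather than just that one exists. A minimal sketch in C follows; `first_difference` is a hypothetical helper, not an existing tool. It reads through the OS cache, so, as discussed earlier in the thread, a mismatch does not by itself say whether the bad data is actually on the media.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: compare two files and return the byte offset of the
 * first difference, -1 if they are identical, or -2 on an open error.
 * A length mismatch counts as a difference at the shorter file's length.
 */
static long long
first_difference(const char *path_a, const char *path_b)
{
	assert(path_a != NULL && path_b != NULL);
	FILE *fa = fopen(path_a, "rb");
	FILE *fb = fopen(path_b, "rb");
	long long offset = 0, result = -1;

	if (fa == NULL || fb == NULL) {
		result = -2;
	} else {
		for (;;) {
			int ca = getc(fa);
			int cb = getc(fb);

			if (ca != cb) {		/* includes EOF on only one side */
				result = offset;
				break;
			}
			if (ca == EOF)		/* both ended together: identical */
				break;
			offset++;
		}
	}
	if (fa != NULL)
		fclose(fa);
	if (fb != NULL)
		fclose(fb);
	return (result);
}
```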

I suggest indicating that RP1 USB is not working: Linux, FreeBSD, and Windows all get corrupted data.

mariobalanica commented 5 months ago

Yep, I've recently done some testing in Linux and seen corruption there as well. Unfortunately I haven't been able to consistently reproduce it across different configurations. There were moments when I could transfer dozens of gigabytes and it would all be fine - only sometimes would it actually start corrupting data badly.

Here's what I've tried, roughly:

Flashed stock RPI OS and checked USB -> OK
Modified DTB to expose a single USB controller as generic-xhci and hide the rest of the RP1 PCIe bus, relying on the VPU FW PCIe setup (to replicate what UEFI does) -> Corruption
Changed the generic-xhci device back to snps,dwc3 with original quirks, in case the VPU FW didn't take care of that -> Corruption
Reverted all the changes -> OK

Then I recompiled the RPi kernel to add ACPI support in and booted RPi OS via EDK2. Same story: ACPI is bad, FDT is good as long as I don't do the modifications above.

EDK2 itself appears to be unaffected, all transferred data came back good.

The only notable difference (that I'm currently aware of) between exposing the full RP1 PCIe bus vs. just the XHCI controller as a simple platform device is interrupts:

full RP1 PCIe in Linux with FDT uses MSIs
XHCI platform device uses the shared legacy INTA

I suspect the RP1 XHCI edge-triggered interrupts don't translate very well to level ones. This would be bad news for ACPI, since the MSI controller here is (unsurprisingly) non-standard and we're forced to use legacy interrupts.

This might also explain why EDK2 is not affected, as it uses polling instead.

I see in the ubuntu linux code a comment in usb_init_common_2712:

/*
 * The BDC controller will get occasional failures with
 * the default "Read Transaction Size" of 6 (1024 bytes).
 * Set it to 4 (256 bytes).
 */

The logic is conditional on some context not mentioned in the comment.

The referenced comment (and related USB material) is via: https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux-raspi/+git/mantic/commit/?h=master-next&id=b847fd7a7d955dddf6a367b30544b4e1ffe689c2

Unrelated, this is code for the integrated DWC2 OTG port.

Well, I dd'd bookworm vintage (2023-Dec) RasPiOS64 (my abbr.) to USB3 media and tried to boot with it (no EDK2 involved). Despite using the official 5.1V 5.0A power supply, I got:

USB boot requires high current (5 volt 5 amp) power supply.
To disable this check set usb_max_current_enable=1 in config.txt
or press the power button to temporarily enable usb_max_current_enable
and continue booting.
See https://rptl.io/rpi5-power-supply-info for more information

Turns out that I'd added an on/off switch extension to the end of the USB3 cable and it messes up the power negotiation, something I did not previously know.

This leads me to expect that having usb_max_current_enable=1 enabled in config.txt by default is not a good idea. Up to this point I thought I'd been testing with the recommended power but I had not been. I'd have learned up front that something was odd and needed investigation. (My context is serial console based.)

(Using the recommended power did not avoid having corruptions, however.)

That warning only appears when booting from USB with their firmware. Otherwise, unless you're using their odd PD-supply, you may run into less obvious issues with USB SSDs due to the artificial power limitation.

It's just confusing for no good reason.

markmi commented 5 months ago

Glad you added the note in the README. But I expect it would still be appropriate to change the RP1 USB row in the "Supported peripherals" table: at least give it a note of its own, or possibly mark it "Not working" instead. It is the first place one sees USB's status and it contradicts the later note.

mariobalanica commented 5 months ago

That table refers to peripherals working within UEFI, not an OS. Perhaps I should make that clearer and move it below "Supported OSes".

markmi commented 5 months ago

Maybe try: "Only devices relevant to just the EDK2 firmware (not OSes) are listed below. This need not be sufficient context for various OSes to work with EDK2."?

I just do not get the intended implications from the present text (not being the developer). Most folks are likely focused on what OS operations are enabled/working and expect that to be what is documented upfront. The above is explicit enough to likely avoid misinterpretation: The USB support happens to not be sufficient for use for any OS yet.

Anyway, such is my suggestion.

mariobalanica commented 5 months ago

Should be better now.

mariobalanica commented 5 months ago

Not involved. There are no constraints described on Pi 5.

markmi commented 5 months ago

I had answered my own question and deleted the question shortly before you answered. So this reply provides the context for readers: I'd asked about the odd 3 GiByte DMA code from the RPi4B that is present in the source tree (but unused).

markmi commented 5 months ago

I was surprised by the GICv2 references (instead of GICv3) on Fedora 39 Server. Prior mention of MSI/MSI-X use led me to expect GICv3 to be what was present. I've vague memories that MSI/MSI-X was a GICv3 addition beyond what GICv2 had.

# cat /proc/interrupts 
           CPU0       CPU1       CPU2       CPU3       
  9:          0          0          0          0     GICv2  25 Level     vgic
 10:          0          0          0          0     GICv2  30 Level     kvm guest ptimer
 11:          0          0          0          0     GICv2  27 Level     kvm guest vtimer
 12:      10711       9322      10033       8752     GICv2  26 Level     arch_timer
 13:       1621          0          0          0     GICv2 153 Level     uart-pl011
 14:      37248          0          0          0     GICv2 261 Level     xhci-hcd:usb1, xhci-hcd:usb3
 15:        204          0          0          0     GICv2 305 Level     mmc1
 16:        579          0          0          0     GICv2 306 Level     mmc0
 17:          0          0          0          0     GICv2  48 Level     arm-pmu
 18:          0          0          0          0     GICv2  49 Level     arm-pmu
 19:          0          0          0          0     GICv2  50 Level     arm-pmu
 20:          0          0          0          0     GICv2  51 Level     arm-pmu
IPI0:       691        675        785        840       Rescheduling interrupts
IPI1:      4037       5296       7495       6521       Function call interrupts
IPI2:         0          0          0          0       CPU stop interrupts
IPI3:         0          0          0          0       CPU stop (for crash dump) interrupts
IPI4:         0          0          0          0       Timer broadcast interrupts
IPI5:         0          2          4          4       IRQ work interrupts
IPI6:         0          0          0          0       CPU wake-up interrupts
Err:          0

But, booting RasPiOS64 (not UEFI/ACPI) shows rp1_irq_chip and GICv2 and more:

# cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3       
  9:          0          0          0          0     GICv2  25 Level     vgic
 11:          0          0          0          0     GICv2  30 Level     kvm guest ptimer
 12:          0          0          0          0     GICv2  27 Level     kvm guest vtimer
 13:       2093       1313       1351       1718     GICv2  26 Level     arch_timer
 14:        105          0          0          0     GICv2  65 Level     107c013880.mailbox
 15:         83          0          0          0     GICv2 153 Level     uart-pl011
 21:          0          0          0          0     GICv2 118 Level     DMA IRQ
 22:          0          0          0          0     GICv2 119 Level     DMA IRQ
 23:          0          0          0          0     GICv2 120 Level     DMA IRQ
 24:          0          0          0          0     GICv2 121 Level     DMA IRQ
 27:          0          0          0          0     GICv2  48 Level     arm-pmu
 28:          0          0          0          0     GICv2  49 Level     arm-pmu
 29:          0          0          0          0     GICv2  50 Level     arm-pmu
 30:          0          0          0          0     GICv2  51 Level     arm-pmu
 38:       2842          0          0          0     GICv2 308 Level     ttyS0
 39:          0          0          0          0     GICv2 261 Level     PCIe PME, aerdrv
107:          0          0          0          0  rp1_irq_chip   6 Level     eth0
132:       8787          0          0          0  rp1_irq_chip  31 Edge      xhci-hcd:usb1
137:          0          0          0          0  rp1_irq_chip  36 Edge      xhci-hcd:usb3
141:          0          0          0          0  rp1_irq_chip  40 Level     dw_axi_dmac_platform
162:      13972          0          0          0     GICv2 306 Level     mmc1
163:          0          0          0          0     GICv2 305 Level     mmc0
164:          0          0          0          0  107d508500.gpio  20 Edge      pwr_button
165:          0          0          0          0     GICv2 150 Level     107d004000.spi
166:          0          0          0          0  intc@7d508380   1 Level     107d508200.i2c
167:          0          0          0          0  intc@7d508380   2 Level     107d508280.i2c
168:          0          0          0          0     GICv2 281 Level     v3d_core0
169:          0          0          0          0     GICv2 282 Level     v3d_hub
170:          0          0          0          0     GICv2 104 Level     pispbe
171:          0          0          0          0     GICv2 130 Level     1000800000.codec
172:          0          0          0          0  interrupt-controller@7c502000   2 Level     107c580000.hvs
173:          0          0          0          0  interrupt-controller@7c502000   9 Level     107c580000.hvs
174:          0          0          0          0  interrupt-controller@7c502000  16 Level     107c580000.hvs
175:          0          0          0          0  interrupt-controller@7d510600   7 Level     vc4 hdmi hpd connected
176:          0          0          0          0  interrupt-controller@7d510600   8 Level     vc4 hdmi hpd disconnected
177:          0          0          0          0  interrupt-controller@7d510600   2 Level     vc4 hdmi cec rx
178:          0          0          0          0  interrupt-controller@7d510600   1 Level     vc4 hdmi cec tx
179:          0          0          0          0  interrupt-controller@7d510600  14 Level     vc4 hdmi hpd connected
180:          0          0          0          0  interrupt-controller@7d510600  15 Level     vc4 hdmi hpd disconnected
181:          0          0          0          0  interrupt-controller@7d510600  12 Level     vc4 hdmi cec rx
182:          0          0          0          0  interrupt-controller@7d510600  11 Level     vc4 hdmi cec tx
183:          0          0          0          0  interrupt-controller@7c502000   1 Level     107c500000.mop
184:          0          0          0          0  interrupt-controller@7c502000   0 Level     107c501000.moplet
185:          0          0          0          0     GICv2 133 Level     vc4 crtc
186:          0          0          0          0     GICv2 142 Level     vc4 crtc
IPI0:       140        184        189        184       Rescheduling interrupts
IPI1:      4231      13914       5208       7676       Function call interrupts
IPI2:         0          0          0          0       CPU stop interrupts
IPI3:         0          0          0          0       CPU stop (for crash dump) interrupts
IPI4:         0          0          0          0       Timer broadcast interrupts
IPI5:       127         25         12         76       IRQ work interrupts
IPI6:         0          0          0          0       CPU wake-up interrupts
Err:          0

mariobalanica commented 5 months ago

GICv2 does have a standard extension for MSIs called "V2M", but RPi don't seem to have thought about implementing it. Instead there are two instances of a custom Broadcom MSI controller.

RP1 complicates things even further by using level-triggered interrupts for all peripherals other than XHCI (e.g. Ethernet), while MSIs are inherently edge-triggered. This of course requires special handling through yet another custom driver (mfd/rp1 in Linux).

markmi commented 5 months ago

I see one other edge (that may not matter):

164:          0          0          0          0  107d508500.gpio  20 Edge      pwr_button

markmi commented 5 months ago

Interesting: OpenSUSE Tumbleweed on RPi4B, via non-UEFI/ACPI vs. via EDK2 UEFI/ACPI:

 33:      43468          0          0          0  BRCM STB PCIe MSI 524288 Edge      xhci_hcd

vs.

 27:      68432          0          0          0     GICv2 175 Level     xhci-hcd:usb1

So it looks like the fdt based environment using MSI Edge for xhci_hcd is not new to the RPi5 of itself. Other aspects are new, of course. (Tumbleweed does not have synchronous exception issues with EDK2 use attempts, unlike Fedora 39.)

markmi commented 5 months ago

What looks different to me is that the RPi5 EDK2 is attempting to share one irq number across both xhci-hcd:usb1 and xhci-hcd:usb3. There is no analogous prior example of that for XHCI on the RPi*'s. (MSI style can not do such sharing, if I understand right.)

markmi commented 5 months ago

I looked at the Fedora USB3 media booting the HoneyComb (an EDK2 UEFI/ACPI boot context):

 99:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3 112 Level     xhci-hcd:usb1
100:      29542          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3 113 Level     xhci-hcd:usb3

So: separate irq numbers for the separate xhci-hcd:usb*'s.

mariobalanica commented 5 months ago

Edge interrupts can't be shared, correct. However, we're using a PCIe legacy "wire" IRQ, which is still signaled through MSIs under the hood, but must have proper level trigger behavior.

Sharing is not the issue here, since in my FDT testing above I had exposed only one of the controllers and there was still corruption.

RPi 4 does the same, but there's only one XHCI device connected to its bridge, and that's also a proper PCI device, not the RP1 contraption that we're dealing with here.

markmi commented 5 months ago

For the likes of the note "USB boot in ACPI mode is NOT recommended due to stability issues": as far as I can tell all USB use from an ACPI based boot is subject to the corruptions. So booting via a microSD card and then plugging in and using USB storage media is going to end up with the USB storage media corruption problems too.

markmi commented 5 months ago

So, let's see if I've finally gathered this correctly. Issues such as ( quoting rp1-peripherals.pdf ):

If any level-based interrupts are required, then the interrupt-to-message
translation block (see PCIE MSIn_CFG section) must enable the IACK
mechanism to properly sequence software through the Pending, Active,
and EOI states. Interrupts may be missed by the host processor if this
feature is not used.

lead to needing to use what was declared in EDK2 via:

#define RP1_PCIE_MSIX_CFG_IACK_EN                   BIT3
#define RP1_PCIE_MSIX_CFG_IACK                      BIT2

but these are currently unused. It may be unclear how to structure things so that RP1_PCIE_MSIX_CFG_IACK is used when it needs to be for the purpose indicated. (This may not be all there is to the problem.)
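
A minimal sketch of how those two bits might be used, assuming a simplified model in which each MSIn_CFG register is an ordinary read/write 32-bit word. The helper names, the read-modify-write access, and the choice to acknowledge after the handler has run are assumptions for illustration only. They are not taken from the EDK2 sources, and the real hardware may need a different access method or register layout.

```c
#include <stdint.h>

/* Bit values matching the EDK2 declarations quoted above. */
#ifndef BIT2
#define BIT2 0x00000004u
#endif
#ifndef BIT3
#define BIT3 0x00000008u
#endif
#define RP1_PCIE_MSIX_CFG_IACK_EN  BIT3
#define RP1_PCIE_MSIX_CFG_IACK     BIT2

/*
 * Hypothetical helper: turn on the IACK mechanism for one interrupt's
 * MSIn_CFG register. Intended for level-based interrupt sources.
 * 'cfg' is assumed to point at that register (or at a model of it).
 */
static inline void rp1_msix_cfg_enable_iack(volatile uint32_t *cfg)
{
    *cfg |= RP1_PCIE_MSIX_CFG_IACK_EN;
}

/*
 * Hypothetical helper: acknowledge the interrupt once software has
 * finished servicing it, so the translation block can move on from
 * the Active state. Assumed to be called after the handler returns.
 */
static inline void rp1_msix_cfg_ack(volatile uint32_t *cfg)
{
    *cfg |= RP1_PCIE_MSIX_CFG_IACK;
}
```

Under these assumptions, a level-based source would get rp1_msix_cfg_enable_iack() once at setup and rp1_msix_cfg_ack() after each serviced interrupt.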

This affects all USB use via involvement of the 2 XHCI's.

Sound like a reasonable approximation to the basic issue?

mariobalanica commented 3 months ago

Fixed by https://github.com/worproject/edk2-platforms/commit/598d38a60911c53fc2d4068374560f11689f5e42

It was basically DMA corruption, occurring when the OS decided to allocate buffers right below 4 GB, where the VPU firmware allocated RP1's BARs. This part of memory is automatically reserved now.

The difference with FDT that I failed to notice was that it maps the inbound (system RAM) window way above the outbound one (where device BARs reside), while the VPU firmware maps inbound 1:1, overlapping 32-bit BARs. The former approach allows access to all RAM, but sadly ACPI OSes have poor / no support for DMA translation, so we have to take away a bit of usable RAM.
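
The overlap described above can be illustrated with a small range check. This is only a sketch: the type and function names are made up for illustration, and no actual BAR addresses or sizes are assumed (the caller supplies them). With a 1:1 inbound mapping, a DMA buffer whose address range intersects the BAR window cannot be treated as plain RAM from the device's side, which is why that part of memory has to be reserved.

```c
#include <stdbool.h>
#include <stdint.h>

/* Half-open physical address range [base, base + size). */
typedef struct {
    uint64_t base;
    uint64_t size;
} phys_range_t;

/* Return true if the two ranges share at least one byte.
 * Written with subtractions so that base + size cannot overflow. */
static bool ranges_overlap(phys_range_t a, phys_range_t b)
{
    if (a.size == 0 || b.size == 0)
        return false;
    if (a.base >= b.base)
        return a.base - b.base < b.size;
    return b.base - a.base < a.size;
}

/* Hypothetical policy check: a DMA buffer is only usable under a 1:1
 * inbound mapping if it stays clear of the window holding device BARs. */
static bool dma_buffer_is_safe(phys_range_t buffer, phys_range_t bar_window)
{
    return !ranges_overlap(buffer, bar_window);
}
```

Reserving the BAR window in the memory map has the same effect as making dma_buffer_is_safe() true for every buffer the OS can allocate, at the cost of a small amount of usable RAM.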

markmi commented 3 months ago

Looks like the release build with the fixed material has not been started yet.

mariobalanica commented 3 months ago

https://github.com/worproject/rpi5-uefi/actions/runs/8290667058

Gonna release a new version later today.

markmi commented 3 months ago

Yep, I saw that but also that the existing https://github.com/worproject/rpi5-uefi/actions/workflows/release.yml runs are 2 months old. Good to hear that another run will be started today sometime (in some timezone's day). I'm looking forward to downloading the release and putting it to use.

markmi commented 3 months ago

When I downloaded the release artifact by clicking on the https://github.com/worproject/rpi5-uefi/actions/runs/8290667058/artifacts/1328225071 link, it produced a RPi5_UEFI_Release_37e546a.zip that contained a file of that same name. That nested .zip in turn contained the 3 files of interest. Looks like using the link automatically added a wrapper .zip .

markmi commented 3 months ago

FYI: My testing of FreeBSD building packages from ports has been going for over 8 hrs and has not hit any problems so far. (But it is based on some personal kernel/world builds, not official ones.) I did make some non-default EDK2 configuration selections for this experimentation.

mariobalanica commented 3 months ago

New release is available.

> When I downloaded the release artifact by clicking on the https://github.com/worproject/rpi5-uefi/actions/runs/8290667058/artifacts/1328225071 link, it produced a RPi5_UEFI_Release_37e546a.zip that contained a file of that same name. That nested .zip in turn contained the 3 files of interest. Looks like using the link automatically added a wrapper .zip .

Yeah, that was annoying. Fixed it.

> FYI: My testing of FreeBSD building packages from ports has been going for over 8 hrs and has not hit any problems so far. (But it is based on some personal kernel/world builds, not official ones.) I did make some non-default EDK2 configuration selections for this experimentation.

Great to hear, thanks for testing!
