Open akgnah opened 1 month ago
Indeed! I have one of these boards to test—I think there may have been some concerns over the interfaces working correctly on the Pi 5, and I'm planning to test it soon!
Important note: The version I have for testing is a PROTOTYPE, meaning there are some known issues. So take anything you read in this thread until the point Radxa releases a Pi 5-compatible version (if/when that happens) with a grain of salt!
On the site here: https://pipci.jeffgeerling.com/hats/radxa-dual-2.5g-router.html
For now I'm listing it as 'prototype' since I don't see it available for sale anywhere yet.
Just booting it the first time, a few observations:
lspci is empty on first boot, though I have an older EEPROM on this Pi. Updating and testing again...
After adding the following to /boot/firmware/config.txt:
dtparam=pciex1
dtparam=pciex1_gen=3
And then rebooting, I get:
pi@pi5:~ $ lspci
0000:00:00.0 PCI bridge: Broadcom Inc. and subsidiaries Device 2712 (rev 21)
0000:01:00.0 PCI bridge: ASMedia Technology Inc. Device 2806 (rev 01)
0000:02:00.0 PCI bridge: ASMedia Technology Inc. Device 2806 (rev 01)
0000:02:02.0 PCI bridge: ASMedia Technology Inc. Device 2806 (rev 01)
0000:02:06.0 PCI bridge: ASMedia Technology Inc. Device 2806 (rev 01)
0000:02:0e.0 PCI bridge: ASMedia Technology Inc. Device 2806 (rev 01)
0000:03:00.0 Non-Volatile memory controller: Phison Electronics Corporation PS5013 E13 NVMe Controller (rev 01)
0000:05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 05)
0000:06:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 05)
0001:00:00.0 PCI bridge: Broadcom Inc. and subsidiaries Device 2712 (rev 21)
0001:01:00.0 Ethernet controller: Device 1de4:0001
I don't see the NVMe drive with lsblk though, so I wonder if I'm getting an error like I was with boot-behind-a-PCIe-switch... checking on that now.
I tried adding dtparam=pciex1_no_l0s=on to /boot/firmware/config.txt, and ... it says it's enabling the device in dmesg:
pi@pi5:~ $ dmesg | grep nvme
[ 2.623758] nvme nvme0: pci function 0000:03:00.0
[ 2.628787] nvme 0000:03:00.0: enabling device (0000 -> 0002)
But nothing in lsblk. Not sure why the NVMe device is not showing. I might also try it on a different board through the included Pi 5-style FPC.
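The state described above (the drive is listed by lspci, but no block device appears) can be checked from sysfs. The following is only a minimal sketch: it assumes the standard Linux sysfs layout under /sys/bus/pci/devices, and the helper names `is_nvme_class` and `find_nvme_functions` are made up for illustration. It lists every PCI function whose class code marks it as NVMe and reports whether a driver is bound to it.

```python
from pathlib import Path

# PCI class 0x01 (mass storage), subclass 0x08 (non-volatile memory).
NVME_CLASS_PREFIX = "0x0108"


def is_nvme_class(class_text):
    """Return True if a sysfs 'class' file value denotes an NVMe function."""
    return class_text.strip().lower().startswith(NVME_CLASS_PREFIX)


def find_nvme_functions(root="/sys/bus/pci/devices"):
    """Yield (name, driver_bound) for each NVMe-class PCI function under root.

    driver_bound is True when the device directory contains a 'driver'
    entry, which sysfs creates once a driver has bound to the device.
    """
    base = Path(root)
    if not base.is_dir():
        return
    for dev in sorted(base.iterdir()):
        class_file = dev / "class"
        if class_file.is_file() and is_nvme_class(class_file.read_text()):
            yield dev.name, (dev / "driver").exists()


if __name__ == "__main__":
    for name, bound in find_nvme_functions():
        state = "driver bound" if bound else "no driver bound"
        print(f"{name}: {state}")
```

A function that shows up here without a bound driver would be consistent with a drive that enumerates on the bus but never gets a block device.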
Also, testing one of the two 2.5 Gbps ports, I get lights when I plug it in, I get an IP address, and here are the port details:
pi@pi5:~ $ dmesg | grep r8169
[ 3.899196] r8169 0000:05:00.0: enabling device (0000 -> 0002)
[ 3.950971] r8169 0000:05:00.0 eth1: RTL8125B, 00:e0:4c:68:00:03, XID 641, IRQ 173
[ 3.959248] r8169 0000:05:00.0 eth1: jumbo features [frames: 9194 bytes, tx checksumming: ko]
[ 3.968735] r8169 0000:06:00.0: enabling device (0000 -> 0002)
[ 3.997217] r8169 0000:06:00.0 eth2: RTL8125B, 00:e0:4c:68:00:04, XID 641, IRQ 174
[ 4.005239] r8169 0000:06:00.0 eth2: jumbo features [frames: 9194 bytes, tx checksumming: ko]
[ 7.269731] RTL8226B_RTL8221B 2.5Gbps PHY r8169-0-500:00: attached PHY driver (mii_bus:phy_addr=r8169-0-500:00, irq=MAC)
[ 7.437581] r8169 0000:05:00.0 eth1: Link is Down
[ 7.461697] RTL8226B_RTL8221B 2.5Gbps PHY r8169-0-600:00: attached PHY driver (mii_bus:phy_addr=r8169-0-600:00, irq=MAC)
[ 7.650703] r8169 0000:06:00.0 eth2: Link is Down
[ 10.212037] r8169 0000:05:00.0 eth1: Link is Up - 2.5Gbps/Full - flow control rx/tx
Settings for eth1:
Supported ports: [ TP MII ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
2500baseT/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: Yes
Supported FEC modes: Not reported
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
2500baseT/Full
Advertised pause frame use: Symmetric Receive-only
Advertised auto-negotiation: Yes
Advertised FEC modes: Not reported
Link partner advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
2500baseT/Full
Link partner advertised pause frame use: Symmetric Receive-only
Link partner advertised auto-negotiation: Yes
Link partner advertised FEC modes: Not reported
Speed: 2500Mb/s
Duplex: Full
Auto-negotiation: on
master-slave cfg: preferred slave
master-slave status: slave
Port: Twisted Pair
PHYAD: 0
Transceiver: external
MDI-X: Unknown
Supports Wake-on: pumbg
Wake-on: d
Link detected: yes
Running a speed test with iperf3:
pi@pi5:~ $ iperf3 -c 10.0.2.15
Connecting to host 10.0.2.15, port 5201
[ 5] local 10.0.2.227 port 59240 connected to 10.0.2.15 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 263 MBytes 2.21 Gbits/sec 0 503 KBytes
[ 5] 1.00-2.00 sec 261 MBytes 2.19 Gbits/sec 0 503 KBytes
[ 5] 2.00-3.00 sec 262 MBytes 2.20 Gbits/sec 0 503 KBytes
[ 5] 3.00-4.00 sec 260 MBytes 2.18 Gbits/sec 0 503 KBytes
[ 5] 4.00-5.00 sec 261 MBytes 2.19 Gbits/sec 0 527 KBytes
[ 5] 5.00-6.00 sec 261 MBytes 2.19 Gbits/sec 0 527 KBytes
[ 5] 6.00-7.00 sec 260 MBytes 2.18 Gbits/sec 0 527 KBytes
[ 5] 7.00-8.00 sec 260 MBytes 2.19 Gbits/sec 0 527 KBytes
[ 5] 8.00-9.00 sec 260 MBytes 2.18 Gbits/sec 0 527 KBytes
[ 5] 9.00-10.00 sec 262 MBytes 2.19 Gbits/sec 0 700 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 2.55 GBytes 2.19 Gbits/sec 0 sender
[ 5] 0.00-10.00 sec 2.55 GBytes 2.19 Gbits/sec receiver
pi@pi5:~ $ iperf3 -c 10.0.2.15 --reverse
Connecting to host 10.0.2.15, port 5201
Reverse mode, remote host 10.0.2.15 is sending
[ 5] local 10.0.2.227 port 33742 connected to 10.0.2.15 port 5201
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 168 MBytes 1.41 Gbits/sec
[ 5] 1.00-2.00 sec 172 MBytes 1.44 Gbits/sec
[ 5] 2.00-3.00 sec 176 MBytes 1.48 Gbits/sec
[ 5] 3.00-4.00 sec 173 MBytes 1.45 Gbits/sec
[ 5] 4.00-5.00 sec 172 MBytes 1.45 Gbits/sec
[ 5] 5.00-6.00 sec 167 MBytes 1.40 Gbits/sec
[ 5] 6.00-7.00 sec 158 MBytes 1.32 Gbits/sec
[ 5] 7.00-8.00 sec 165 MBytes 1.38 Gbits/sec
[ 5] 8.00-9.00 sec 168 MBytes 1.41 Gbits/sec
[ 5] 9.00-10.00 sec 152 MBytes 1.28 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.00 sec 1.63 GBytes 1.40 Gbits/sec sender
[ 5] 0.00-10.00 sec 1.63 GBytes 1.40 Gbits/sec receiver
pi@pi5:~ $ iperf3 -c 10.0.2.15 --bidir
Connecting to host 10.0.2.15, port 5201
[ 5] local 10.0.2.227 port 51256 connected to 10.0.2.15 port 5201
[ 7] local 10.0.2.227 port 51260 connected to 10.0.2.15 port 5201
[ ID][Role] Interval Transfer Bitrate Retr Cwnd
[ 5][TX-C] 0.00-1.00 sec 254 MBytes 2.13 Gbits/sec 0 516 KBytes
[ 7][RX-C] 0.00-1.00 sec 49.9 MBytes 419 Mbits/sec
[ 5][TX-C] 1.00-2.00 sec 257 MBytes 2.15 Gbits/sec 0 540 KBytes
[ 7][RX-C] 1.00-2.00 sec 45.7 MBytes 384 Mbits/sec
[ 5][TX-C] 2.00-3.00 sec 257 MBytes 2.15 Gbits/sec 0 540 KBytes
[ 7][RX-C] 2.00-3.00 sec 45.7 MBytes 383 Mbits/sec
[ 5][TX-C] 3.00-4.00 sec 247 MBytes 2.07 Gbits/sec 0 672 KBytes
[ 7][RX-C] 3.00-4.00 sec 44.4 MBytes 372 Mbits/sec
[ 5][TX-C] 4.00-5.00 sec 245 MBytes 2.06 Gbits/sec 0 833 KBytes
[ 7][RX-C] 4.00-5.00 sec 36.3 MBytes 305 Mbits/sec
[ 5][TX-C] 5.00-6.00 sec 254 MBytes 2.13 Gbits/sec 0 888 KBytes
[ 7][RX-C] 5.00-6.00 sec 38.3 MBytes 321 Mbits/sec
[ 5][TX-C] 6.00-7.00 sec 252 MBytes 2.12 Gbits/sec 0 936 KBytes
[ 7][RX-C] 6.00-7.00 sec 34.9 MBytes 293 Mbits/sec
[ 5][TX-C] 7.00-8.00 sec 250 MBytes 2.10 Gbits/sec 0 1.01 MBytes
[ 7][RX-C] 7.00-8.00 sec 35.4 MBytes 297 Mbits/sec
[ 5][TX-C] 8.00-9.00 sec 256 MBytes 2.15 Gbits/sec 0 1.01 MBytes
[ 7][RX-C] 8.00-9.00 sec 38.2 MBytes 320 Mbits/sec
[ 5][TX-C] 9.00-10.00 sec 256 MBytes 2.15 Gbits/sec 0 1.01 MBytes
[ 7][RX-C] 9.00-10.00 sec 34.5 MBytes 289 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID][Role] Interval Transfer Bitrate Retr
[ 5][TX-C] 0.00-10.00 sec 2.47 GBytes 2.12 Gbits/sec 0 sender
[ 5][TX-C] 0.00-10.00 sec 2.47 GBytes 2.12 Gbits/sec receiver
[ 7][RX-C] 0.00-10.00 sec 403 MBytes 338 Mbits/sec sender
[ 7][RX-C] 0.00-10.00 sec 403 MBytes 338 Mbits/sec receiver
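To put the iperf3 summaries above in terms of the 2.5 Gbps link rate, a small parser can pull the bitrate out of each sender/receiver summary line. This is a rough sketch, not a general iperf3 parser: the regular expression only targets the text format shown above, and `parse_summary` and `link_utilization` are illustrative names.

```python
import re

# Matches the bitrate and role on an iperf3 summary line, for example:
# "[ 5] 0.00-10.00 sec 2.55 GBytes 2.19 Gbits/sec 0 sender"
SUMMARY = re.compile(r"([\d.]+)\s+([GM])bits/sec\b.*\b(sender|receiver)\s*$")


def parse_summary(line):
    """Return (role, gbits_per_sec) for a summary line, else None."""
    match = SUMMARY.search(line)
    if not match:
        return None
    value, unit, role = match.groups()
    scale = 1.0 if unit == "G" else 0.001  # convert Mbits/sec to Gbits/sec
    return role, float(value) * scale


def link_utilization(gbits, link_gbits=2.5):
    """Fraction of the nominal link rate that a measured bitrate represents."""
    return gbits / link_gbits


if __name__ == "__main__":
    lines = [
        "[ 5] 0.00-10.00 sec 2.55 GBytes 2.19 Gbits/sec 0 sender",
        "[ 5] 0.00-10.00 sec 1.63 GBytes 1.40 Gbits/sec receiver",
        "[ 7][RX-C] 0.00-10.00 sec 403 MBytes 338 Mbits/sec receiver",
    ]
    for line in lines:
        role, gbits = parse_summary(line)
        print(f"{role}: {gbits:.3f} Gbit/s, {link_utilization(gbits):.0%} of 2.5G")
```

Per-interval lines carry no sender/receiver suffix, so they are skipped and only the totals are reported.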
Regarding power draw, if I have the Pi powered off normally, with a base Pi setup and no tweaks, I see about 3W of power draw shut down.
Following my instructions to reduce poweroff power consumption on the Pi 5, I see it go down to 1.4W when it's shut down:
So the HAT tacks on about 1.4W of extra power draw.
If I unplug the NVMe drive, so that socket is empty on the top, the 2nd 2.5G Ethernet port isn't working. I see the 1st port, but port 2 doesn't seem to be showing up. If I swap my Ethernet cable from port 2 to port 1, port 1 lights up.
[ 3.759676] r8169 0000:05:00.0: enabling device (0000 -> 0002)
[ 3.819660] r8169 0000:05:00.0: error -EIO: PCI read failed
[ 3.826310] r8169: probe of 0000:05:00.0 failed with error -5
[ 3.833494] r8169 0000:06:00.0: enabling device (0000 -> 0002)
[ 3.836708] brcmstb-i2c 107d508200.i2c: @97500hz registered in interrupt mode
[ 3.852841] brcmstb-i2c 107d508280.i2c: @97500hz registered in interrupt mode
[ 3.864572] r8169 0000:06:00.0 eth1: RTL8125B, 00:e0:4c:68:00:04, XID 641, IRQ 173
[ 3.872764] r8169 0000:06:00.0 eth1: jumbo features [frames: 9194 bytes, tx checksumming: ko]
When I put the NVMe drive back in place, both 2.5G Ethernet ports work. Odd!
I also tested a Pineboards HatDrive! M.2 HAT plugged into the FFC on the board, and it didn't light up the PWR LED at all, nor did the drive show up at all under lspci.
I've also tried using PCIe Gen 2, and the NVMe attached to the board, and it still shows up with lspci, but doesn't list under lsblk.
Even turning off ASPM (with pcie_aspm=off inside /boot/firmware/cmdline.txt), it doesn't mount the drive.
Testing with an Inland 256GB 2280 NVMe SSD, I see it with lspci:
0000:03:00.0 Non-Volatile memory controller: Silicon Motion, Inc. SM2263EN/SM2263XT SSD Controller (rev 03)
But again, no lsblk. And again, it says enabling device, but no more past that:
pi@pi5:~ $ dmesg | grep nvme
[ 2.569973] nvme nvme0: pci function 0000:03:00.0
[ 2.575002] nvme 0000:03:00.0: enabling device (0000 -> 0002)
If I unplug the NVMe drive, so that socket is empty on the top, the 2nd 2.5G Ethernet port isn't working. I see the 1st port, but port 2 doesn't seem to be showing up. If I swap my Ethernet cable from port 2 to port 1, port 1 lights up.
[ 3.759676] r8169 0000:05:00.0: enabling device (0000 -> 0002)
[ 3.819660] r8169 0000:05:00.0: error -EIO: PCI read failed
[ 3.826310] r8169: probe of 0000:05:00.0 failed with error -5
[ 3.833494] r8169 0000:06:00.0: enabling device (0000 -> 0002)
[ 3.836708] brcmstb-i2c 107d508200.i2c: @97500hz registered in interrupt mode
[ 3.852841] brcmstb-i2c 107d508280.i2c: @97500hz registered in interrupt mode
[ 3.864572] r8169 0000:06:00.0 eth1: RTL8125B, 00:e0:4c:68:00:04, XID 641, IRQ 173
[ 3.872764] r8169 0000:06:00.0 eth1: jumbo features [frames: 9194 bytes, tx checksumming: ko]
When I put the NVMe drive back in place, both 2.5G Ethernet ports work. Odd!
Odd! I kept the NVMe drive plugged in during the test, so I didn't notice the problem.
I also tested a Pineboards HatDrive! M.2 HAT plugged into the FFC on the board, and it didn't light up the PWR LED at all, nor did the drive show up at all under lspci.
In the current version, x1.0 (x for evaluation), the FFC seat is incorrectly rotated 180 degrees.
If you have a heat gun, you can remove it and re-solder it. We can also send you a corrected board.
I've also tried using PCIe Gen 2, and the NVMe attached to the board, and it still shows up with lspci, but doesn't list under lsblk.
The NVMe device shows up in lspci but not in lsblk, which has been troubling me. I've done a lot of searching and read many posts on Raspberry Pi forums, but nothing has helped.
But this HAT works well on Radxa's boards, such as the ROCK 5C, so I feel like it might not be a hardware issue.
@akgnah - Yeah, it's definitely an odd situation. Usually there's at least an error message in dmesg, but here... nothing! I wonder if the switch is throwing something in the Pi for a loop; this is the first time I've tried a Gen 3 switch (even at Gen 2 speed) on a Pi at all.
If anyone had access to a PCIe debugger that could probably help, but I am far away from being able to have one of those :D
Regarding the FPC, I may take a shot at reversing it, I finally have my hot air station set up and got power to it yesterday. I'll maybe have a go next week.
To be clear, if I remove it from the board, rotate the entire connector 180°, and re-solder it there, it should be wired up correctly? Thanks for the help!
To be clear, if I remove it from the board, rotate the entire connector 180°, and re-solder it there, it should be wired up correctly?
Yes, if it's rotated 180 degrees, its signal is correct.
I connected an M.2 HAT to the corrected connector, but the NVMe device still only appears in lspci.
This might be something worth opening an issue for in the https://github.com/raspberrypi/firmware repo—it seems like it could be related to https://github.com/raspberrypi/firmware/issues/1833
Hi! Does the NIC functionality work without external power? A Raspberry Pi 5 with dual 2.5gbps would be an amazing OPNSense box, and if a low profile hat could provide that, it would be a dream come true.
I have been testing powering the Pi via the barrel jack on the HAT, so there's still just one power connection.
I'd be curious to see if nvme list (from the nvme-cli package) shows anything. If /dev/nvme0 shows up, you should be able to try some of the following:
nvme id-ctrl -H /dev/nvme0
nvme list-ns
nvme id-ns -H /dev/nvme0 -n 1
Most drives don't support more than one namespace, but perhaps that one namespace is missing or somehow not detected by the kernel?
@DanaGoyette
If you are interested in troubleshooting this, we can send you a sample to have some debugging fun. Please send me an email (tom@radxa.com) if you are interested.
I speculate that this is a compatibility issue with the ASMedia chip. When testing the Intel Optane SSD, I found that the Intel Optane SSD works fine directly on Raspberry Pi 5, but after passing through the ASM1184E bridge chip, it cannot function properly. It cannot be detected in lsblk, but other regular NVMe drives have no issues.
This board seems to use the ASM2806 as the main chip, and there may be some compatibility bugs.
Further testing is needed to find out. It is not sold on the official website, and it seems there are no other channels to purchase it from.
@tltangliang - Yeah, it sounds like Radxa are not wanting to release the board until they are certain it can be made to work (which I'm 100% in support of!), so this issue is currently more of a 'debug party' and I will continue to plug away as I can.
The FFC seat supplies nearly 5 volts, but no accessory is detected.
I tested the M.2 slot with the Realtek R8125 driver and a WD Blue SN570 NVMe
07: PCI 300.0: 0108 Non-Volatile memory controller (NVM Express)
  [Created at pci.386]
  Unique ID: svHJ.BG8T6CV0dd8
  Parent ID: B35A.lPUYI3PvRI6
  SysFS ID: /devices/platform/axi/1000110000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0/0000:02:00.0/0000:03:00.0
  SysFS BusID: 0000:03:00.0
  Hardware Class: storage
  Model: "Sandisk Non-Volatile memory controller"
  Vendor: pci 0x15b7 "Sandisk Corp"
  Device: pci 0x501a
  SubVendor: pci 0x15b7 "Sandisk Corp"
  SubDevice: pci 0x501a
  Memory Range: 0x1b00000000-0x1b00003fff (rw,non-prefetchable)
  Memory Range: 0x1b00004000-0x1b000040ff (rw,non-prefetchable)
  IRQ: 38 (no events)
  Module Alias: "pci:v000015B7d0000501Asv000015B7sd0000501Abc01sc08i02"
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #9 (PCI bridge)
and a Samsung NVMe 980
07: PCI 300.0: 0108 Non-Volatile memory controller (NVM Express)
  [Created at pci.386]
  Unique ID: svHJ.hLDNcFqboN7
  Parent ID: B35A.lPUYI3PvRI6
  SysFS ID: /devices/platform/axi/1000110000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0/0000:02:00.0/0000:03:00.0
  SysFS BusID: 0000:03:00.0
  Hardware Class: storage
  Model: "Samsung Electronics Non-Volatile memory controller"
  Vendor: pci 0x144d "Samsung Electronics Co Ltd"
  Device: pci 0xa809
  SubVendor: pci 0x144d "Samsung Electronics Co Ltd"
  SubDevice: pci 0xa801
  Memory Range: 0x1b00000000-0x1b00003fff (rw,non-prefetchable)
  IRQ: 38 (no events)
  Module Alias: "pci:v0000144Dd0000A809sv0000144Dsd0000A801bc01sc08i02"
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #9 (PCI bridge)
but nothing changed.
some output from hwinfo:
09: PCI 200.0: 0604 PCI bridge (Normal decode)
  [Created at pci.386]
  Unique ID: B35A.lPUYI3PvRI6
  Parent ID: VCu0.lPUYI3PvRI6
  SysFS ID: /devices/platform/axi/1000110000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0/0000:02:00.0
  SysFS BusID: 0000:02:00.0
  Hardware Class: bridge
  Model: "ASMedia PCI bridge"
  Vendor: pci 0x1b21 "ASMedia Technology Inc."
  Device: pci 0x2806
  Revision: 0x01
  Driver: "pcieport"
  IRQ: 39 (no events)
  Module Alias: "pci:v00001B21d00002806sv00000000sd00000000bc06sc04i00"
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #10 (PCI bridge)
and pcicrawler
00:00.0 root_port, speed 8GT/s, width x1
 └─01:00.0 upstream_port, ASMedia Technology Inc. (1b21), device 2806
    ├─02:00.0 downstream_port, slot 0, device present, speed 8GT/s, width x1
    │  └─03:00.0 endpoint, Samsung Electronics Co Ltd (144d) NVMe SSD Controller 980 (a809)
    ├─02:02.0 downstream_port, slot 0, speed 2.5GT/s, width x1
    ├─02:06.0 downstream_port, slot 0, device present, speed 5GT/s, width x1
    │  └─05:00.0 endpoint, Realtek Semiconductor Co., Ltd. (10ec) RTL8125 2.5GbE Controller (8125)
    └─02:0e.0 downstream_port, slot 0, device present, speed 5GT/s, width x1
       └─06:00.0 endpoint, Realtek Semiconductor Co., Ltd. (10ec) RTL8125 2.5GbE Controller (8125)
0001:00:00.0 root_port, speed 5GT/s, width x4
 └─0001:01:00.0 endpoint, 1de4:0001
I didn't find any kernel modules or drivers for the ASM2806. I compiled my own kernel with all PCIe and NVMe options on, but nothing changed. I also think there is a problem with the compatibility of the ASM2806.
The dual 2.5G HAT and NVMe SSD work without any issue on the ROCK 5C and 5A. We are out of clues here; maybe this is related?
I do think that has something to do with it, but I'm staring at the HAT right now sitting there tempting me to test it again... I just got back from two weeks of travel, so I will try to pick it up again soon—but alternatively, if there's someone else who I could send my prototype board to, and they could do more PCIe debugging, I'd be willing to do that!
I may not have time for a week or two to pick it up again.
@geerlingguy check the revision of the ASMedia chip itself (date code). Some older revisions are not playing well with the Pi 5.
@mikegapinski
From lspci:
0000:01:00.0 PCI bridge: ASMedia Technology Inc. Device 2806 (rev 01)
And on the chip itself:
ASM2806 B3NG7727A1 2022
Also @DanaGoyette I ran the nvme commands:
pi@pi5:~ $ sudo apt install -y nvme-cli
pi@pi5:~ $ nvme id-ctrl -H /dev/nvme0
/dev/nvme0: No such file or directory
pi@pi5:~ $ nvme list-ns
pi@pi5:~ $ nvme id-ns -H /dev/nvme0 -n 1
... same ...
The NVMe SSD I'm testing with uses a Silicon Motion controller SM2263EN/SM2263XT:
0000:03:00.0 Non-Volatile memory controller: Silicon Motion, Inc. SM2263EN/SM2263XT SSD Controller (rev 03) (prog-if 02 [NVM Express])
Subsystem: Silicon Motion, Inc. SM2263EN/SM2263XT (DRAM-less) NVMe SSD Controllers
Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 38
Region 0: Memory at 1b00000000 (64-bit, non-prefetchable) [size=16K]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI: Enable- Count=1/8 Maskable+ 64bit+
Address: 0000000000000000 Data: 0000
Masking: 00000000 Pending: 00000000
Capabilities: [70] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0W
DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop- FLReset-
MaxPayload 256 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <8us
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 8GT/s, Width x1 (downgraded)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+
10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS- TPHComp- ExtTPHComp-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR+ 10BitTagReq- OBFF Disabled,
AtomicOpsCtl: ReqEn-
LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ EqualizationPhase1+
EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
Retimer- 2Retimers- CrosslinkRes: unsupported
Capabilities: [b0] MSI-X: Enable- Count=16 Masked-
Vector table: BAR=0 offset=00002000
PBA: BAR=0 offset=00002100
Capabilities: [100 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 00000000 00000000 00000000 00000000
Capabilities: [158 v1] Secondary PCI Express
LnkCtl3: LnkEquIntrruptEn- PerformEqu-
LaneErrStat: 0
Capabilities: [178 v1] Latency Tolerance Reporting
Max snoop latency: 0ns
Max no snoop latency: 0ns
Capabilities: [180 v1] L1 PM Substates
L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
PortCommonModeRestoreTime=10us PortTPowerOnTime=10us
L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
T_CommonMode=0us LTR1.2_Threshold=0ns
L1SubCtl2: T_PwrOn=10us
Do you want me to run this with the drive directly attached to a Pi instead? It's not mounting the device anywhere.
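The LnkCap and LnkSta lines in the lspci output above show a device capable of x4 that has trained at x1. As a convenience when comparing several drives, a short script can pull those two lines out of saved `lspci -vv` text. This is only a sketch under the assumption that the output follows the format shown above; `parse_links` and `is_downgraded` are hypothetical helper names.

```python
import re

# Matches only the "LnkCap:" and "LnkSta:" lines (not LnkCap2 or LnkSta2),
# capturing the speed in GT/s and the lane width.
LINK = re.compile(r"(LnkCap|LnkSta):.*?Speed ([\d.]+)GT/s.*?Width x(\d+)")


def parse_links(text):
    """Return {'LnkCap': (speed, width), 'LnkSta': (speed, width)} if present."""
    result = {}
    for line in text.splitlines():
        match = LINK.search(line)
        if match:
            result[match.group(1)] = (float(match.group(2)), int(match.group(3)))
    return result


def is_downgraded(links):
    """True if the negotiated link is slower or narrower than the capability."""
    cap_speed, cap_width = links["LnkCap"]
    sta_speed, sta_width = links["LnkSta"]
    return sta_speed < cap_speed or sta_width < cap_width


if __name__ == "__main__":
    sample = """
    LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <8us
    LnkSta: Speed 8GT/s, Width x1 (downgraded)
    LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
    """
    links = parse_links(sample)
    print(links, "downgraded:", is_downgraded(links))
```

Running the same check on output captured with the drive attached directly to the Pi would make the two configurations easy to compare side by side.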
I've also tried using various combinations of PCIe compatibility modes:
# In /boot/firmware/cmdline.txt
pcie_aspm=off
# In /boot/firmware/config.txt
dtoverlay=pineboards-hat-ai
# or
dtoverlay=pciex1-compat-pi5,no-mip
# or
dtparam=pciex1_gen=1
Chips made in 2024 (Q2+) are OK.
I don't know if it applies to the Gen 3 switch as well, but some changes were made to the Gen 2 ones recently.
If the drive you have doesn't work without the switch (on a HatDrive Bottom V3 or newer) it won't work.
On your lspci it seems like the endpoint is detected but the NVMe driver is not loaded. That's why you can't write to the drive 😅
If it does work standalone, it is possible that an updated PCIe switch solves everything for you. I have devices that only work on current switches regardless of what overlays are used. It's a mess overall; thankfully we've caught that in time. Dealing with Chinese chip makers is not always straightforward.
Wasn't sure about sharing that information but if our competitors are reading this: Source your components directly or ask for a photo of the reel 😅
Source your components directly or ask for a photo of the reel 😅
Ha, I'm guessing otherwise they might ship some old stock to clear it out! After all, on some systems and in many use cases, it's not a big deal and won't cause problems. I know I've seen a lot of USB expansion boards using the ASM2806.
The drive does work direct, or actually also with the ASM1182e in the Geekworm X1011 — see https://github.com/geerlingguy/raspberry-pi-pcie-devices/issues/618#issuecomment-2087888943
It's just the mix of specific devices with specific revisions on specific firmware. Everyone is improving constantly and you can't exactly OTA internal firmware / silicon changes.
Generally speaking manufacturers don't have old inventory, you're ordering from current batches usually but MOQs are higher than when you go through a reseller (and they tend to have older revisions in stock)
I have a Rev 1.1 of this board. Changes appear to be the FPC connector fix and a copper heatsink on the switch.
I've only begun setting mine up but thought I would post my initial troubleshooting. All testing thus far has been with an updated version of Bookworm OS with a bootloader dated Wed 5 Jun 15:41:49 UTC 2024 (1717602109). Testing was done using gen3 speeds in config.txt.
I get the below output from lspci as others have already noted
0000:00:00.0 PCI bridge: Broadcom Inc. and subsidiaries BCM2712 PCIe Bridge (rev 21)
0000:01:00.0 PCI bridge: ASMedia Technology Inc. ASM2806 4-Port PCIe x2 Gen3 Packet Switch (rev 01)
0000:02:00.0 PCI bridge: ASMedia Technology Inc. ASM2806 4-Port PCIe x2 Gen3 Packet Switch (rev 01)
0000:02:02.0 PCI bridge: ASMedia Technology Inc. ASM2806 4-Port PCIe x2 Gen3 Packet Switch (rev 01)
0000:02:06.0 PCI bridge: ASMedia Technology Inc. ASM2806 4-Port PCIe x2 Gen3 Packet Switch (rev 01)
0000:02:0e.0 PCI bridge: ASMedia Technology Inc. ASM2806 4-Port PCIe x2 Gen3 Packet Switch (rev 01)
0000:05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 05)
0000:06:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 05)
0001:00:00.0 PCI bridge: Broadcom Inc. and subsidiaries BCM2712 PCIe Bridge (rev 21)
0001:01:00.0 Ethernet controller: Raspberry Pi Ltd RP1 PCIe 2.0 South Bridge
2.5 GB Adapters
Only 1 functional adapter using the r8169 driver. I get the same issue as @geerlingguy mentions above with the 2nd adapter
[ 3.488345] r8169: probe of 0000:05:00.0 failed with error -5
I get this error regardless if I have the M.2 slot populated. So far I've only tested one M.2 drive which may just be incompatible?
Speed of 2.5gb connection is good on the working adapter and showing 2.5gb link speed.
Settings for eth1:
Supported ports: [ TP MII ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
2500baseT/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: Yes
Supported FEC modes: Not reported
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
2500baseT/Full
Advertised pause frame use: Symmetric Receive-only
Advertised auto-negotiation: Yes
Advertised FEC modes: Not reported
Link partner advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
2500baseT/Full
Link partner advertised pause frame use: Symmetric Receive-only
Link partner advertised auto-negotiation: Yes
Link partner advertised FEC modes: Not reported
Speed: 2500Mb/s
Duplex: Full
Auto-negotiation: on
master-slave cfg: preferred slave
master-slave status: slave
Port: Twisted Pair
PHYAD: 0
Transceiver: external
MDI-X: Unknown
Supports Wake-on: pumbg
Wake-on: d
Link detected: yes
I hope to test the r8125 driver vs the r8169 to see if there are differences there.
M.2 Drive
Only tested a Samsung SM961 which failed to appear in any log. I have verified I could see the drive on another PC. This drive didn't appear to be recognized and wasn't shown in lspci output. I plan to test additional drives in the near future. I suspect this is the reason I can't use the 2nd 2.5gb ethernet as it appears to be linked somehow to the M.2 slot based on previous finding.
FPC Connection
Untested at this point but appears to be "fixed" as the connector is reversed from other images.
Hopefully more to follow in the coming days.
I've attempted to use the r8125 driver and found some interesting observations....
Not sure if I missed this above, and apologies if so, but after some more testing I found that I can't get both ethernet ports to work regardless of M.2 if using Gen3. If I drop to Gen2 then it's possible to get both ethernets IF I have a M.2 drive installed.
Below is a table that is the results of getting both 2.5g ethernet ports enabled. This goes for both the r8125 and r8169 drivers.
| | Gen 2 | Gen 3 |
|---|---|---|
| M.2 Not Installed | NO | NO |
| M.2 Installed | YES | YES* |
* Edit: Gen 3 works when using an Sk Hynix PC711 drive but did not work when using a Samsung SM961.
The behavior for when it doesn't work is the same except the drivers behave differently.
r8169: behavior from NO cells in table.
Only one of the 2.5g Ethernet interfaces works, while the other one gets the error:
[ 3.488345] r8169: probe of 0000:05:00.0 failed with error -5
The working interface seems to behave normally as a 2.5gb NIC. M.2 is not detected in Gen 3.
M.2 is detected in Gen 2, but beyond being detected it is not usable. As mentioned in previous comments, it is not listed by lsblk.
r8125: behavior from NO cells in table.
Neither 2.5g Ethernet is available.
[ 5.868162] r8125: loading out-of-tree module taints kernel.
[ 5.871554] r8125 Ethernet controller driver 9.013.02-NAPI loaded
[ 5.871662] r8125 0000:05:00.0: enabling device (0000 -> 0002)
[ 5.923551] unknown chip version (7c800000)
That is the only thing in the logs and no interface becomes available to use. Behavior of M.2 is the same as with the r8169 driver.
The output from the YES cell is
[ 5.647918] r8125: loading out-of-tree module taints kernel.
[ 5.676861] r8125 Ethernet controller driver 9.013.02-NAPI loaded
[ 5.676938] r8125 0000:05:00.0: enabling device (0000 -> 0002)
[ 5.709959] r8125: This product is covered by one or more of the following patents: US6,570,884, US6,115,776, and US6,327,625.
[ 5.712052] r8125 Copyright (C) 2024 Realtek NIC software team <nicfae@realtek.com>
[ 5.712155] r8125 Ethernet controller driver 9.013.02-NAPI loaded
[ 5.712223] r8125 0000:06:00.0: enabling device (0000 -> 0002)
[ 5.764709] r8125: This product is covered by one or more of the following patents: US6,570,884, US6,115,776, and US6,327,625.
[ 5.766734] r8125 Copyright (C) 2024 Realtek NIC software team <nicfae@realtek.com>
I'm not sure what to make of all of this. It seems that some of you have both NICS working with Gen3 and an M.2 but so far that has not been the case for me with the Samsung SM961. I might try a different drive in a bit to see if it changes the above results in Gen3.
Is it possible the IPEX to FPC Cable is the issue here? I received 2 and haven't tried the 2nd one but that's honestly a bit of a stab in the dark at this point.
Apologies for the update spamming. I'm using this partially to document my progress as I work through this and if anyone has issues with it please let me know and I'll try to condense these. I'm hoping if I take a break from this it will potentially be helpful to someone else also.
Today I started looking into why the NVME was not loading properly.
[ 11.731965] nvme nvme0: pci function 0000:03:00.0
[ 11.736710] nvme 0000:03:00.0: enabling device (0000 -> 0002)
[ 11.793991] probe of 0000:03:00.0 returned 19 after 62120 usecs
The probe for NVME is failing with error code 19 (ENODEV). Spending just a bit of time looking through the nvme kernel source I believe this is the culprit based on the messages preceding it. https://github.com/torvalds/linux/blob/43db1e03c086ed20cc75808d3f45e780ec4ca26e/drivers/nvme/host/pci.c#L2491 Not sure why that is failing however.
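The numeric codes in these probe messages can be turned into errno names with the Python standard library. The sketch below follows the reading used in this thread (error -5 for the r8169 probe, 19 for the NVMe probe) and only recognizes the two message formats quoted here; `decode_probe_failure` is an illustrative name.

```python
import errno
import os
import re

# The two probe-failure message formats that appear in this thread.
PATTERNS = [
    re.compile(r"probe of (\S+) failed with error -(\d+)"),
    re.compile(r"probe of (\S+) returned (\d+)"),
]


def decode_probe_failure(line):
    """Return (device, errno_name, description) for a failed probe line.

    Returns None when the line matches neither format or reports code 0.
    """
    for pattern in PATTERNS:
        match = pattern.search(line)
        if match:
            code = int(match.group(2))
            if code == 0:
                return None
            name = errno.errorcode.get(code, "UNKNOWN")
            return match.group(1), name, os.strerror(code)
    return None


if __name__ == "__main__":
    for line in [
        "[ 3.826310] r8169: probe of 0000:05:00.0 failed with error -5",
        "[ 11.793991] probe of 0000:03:00.0 returned 19 after 62120 usecs",
    ]:
        print(decode_probe_failure(line))
```

Piping `dmesg` output through this line by line gives a quick list of which PCI functions failed to probe and why.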
Next I tried enabling Gen3 to see what differences I could discover.
The NVMe drive is not discovered at all under Gen3, and the first RTL8125 Ethernet controller fails instead.
Something to note: the first device scanned behind the ASMedia 2806 (the NVMe drive in Gen2, the first RTL8125 in Gen3) fails.
Gen 2 log
[ 11.434525] pci 0000:03:00.0: BAR 0: assigned [mem 0x1b00000000-0x1b00003fff 64bit]
Gen 3 log
[ 11.197574] pci 0000:05:00.0: BAR 2: assigned [mem 0x1b00000000-0x1b0000ffff 64bit]
The Gen 2 log line refers to the NVMe drive; notice the start address. The Gen 3 log line refers to the first RTL8125, and it has the same start address the NVMe drive had in Gen 2. Both of these devices fail to start in their respective boots. This could be coincidental, but I believe there is a read error in that memory range, which could be why they fail. The cause is still unknown, however. I would like to try the FPC connector to see if this theory holds, but that will have to wait a week or two, as I don't have another device on hand that I can easily try. I believe the scan should come in this order: 1. NVMe, 2. FPC, 3. RTL8125, 4. RTL8125. So if my theory holds, the FPC device should fail instead of the first RTL8125.
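The comparison above can be reproduced mechanically. This is a small sketch that parses the two BAR assignment lines quoted above and checks whether they begin at the same address. The regular expression and the `parse_bar` helper are written for this illustration and only handle lines of this exact shape.

```python
import re

# The two BAR assignment lines quoted above, one from each boot.
gen2_line = "[ 11.434525] pci 0000:03:00.0: BAR 0: assigned [mem 0x1b00000000-0x1b00003fff 64bit]"
gen3_line = "[ 11.197574] pci 0000:05:00.0: BAR 2: assigned [mem 0x1b00000000-0x1b0000ffff 64bit]"

BAR_RE = re.compile(
    r"pci (?P<dev>[0-9a-f:.]+): BAR (?P<bar>\d+): assigned "
    r"\[mem 0x(?P<start>[0-9a-f]+)-0x(?P<end>[0-9a-f]+)"
)


def parse_bar(line):
    """Extract device, BAR index, start address, and size from one dmesg line."""
    match = BAR_RE.search(line)
    if match is None:
        raise ValueError(f"not a BAR assignment line: {line!r}")
    start = int(match["start"], 16)
    end = int(match["end"], 16)
    return {
        "dev": match["dev"],
        "bar": int(match["bar"]),
        "start": start,
        "size": end - start + 1,  # the range is inclusive
    }


gen2 = parse_bar(gen2_line)
gen3 = parse_bar(gen3_line)

for label, bar in (("Gen 2", gen2), ("Gen 3", gen3)):
    print(f"{label}: {bar['dev']} BAR {bar['bar']} "
          f"start={bar['start']:#x} size={bar['size'] // 1024} KiB")

print("same start address:", gen2["start"] == gen3["start"])
```

Running the same parser over a full dmesg capture from each boot would show whether any other device is ever assigned the start of the window.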
Just a few bread crumbs and nothing concrete at this point.
A positive update today: I have been able to get Gen3 working with a different NVMe drive. After swapping the Samsung SM961 for a cheap SK hynix PC711, the drive is detected in Gen3.
Following up on my last hypothesis, that reads from that memory region are failing, I put together a hacked kernel that skips this region when assigning BARs to the PCI devices. And it was successful! I now have both RTL8125s and an NVMe drive accessible, and all three devices seem to be functioning normally. This is obviously a hack and doesn't fix the root of the problem, but it does show the device can work. I'm reaching the limit of my understanding of what could cause this, however. If anyone is interested in my kernel source hack/patch, let me know and I can post it somewhere.
Here's the fio report for the SK hynix drive:
nvme0: (g=0): rw=read, bs=(R) 256KiB-256KiB, (W) 256KiB-256KiB, (T) 256KiB-256KiB, ioengine=libaio, iodepth=32
fio-3.33
Starting 1 process
Jobs: 1 (f=1): [R(1)][100.0%][r=851MiB/s][r=3405 IOPS][eta 00m:00s]
nvme0: (groupid=0, jobs=1): err= 0: pid=1505: Sat Jul 13 22:44:31 2024
read: IOPS=3403, BW=851MiB/s (892MB/s)(24.9GiB/30010msec)
slat (nsec): min=8407, max=52371, avg=10176.20, stdev=466.58
clat (usec): min=2951, max=20340, avg=9390.19, stdev=311.07
lat (usec): min=2961, max=20350, avg=9400.36, stdev=311.06
clat percentiles (usec):
| 1.00th=[ 8979], 5.00th=[ 9372], 10.00th=[ 9372], 20.00th=[ 9372],
| 30.00th=[ 9372], 40.00th=[ 9372], 50.00th=[ 9372], 60.00th=[ 9372],
| 70.00th=[ 9372], 80.00th=[ 9372], 90.00th=[ 9372], 95.00th=[ 9372],
| 99.00th=[11338], 99.50th=[11338], 99.90th=[12649], 99.95th=[13435],
| 99.99th=[17171]
bw ( KiB/s): min=868864, max=874196, per=100.00%, avg=871721.25, stdev=1762.17, samples=60
iops : min= 3394, max= 3414, avg=3405.07, stdev= 6.88, samples=60
lat (msec) : 4=0.01%, 10=98.56%, 20=1.46%, 50=0.01%
cpu : usr=0.80%, sys=4.37%, ctx=102067, majf=0, minf=9
IO depths : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
issued rwts: total=102131,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
READ: bw=851MiB/s (892MB/s), 851MiB/s-851MiB/s (892MB/s-892MB/s), io=24.9GiB (26.8GB), run=30010-30010msec
Disk stats (read/write):
nvme0n1: ios=119055/0, merge=0/0, ticks=1118850/0, in_queue=1118850, util=99.73%
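As a sanity check on the report above, the headline numbers can be cross-checked with a little arithmetic. This sketch uses only figures from the fio output (IOPS, block size, queue depth) plus one outside fact: a PCIe Gen3 lane signals at 8 GT/s with 128b/130b encoding. The resulting ceiling ignores packet and protocol overhead, so treat it as an upper bound rather than an achievable rate.

```python
# Figures copied from the fio output above.
iops = 3403
block_size_kib = 256
iodepth = 32

# Bandwidth implied by IOPS x block size.
bw_mib_s = iops * block_size_kib / 1024      # MiB/s
bw_mb_s = bw_mib_s * 1024 * 1024 / 1e6       # MB/s (decimal)

# Little's law: average latency ~= outstanding requests / throughput.
expected_lat_usec = iodepth / iops * 1e6

# Raw PCIe Gen3 x1 rate: 8 GT/s, 128b/130b encoding, 8 bits per byte.
gen3_x1_mb_s = 8e9 * 128 / 130 / 8 / 1e6

print(f"bandwidth: {bw_mib_s:.2f} MiB/s ({bw_mb_s:.0f} MB/s)")
print(f"expected average latency at iodepth {iodepth}: {expected_lat_usec:.0f} usec")
print(f"share of raw Gen3 x1 rate: {bw_mb_s / gen3_x1_mb_s:.1%}")
```

The computed bandwidth and latency line up with the 851MiB/s (892MB/s) and roughly 9400 usec average latency that fio reports.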
@KeyserSoze1 - oh wow! I just ordered a PC711 SSD to confirm on my end - do you have the kernel patch you applied to get it working? I wonder if it could be another quirk in the bus that could be worked around in the device tree?
Here's the kernel patch as applied to the latest 6.6.y branch.
[Jeff note: Pasted the contents here for easier reference]
diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 826b5016a..81dc8625f 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -219,6 +219,20 @@ static int pci_bus_alloc_from_region(struct pci_bus *bus, struct resource *res,
max = avail.end;
+ /*
+ * WARNING: Below there be demons!
+ * This is a workaround hack for the Radxa Dual 2.5G Router HAT with a
+ * Pi 5. There is an issue with the first device behind the PCIE switch
+ * being assigned BAR in the initial segment of the memory region.
+ * Memory read errors occur when assigning this initial block. This
+ * HACK gives the Pi PCI Bridge an offset starting position which
+ * filters to all of the children avoiding this issue. This is a hack
+ * and NOT A FIX!
+ */
+ if (strcmp(((struct pci_dev*)alignf_data)->dev.kobj.name,
+ "0000:00:00.0") == 0)
+ min_used += 0x100000;
+
/* Don't bother if available space isn't large enough */
if (size > max - min_used + 1)
continue;
I'm not sure how best to handle this quirk. Hopefully someone more knowledgeable can jump in and help create a more manageable fix than the absolute hack I used to verify the issue. I'm not sure why the initial memory region causes issues to begin with. From my understanding it shouldn't, but obviously something is not working as intended there.
On my first attempt I was not able to boot from NVMe, but I haven't tried any of the additional config.txt tweaks yet.
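To illustrate what the offset in the patch does, here is a toy model in Python. It is not the kernel's allocator: `first_fit` is a hypothetical helper that places naturally aligned BARs in order from the start of a window. The window start and the 16 KiB BAR size come from the Gen 2 log line, and the 0x100000 skip comes from the patch. The addresses actually assigned on the patched kernel are not shown in this thread, so the "with skip" result is only what this simplified model predicts.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def first_fit(window_start, sizes, skip=0):
    """Toy allocator: place naturally aligned BARs in order, `skip` bytes in."""
    cursor = window_start + skip
    assigned = []
    for size in sizes:
        start = align_up(cursor, size)
        assigned.append((start, start + size - 1))
        cursor = start + size
    return assigned


WINDOW_START = 0x1B00000000  # start address seen in the Gen 2 log line
NVME_BAR0_SIZE = 0x4000      # 16 KiB, from the same log line
SKIP = 0x100000              # offset the patch adds to min_used

without_skip = first_fit(WINDOW_START, [NVME_BAR0_SIZE])
with_skip = first_fit(WINDOW_START, [NVME_BAR0_SIZE], skip=SKIP)

for label, ranges in (("unpatched", without_skip), ("with skip", with_skip)):
    start, end = ranges[0]
    print(f"{label}: [mem {start:#x}-{end:#x}]")
```

In this model the unpatched case reproduces the range from the Gen 2 log, and the patched case moves the first BAR 1 MiB higher, outside the range that appears to be problematic.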
Radxa's Dual 2.5G Router HAT features a PCIe Gen3 switch, the ASM2806, enhancing its connectivity capabilities.
This HAT has 2x PCIe Gen 3 uplinks (RPi5 can only use 1x, some Radxa boards can use 2x) and 4x PCIe Gen 3 downlinks:
Things to test: