geerlingguy / raspberry-pi-pcie-devices

Raspberry Pi PCI Express device compatibility database
http://pipci.jeffgeerling.com
GNU General Public License v3.0

Geekworm X1011 PCIe to Four M.2 NVMe SSD HAT #618

Open geerlingguy opened 5 months ago

geerlingguy commented 5 months ago

hat-geekworm-x1011-4-drive-nvme

Geekworm's X1011 is a 4-drive M.2 NVMe SSD carrier which uses an ASMedia PCIe Gen 2 switch to bridge four NVMe drives to a Raspberry Pi 5.

It looks like it includes some Pogo pins to provide power to (or leech power from) the Pi 5's USB-C supply input. It has a warning to not plug in both the 5V barrel plug and Pi 5 USB-C power input at the same time (and it may not have any safety in place to prevent a short circuit if you do!).

It supports M.2 drives in 2230/2242/2260/2280 lengths, and does not support NVMe boot (because of the switch), at least not until Raspberry Pi resolves the issue "Can't boot Pi 5 via NVMe behind PCIe switch / bridge".

geerlingguy commented 5 months ago

I ordered one of these today, hopefully it'll come in a few weeks.

It's on the site now: https://pipci.jeffgeerling.com/hats/geekworm-x1011-4-drive-nvme.html

paulwratt commented 5 months ago

Hmmm.. shame no one had thought to use PCIe v3 bridge chips (or even cheap v4) on such devices ..

.. when you get this in the studio: I noticed you had one of those PCIe "splitter" HATs. Can you give this board a try with that, like you did with those 2 boards in the latest video (with the RPi5 Penta SATA HAT)?

On that note, is that PCIe "splitter" HAT on loan as a review sample? I would really like to do some experiments with 2x (or more) of them (i.e. 4x PCIe HAT possibilities), hence the mention of v3/v4 PCIe switch / bridge devices (which someone might get to now that I have mentioned it).

NOTE: I do realise that a PCIe base riser might be more "practical", but I have already seen enough "useful" PCIe HATs to warrant someone trying to put together a set of devices in a low-profile laptop form factor. Something that could "sit under your laptop" even .. maybe :)

geerlingguy commented 5 months ago

@paulwratt - Maybe even inside your laptop! Haha.

I would really like to see someone incorporate PCIe Gen 3 switch chips, though from what I hear, unlike the ASM1182e, they are harder to acquire in any bulk, and with any datasheets, for smaller companies.

hipboi commented 5 months ago

Radxa is working on a HAT with a PCIe Gen 3 switch, the ASM2806, which we have already used in other projects. The HAT will have a 2x PCIe Gen 3 uplink (the Pi 5 can only use 1x; some Radxa boards can use 2x).

4x PCIe Gen3 downlink:

geerlingguy commented 5 months ago

@hipboi - Oh cool! Would love to test it out. And hopefully the board's Wiki/product page has a compatibility matrix, that can show which models get which lane support (x1 or x2)!

luix93 commented 4 months ago

Hello, any updates on this product? I'm planning a little project with it, and this looks like exactly what I need, but there are no reviews whatsoever online.

geerlingguy commented 4 months ago

@luix93 - I am running to Micro Center momentarily to buy some NVMe SSDs to test with it :)

Not sure if I'll do a full video but I will post some impressions once I get it going.

geerlingguy commented 4 months ago

x1011-raspberry-pi-5

It comes together nicely and includes a few standoffs you screw into the bottom to hold the NVMe drives just off the surface of a desk. There's no case yet, which is a shame. It'd be nice to have a case with some ventilation, and maybe a fan that blows some air over the ASM1184e chip, since that's the hottest part of this setup:

x1011-thermals

It's not bad, at 55°C or so after an hour of writes (a few TB written). I chose some cheap Inland NVMes because... they were cheap. They're TLC so probably not the best for write-heavy workloads, but in ZFS RAIDZ1, I was getting over 1 GB/sec writes and reads, even writing 120 GB of files, so ZFS caching was certainly in play. I haven't tested any other modes yet, nor did I set up Samba for network file copy testing, but it should be able to saturate the Pi's 1 Gbps port no problem.
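For anyone wondering why sustained numbers above 1 GB/sec point to caching, here is a rough sketch of the theoretical link ceilings. It assumes the drives sit behind a single PCIe Gen 2 x1 uplink and uses the standard per-lane signalling rates and line encodings; the function names are made up for illustration, and real throughput will land below these figures because of protocol overhead.

```python
# Back-of-the-envelope link ceilings; real-world throughput will be lower.

# (transfer rate in GT/s, line-encoding efficiency) per PCIe generation
PCIE_GENERATIONS = {
    2: (5.0, 8 / 10),     # Gen 2 uses 8b/10b encoding
    3: (8.0, 128 / 130),  # Gen 3 uses 128b/130b encoding
}


def pcie_mb_per_s(gen: int, lanes: int = 1) -> float:
    """Theoretical bandwidth in MB/s after line encoding, before protocol overhead."""
    gigatransfers, efficiency = PCIE_GENERATIONS[gen]
    bits_per_second = gigatransfers * 1e9 * efficiency * lanes
    return bits_per_second / 8 / 1e6


def ethernet_mb_per_s(gbps: float) -> float:
    """Raw Ethernet line rate in MB/s, ignoring framing overhead."""
    return gbps * 1e9 / 8 / 1e6


gen2_x1 = pcie_mb_per_s(2)
gen3_x1 = pcie_mb_per_s(3)
gigabit = ethernet_mb_per_s(1)

print(f"PCIe Gen 2 x1: {gen2_x1:.0f} MB/s")
print(f"PCIe Gen 3 x1: {gen3_x1:.0f} MB/s")
print(f"1 Gbps Ethernet: {gigabit:.0f} MB/s")
```

Under those assumptions the uplink tops out around 500 MB/s, so anything faster has to be coming from RAM, while the 1 Gbps network port needs only a quarter of that.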

I like the right-angle FFC, because it gives easy access to the microSD card:

x1011-pcie-ffc

And you can use either a 5V barrel plug power supply, or a USB-C power adapter. Power is supplied to the Pi (or from the Pi / USB-C connector) through pogo pins that touch test points on the bottom of the Pi 5. This setup is... interesting. I don't think I'd trust it for a permanent installation, but if you clean the contacts and maybe jiggle the springs every year or so, maybe it won't be an issue.

I didn't have any power warnings in my testing, nor throttling, even writing 120 GB to the NVMe array multiple times in a row (with iozone, doing 4K and 1024K cycles), and I was running it on an Argon GaN 27W USB-C power adapter:

argon-gan-5v-usb-c

There are four blinkenlights on top, which I like, though they're a slightly bright blue. Put some sort of cover over them if they get annoying:

x1011-asm1184e-pcie-switch-chip

And as luck would have it, the Inland SSDs also have their own green blinkenlights, so you get a light show no matter the board's orientation!

x1011-running-port-side-network

Upsides: very compact, fast enough for 1 Gbps use, everything worked without even tweaking the Pi OS settings (I didn't have to modify the boot config, it picked up the NVMe drives right away with latest Pi OS—this was on a brand new Pi 5 4GB I bought today).

Downsides: No official case (or any 3D printed designs I could find yet), PCIe Switch chip gets hot (55°C, so not terrible, but I'd at least throw a heatsink on it), PCIe Gen 2 switch limits total throughput, and pogo pins for power are a little sketchy, especially if you put in higher-power SSDs (some use 5W+ under load...).

Overall, if you have a lighter-use NAS, or if you just want some redundancy with low-latency storage, this isn't a bad option. I bought mine for $51 from Geekworm's website. Total cost for this build was around $225, including four 256 GB NVMe SSDs, a 32 GB microSD card for boot, and the power adapter.

And... it sounds like https://github.com/raspberrypi/firmware/issues/1833 is going to be resolved at some point in the near future (fingers crossed!), so you could even maybe boot from an NVMe drive behind the switch in the future! That'd be nice because you could have RAIDZ1 on 3 NVMe disks, and boot on another, for a speedy little NAS.
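As a rough guide to what the 3-disk versus 4-disk layouts would give you, here is a minimal capacity sketch. It assumes equal-sized drives and simply subtracts one drive's worth of parity; it ignores ZFS metadata, padding, and GB versus GiB differences, so treat the results as upper bounds. The function name is hypothetical.

```python
def raidz_usable_gb(drive_count: int, drive_gb: float, parity: int = 1) -> float:
    """Approximate usable capacity of a RAIDZ vdev built from equal-sized drives."""
    if drive_count <= parity:
        raise ValueError("need more drives than parity devices")
    # One drive's worth of space per parity level is consumed by redundancy.
    return (drive_count - parity) * drive_gb


all_four = raidz_usable_gb(4, 256)   # RAIDZ1 across all four slots
three_plus_boot = raidz_usable_gb(3, 256)  # RAIDZ1 on three, fourth drive for boot

print(f"4 x 256 GB RAIDZ1: about {all_four:.0f} GB usable")
print(f"3 x 256 GB RAIDZ1: about {three_plus_boot:.0f} GB usable")
```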

The biggest limiting factor using this for pure storage is the price of NVMe SSDs. Cost per terabyte goes way up as you get into the 4 and 8TB range.

luix93 commented 4 months ago

This is great! Thank you for doing this. Any chance you have nice picture of those pogo pins?

geerlingguy commented 4 months ago

Pogo power pins:

image

geerlingguy commented 4 months ago

Testing with an mdadm RAID0 array of all four drives, to get a feel for the performance without ZFS caching in front (with ZFS, the numbers are all amazing, over 1 GB/sec lol):

| Benchmark                  | Result |
| -------------------------- | ------ |
| iozone 4K random read      | 46.43 MB/s |
| iozone 4K random write     | 138.80 MB/s |
| iozone 1M random read      | 418.14 MB/s |
| iozone 1M random write     | 393.81 MB/s |
| iozone 1M sequential read  | 430.45 MB/s |
| iozone 1M sequential write | 398.13 MB/s |

luix93 commented 4 months ago

Thank you so much this is amazing. Would you be able to test RAID5 if that's not too much to ask?

geerlingguy commented 4 months ago

RAID 5 results:

| Benchmark                  | Result |
| -------------------------- | ------ |
| iozone 4K random read      | 48.21 MB/s |
| iozone 4K random write     | 71.84 MB/s |
| iozone 1M random read      | 408.17 MB/s |
| iozone 1M random write     | 159.94 MB/s |
| iozone 1M sequential read  | 414.59 MB/s |
| iozone 1M sequential write | 162.81 MB/s |
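
To put the two mdadm runs side by side, here is a small sketch that divides each RAID 5 figure by the matching RAID 0 figure posted earlier in this thread. The numbers are copied from the two result tables; the variable names are only for illustration.

```python
# iozone results in MB/s, copied from the RAID 0 and RAID 5 tables in this thread.
raid0 = {
    "4K random read": 46.43,
    "4K random write": 138.80,
    "1M random read": 418.14,
    "1M random write": 393.81,
    "1M sequential read": 430.45,
    "1M sequential write": 398.13,
}
raid5 = {
    "4K random read": 48.21,
    "4K random write": 71.84,
    "1M random read": 408.17,
    "1M random write": 159.94,
    "1M sequential read": 414.59,
    "1M sequential write": 162.81,
}

# Ratio above 1.0 means RAID 5 was faster, below 1.0 means slower.
ratios = {name: raid5[name] / raid0[name] for name in raid0}

for name, ratio in ratios.items():
    print(f"{name:<20} RAID 5 / RAID 0 = {ratio:.2f}")
```

Reads stay within about 4% of the RAID 0 numbers, while writes drop to roughly 40 to 52% of them, which is what you would expect given the parity work on every write.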

luix93 commented 4 months ago

That's great, thank you so much man. Want to build a tiny nas and this seems like a good setup for what I have in mind. This has been very helpful! Glad to have discovered your channel :)

waszak commented 3 months ago

According to the wiki, it now supports NVMe boot: "Support NVMe boot with the latest firmware EEPROM 2024/05/17 version" https://wiki.geekworm.com/X1011

EDIT: I tested with a Samsung SSD 970 EVO Plus 1TB and it booted from the drive. I was scared for a moment because the first boot took longer and I thought it didn't work, but I just wasn't patient enough. The next boot was fast.

mainLink0435 commented 1 month ago

Just sharing my experience in case anyone has a similar issue. I was running this with OMV, booting from a 256 GB NVMe in slot 0, with a 4 TB NVMe in slot 1. I had constant (daily) issues with the device becoming non-responsive. If I was already SSH'd in, any command gave me an I/O error. No web pages served, no SMB. Also, nothing in the logs (probably because it couldn't write to the logs).

I switched back to microSD and have had no issues.

paulwratt commented 1 month ago

@geerlingguy I would not worry too much about the pogo pin connection issues. Years ago, maybe, but for the last decade or so they have proven to be more than reliable. That said, if a device with pogo pins has issues (after years of reliable use), I too would clean the contacts and test the springiness as my first fix.

BTW I am glad to see RPi OS now supports booting NVMe behind a PCIe switch .. maybe they just didn't have the actual hardware to test it against (at the time).

On the PCIe bridge availability front, you would think Gen 4 would be widely available, even for small outfits, and that there should be better availability of PCIe Gen 3 switches, especially considering the number of AMD AM4 internal-GPU motherboards that are still being produced. (Even though AM4 CPUs with an internal GPU work across Gen 4 PCIe, the internals are Gen 3, so the switch or bridge needs to be Gen 3 + Gen 4.) As far as I understand, you can put a PCIe Gen 1 card in any PCIe slot and it will work. The x16 are just data bits on the physical bus, and the Gen is actually a speed multiplier that has to be negotiated anyway (which is why some PCIe Gen 5 gear only works reliably on Gen 4 buses).

jedi58 commented 1 month ago

> Just sharing my experience in case anyone has a similar issue. I was running this with OMV, booting from a 256 GB NVMe in slot 0, with a 4 TB NVMe in slot 1. I had constant (daily) issues with the device becoming non-responsive. If I was already SSH'd in, any command gave me an I/O error. No web pages served, no SMB. Also, nothing in the logs (probably because it couldn't write to the logs).
>
> I switched back to microSD and have had no issues.

I'm having the same issue, and if I run sudo nvme list, both of the drives I've got connected have disappeared. After a reboot they're back (though in my case I'm already booting from the microSD, as I never got around to moving the OS to one of the mounted drives). I suspect it could be a heat issue: when I noticed and rebooted, the SMART data indicated the drive was still close to 60°C after having had time to cool down (whereas earlier it had been handling a large number of writes, sitting comfortably at 46°C for hours). No alerts were recorded in the SMART data, though.
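If anyone wants to log drive temperature over time to test the heat theory, here is a minimal sketch. It assumes nvme-cli is installed, that `nvme smart-log` accepts `--output-format=json`, and that the JSON `temperature` field is reported in Kelvin (that is how I understand nvme-cli behaves, but check it against your version). The function names are made up for this example.

```python
import json
import subprocess


def parse_temperature_c(smart_log_json: str) -> float:
    """Extract the composite temperature from smart-log JSON and convert to Celsius.

    Assumption: the "temperature" field is in Kelvin.
    """
    data = json.loads(smart_log_json)
    return data["temperature"] - 273.15


def read_temperature_c(device: str) -> float:
    """Run nvme-cli against a device such as /dev/nvme0 (usually needs root)."""
    result = subprocess.run(
        ["nvme", "smart-log", device, "--output-format=json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_temperature_c(result.stdout)
```

Calling `read_temperature_c("/dev/nvme0")` in a loop with a timestamp would give a simple temperature log to compare against the moment the drives drop out.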

mainLink0435 commented 1 month ago

@jedi58 It's definitely something to do with using the nvme as a boot disk (although I'm not smart enough to know any more detail than that). I switched back to sd card booting, and I'm using both nvme drives as storage - haven't had any errors, warnings, lockups, nothing. Been running for about 2 weeks without an issue.

jedi58 commented 1 month ago

I'm not convinced. I'm not using it as a boot disk and still experience it: all connected NVMe drives just disappear with the I/O error until I reboot. Yesterday it happened during a 6-hour file copy; today it happened on a cold boot, but it worked after a shutdown -r now, so I could finish off the file copy that got interrupted yesterday.

If it was just one drive, I'd assume it was the drive; but with both, I feel it's something else. I'm inclined to think it's down to the amount of I/O and whether the switch chip can cope with it. I did wonder at one point if it was something such as the chip on the X1011 overheating.

jedi58 commented 1 month ago

Digging into it a little further

$ sudo nvme list
Node                  Generic               SN                   Model                                    Namespace Usage                      Format           FW Rev
--------------------- --------------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------

$ sudo nvme list-subsys -vvv
scan controller nvme0
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme0
scan controller nvme0 namespace nvme0n1
failed to scan namespace nvme0n1
scan controller nvme1
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys0/nvme1
lookup subsystem /sys/class/nvme-subsystem/nvme-subsys1/nvme1
scan controller nvme1 namespace nvme1n1
failed to scan namespace nvme1n1
scan subsystem nvme-subsys0
scan subsystem nvme-subsys1
nvme-subsys1 - NQN=nqn.2018-01.com.wdc:guid:redacted
\
 +- nvme1 pcie 0000:05:00.0 dead
nvme-subsys0 - NQN=nqn.2018-01.com.wdc:guid:redacted
\
 +- nvme0 pcie 0000:03:00.0 dead
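
A lighter way to catch the moment a controller goes "dead" might be to poll sysfs instead of running nvme-cli. This is only a sketch: it assumes each controller under /sys/class/nvme exposes a `state` attribute containing values such as `live` or `dead`, and the helper names are hypothetical.

```python
from pathlib import Path


def controller_states(sysfs_root: str = "/sys/class/nvme") -> dict[str, str]:
    """Map each NVMe controller name (nvme0, nvme1, ...) to its reported state."""
    root = Path(sysfs_root)
    states: dict[str, str] = {}
    if not root.is_dir():
        return states
    for controller in sorted(root.iterdir()):
        state_file = controller / "state"
        if state_file.is_file():
            states[controller.name] = state_file.read_text().strip()
    return states


def unhealthy_controllers(states: dict[str, str]) -> list[str]:
    """Return controllers whose state is anything other than 'live'."""
    return [name for name, state in states.items() if state != "live"]
```

Running `unhealthy_controllers(controller_states())` from a cron job or a loop would flag a dropped controller without touching the drives themselves.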

After a reboot, I sat with dmesg -wH running and used a watch command to report the smart-log for both drives and monitor temperature, with no actual writes happening to either NVMe device. After a while I see a lot of brcmfmac errors and then:

[  +0.000195] brcmfmac: brcmf_set_channel: set chanspec 0xd0a5 fail, reason -52
[  +7.227194] nvme nvme1: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0xffff
[  +0.000009] nvme nvme1: Does your device have a faulty power saving mode enabled?
[  +0.000003] nvme nvme1: Try "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off" and report a bug
[  +0.083994] nvme 0000:05:00.0: Unable to change power state from D3cold to D0, device inaccessible
[  +0.000018] nvme nvme1: Disabling device after reset failure: -19
[  +0.015994] EXT4-fs (nvme1n1): shut down requested (2)
[  +0.005168] Aborting journal on device nvme1n1-8.
[  +0.000007] kworker/u12:2: attempt to access beyond end of device
              nvme1n1: rw=169985, sector=1950613504, nr_sectors = 8 limit=0
[  +0.000005] Buffer I/O error on dev nvme1n1, logical block 243826688, lost sync page write
[  +0.000005] JBD2: I/O error when updating journal superblock for nvme1n1-8.

That I'm not sure how to resolve
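For anyone who wants to try the two parameters the kernel message suggests, here is a minimal sketch of adding them to the kernel command line. It assumes Raspberry Pi OS keeps the command line as a single line in /boot/firmware/cmdline.txt; the helper name is hypothetical, and nobody in this thread has confirmed that these parameters fix the dropouts.

```python
# Parameters quoted verbatim from the kernel's own hint in the dmesg output.
SUGGESTED_PARAMS = ["nvme_core.default_ps_max_latency_us=0", "pcie_aspm=off"]


def add_kernel_params(cmdline: str, params: list[str]) -> str:
    """Append params to a kernel command line, skipping keys that are already set."""
    tokens = cmdline.split()
    existing_keys = {token.split("=", 1)[0] for token in tokens}
    for param in params:
        key = param.split("=", 1)[0]
        if key not in existing_keys:
            tokens.append(param)
            existing_keys.add(key)
    # cmdline.txt must stay on one line, so join with single spaces.
    return " ".join(tokens)


example = "console=tty1 root=/dev/mmcblk0p2 rootwait"
updated = add_kernel_params(example, SUGGESTED_PARAMS)
print(updated)
```

To apply it, read the current contents of cmdline.txt, pass them through `add_kernel_params`, write the result back as root (after keeping a backup copy), and reboot.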

jedi58 commented 1 month ago

Eurgh… ignore the above. Found a page on the Geekworm site that lists the WD Black SN770 as being incompatible. At over £200 for the drives can't afford to replace them, so the X1011 and the drives were a waste of money for me then. I may as well spend ~£15 and get an M.2 PCIe adapter for my PC so I can at least use the drives, and yet another Raspberry Pi I've bought will be forgotten about 😂

mainLink0435 commented 1 month ago

OMG! Guess which disk I had as my boot drive.. WD SN740. Which appears on the 'incompatibility list'. For me though, I haven't had any issues reading/writing from the drives if the OS is on the SD card.

Pretty pathetic on Geekworm's part. Doesn't support Phison controller, doesn't support Polaris controller...

jedi58 commented 1 month ago

Geekworm suggest it's usable due to someone's comment (but they've not noticed the user also followed up on it):

> I can confirm updating the firmware seems to work on the pi5. Have a WD black SN770. Worked with USB, didn't work with the NVMe hat. Originally would boot very slow, and couldn't reliably launch a terminal, or a web browser at all. System would freeze or go to gray screen/black screen.
>
> Booted with the drive in a USB enclosure and went to raspi-config and loaded new bootloader/firmware through the advanced options, reinstalled the drive into the hat and now boots fine, fast, and can launch and run programs without issue.

But then that same user reports that it still fails with heavy usage. So if you're not booting off the drive, or using the drive much, should potentially be okay. I think I'll stick to accessing it over USB

mainLink0435 commented 1 month ago

oh great.. so it works.. if you don't use it much. Pretty pathetic on their part. Like you, didn't expect to buy this nvme hat that works with nvme drives (asterisk only SOME drives)

Already sunk cost at this point, so like you can't really justify starting again. Probably what Geekworm are banking on (literally)