geerlingguy / raspberry-pi-pcie-devices

Raspberry Pi PCI Express device compatibility database
http://pipci.jeffgeerling.com
GNU General Public License v3.0

Test WD PC SN520 NVMe M.2 2230 SSD #90

Closed: geerlingguy closed this issue 3 years ago

geerlingguy commented 3 years ago

That's a lot of acronyms. Basically, I got four of these itty-bitty WD PC SN520 NVMe M.2 2230 SSDs:

(Photo: DSC_4396)

They are available in 128/256/512 GB capacities, and I bought the four I have on eBay, used, because the price on Amazon for the drive is a bit insane. I'm thinking these drives might not even be available on the market now, because I can't find them new anywhere besides, basically, Amazon.

Anyways, I have this dumb idea to build a tiny CM4 carrier board that has four B+M key M.2 slots, a space for the Pi, a power plug, microSD slot, and a network jack. It would use a PCIe switch to allow the four drives to connect to the Pi.

I don't know if it would work at all, but it might be fun to try.

For now, I just want to get one working to make sure it's not a completely invalid idea. I'm testing it in my MZHOU NVMe M.2 SSD M Key to PCIe 1x adapter.

geerlingguy commented 3 years ago
$ sudo lspci -vvv
...
01:00.0 Non-Volatile memory controller: Sandisk Corp Device 5004 (rev 01) (prog-if 02 [NVM Express])
    Subsystem: Sandisk Corp Device 5004
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin A routed to IRQ 47
    Region 0: Memory at 600000000 (64-bit, non-prefetchable) [size=16K]
    Capabilities: [80] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [90] MSI: Enable- Count=1/32 Maskable- 64bit+
        Address: 0000000000000000  Data: 0000
    Capabilities: [b0] MSI-X: Enable+ Count=17 Masked-
        Vector table: BAR=0 offset=00002000
        PBA: BAR=0 offset=00003000
    Capabilities: [c0] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <1us, L1 unlimited
            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
            MaxPayload 128 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        LnkCap: Port #0, Speed 8GT/s, Width x2, ASPM L1, Exit Latency L0s <256ns, L1 <8us
            ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range B, TimeoutDis+, LTR+, OBFF Not Supported
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled
        LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
             Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
             EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [100 v2] Advanced Error Reporting
        UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
        CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
    Capabilities: [150 v1] Device Serial Number 00-00-00-00-00-00-00-00
    Capabilities: [1b8 v1] Latency Tolerance Reporting
        Max snoop latency: 0ns
        Max no snoop latency: 0ns
    Capabilities: [300 v1] #19
    Capabilities: [900 v1] L1 PM Substates
        L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
              PortCommonModeRestoreTime=255us PortTPowerOnTime=10us
        L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
               T_CommonMode=0us LTR1.2_Threshold=262144ns
        L1SubCtl2: T_PwrOn=10us
    Kernel driver in use: nvme
    Kernel modules: nvme
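
Worth noting in the dump above: LnkCap reports 8GT/s x2 (the drive's ceiling), while LnkSta shows the 5GT/s x1 that was actually negotiated on the Pi's single Gen 2 lane. A quick way to pull just those two lines, assuming the drive stays at address 01:00.0 as shown:

# Compare the drive's advertised link capability to the link it actually negotiated.
sudo lspci -vvv -s 01:00.0 | grep -E 'LnkCap:|LnkSta:'
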
geerlingguy commented 3 years ago

Too easy.

$ lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
mmcblk0     179:0    0  14.9G  0 disk 
├─mmcblk0p1 179:1    0   256M  0 part /boot
└─mmcblk0p2 179:2    0  14.6G  0 part /
nvme0n1     259:0    0 119.2G  0 disk
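
As a side note (not one of the steps above), the nvme-cli package should also let the drive identify itself, which is handy for confirming which of the SN520s is which; a minimal sketch:

# Optional: query the controller directly for model, firmware, and SMART health info.
sudo apt install -y nvme-cli
sudo nvme list
sudo nvme smart-log /dev/nvme0
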
geerlingguy commented 3 years ago

Formatting and mounting:

$ sudo fdisk /dev/nvme0n1
(`n` for new, `p` for primary, select defaults, then `w` to write it out and exit)

$ sudo mkfs.ext4 /dev/nvme0n1

$ sudo mkdir /mnt/nvme
$ sudo mount /dev/nvme0n1 /mnt/nvme

Shows up with the other disks:

$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/root        15G  1.6G   13G  12% /
devtmpfs        1.8G     0  1.8G   0% /dev
tmpfs           1.9G     0  1.9G   0% /dev/shm
tmpfs           1.9G  8.4M  1.9G   1% /run
tmpfs           5.0M  4.0K  5.0M   1% /run/lock
tmpfs           1.9G     0  1.9G   0% /sys/fs/cgroup
/dev/mmcblk0p1  253M   60M  193M  24% /boot
tmpfs           380M     0  380M   0% /run/user/1000
/dev/nvme0n1    117G   61M  111G   1% /mnt/nvme

Mount at boot:

$ ls -al /dev/disk/by-uuid/
(Copy out the UUID for the nvme0n1 drive)

$ sudo nano /etc/fstab
(Paste in a line like below:)
UUID=6938c09e-5714-4bb4-8ba9-73570960a91a /mnt/nvme   ext4    defaults        0       0

$ sudo mount -a
(Make sure there are no errors.)

Now you can have the drive appear at boot time!
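
If you'd rather not copy the UUID by hand, the same fstab step can be scripted; a small sketch, assuming the filesystem is on the bare /dev/nvme0n1 device and the /mnt/nvme mount point from above:

# Grab the filesystem UUID and append a matching fstab entry.
UUID=$(sudo blkid -s UUID -o value /dev/nvme0n1)
echo "UUID=${UUID} /mnt/nvme ext4 defaults 0 0" | sudo tee -a /etc/fstab
sudo mount -a  # make sure there are no errors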

geerlingguy commented 3 years ago

Benchmark:

$ curl -O https://raw.githubusercontent.com/geerlingguy/raspberry-pi-dramble/master/setup/benchmarks/disk-benchmark.sh
$ chmod +x disk-benchmark.sh
$ nano disk-benchmark.sh
(Change `DEVICE_UNDER_TEST` and `DEVICE_MOUNT_PATH`)

$ sudo ./disk-benchmark.sh

Results (all in MB/sec):

| Benchmark | Run 1 | Run 2 | Run 3 | Average |
| --- | --- | --- | --- | --- |
| hdparm | 396.49 | 374.99 | 367.97 | 379.82 MB/sec |
| dd | 155 | 153 | 154 | 154.00 MB/sec |
| 4k rand read | 39.89 | 39.57 | 39.64 | 39.70 MB/sec |
| 4k rand write | 92.25 | 88.06 | 87.99 | 89.43 MB/sec |
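
For context, the hdparm and dd rows come from the script above; run standalone they would look roughly like this (an approximation, not the script's exact parameters):

# Buffered sequential read timing (roughly what the hdparm row measures).
sudo hdparm -t /dev/nvme0n1
# Large sequential write bypassing the page cache (roughly the dd row).
dd if=/dev/zero of=/mnt/nvme/ddtest bs=1M count=1024 oflag=direct status=progress
rm /mnt/nvme/ddtest
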
geerlingguy commented 3 years ago

Here's the problem now... I don't have any 4-slot M.2 adapters with B+M key slots... and I only have one adapter that takes a Key E slot like I'm testing on the giant 12-M.2-slot board and adapts it to Key M:

(Photo: IMG_3672)

ThomasKaiser commented 3 years ago

Out of curiosity: do you get significantly better sequential transfer rates compared to hdparm if you test with `iozone -e -I -a -s 1000M -r 1024k -i 0 -i 1 -f ${DEVICE_MOUNT_PATH}/iozone`?

We've seen differences between 350 MB/s with hdparm (limited by its short execution time and especially anachronistic block size of just 128K) and 390 MB/s using appropriate block sizes and a reasonable execution time: https://forum.odroid.com/viewtopic.php?p=216569#p216569
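
Pointed at the mount used earlier in the thread (and assuming the iozone3 package is available via apt), that run would look something like:

# The suggested iozone run against the NVMe mount.
sudo apt install -y iozone3
iozone -e -I -a -s 1000M -r 1024k -i 0 -i 1 -f /mnt/nvme/iozone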

geerlingguy commented 3 years ago

@ThomasKaiser - I will try to test that out next time I have these drives set up (right now I'm still working on cleaning up after that '16 hard drives' video :D)

geerlingguy commented 3 years ago

I'm going to run another set of benchmarks using the updated benchmarking script with some suggestions based on what @ThomasKaiser said over in #64 —

| Benchmark | Result |
| --- | --- |
| fio 1M seq read | 397MiB/s (417MB/s) |
| iozone 1M seq read | 359.32 MiB/s |
| iozone 1M seq write | 248.70 MiB/s |
| iozone 4k rand read | 32.91 MiB/s |
| iozone 4k rand write | 81.97 MiB/s |
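
The exact invocations live in the updated benchmarking script, but the fio figure corresponds roughly to a 1M sequential read like this (a sketch with assumed parameters, not the script's exact flags):

# Approximate 1M sequential read against a file on the mounted drive.
sudo apt install -y fio
fio --name=seqread --filename=/mnt/nvme/fio.tmp --rw=read --bs=1M --size=1g \
    --direct=1 --ioengine=libaio --iodepth=4
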
geerlingguy commented 3 years ago

(Photo: IMG_3794)

Next up, 3 of them in RAID 5:

# Install mdadm.
sudo apt install -y mdadm

# Create a RAID 5 array using the three NVMe drives.
sudo mdadm --create --verbose /dev/md0 --level=5 --raid-devices=3 /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1

# Create a mount point for the new RAID device.
sudo mkdir /mnt/babyraid5

# Format the RAID device.
sudo mkfs.ext4 /dev/md0

# Mount the RAID device.
sudo mount /dev/md0 /mnt/babyraid5

Following the progress with `cat /proc/mdstat`, it looks like the initial sync of the 128 GB x3 RAID 5 array will only take 12 minutes or so at 157 MiB/sec. Nice! (A lot faster than the 8 TB x3 array I was testing over the weekend...)
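
To keep an eye on the initial sync and then confirm the array's state once it finishes, something like:

# Watch the resync progress, then inspect the finished array.
watch -n 5 cat /proc/mdstat
sudo mdadm --detail /dev/md0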

geerlingguy commented 3 years ago

In RAID 5 using mdadm:

| Benchmark | Result |
| --- | --- |
| fio 1M seq read | 398MiB/s (417MB/s) |
| iozone 1M seq read | 360.32 MiB/s |
| iozone 1M seq write | 126.78 MiB/s |
| iozone 4k rand read | 28.77 MiB/s |
| iozone 4k rand write | 16.92 MiB/s |
geerlingguy commented 3 years ago

Remove RAID 5 and set up as RAID 0:

sudo umount /mnt/babyraid5
sudo mdadm --stop /dev/md0
sudo mdadm --zero-superblock /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1
sudo mdadm --create --verbose /dev/md0 --level=0 --raid-devices=3 /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1
sudo mkdir /mnt/babyraid0
sudo mkfs.ext4 /dev/md0
sudo mount /dev/md0 /mnt/babyraid0

Test results in RAID 0:

| Benchmark | Result |
| --- | --- |
| fio 1M seq read | 398MiB/s (417MB/s) |
| iozone 1M seq read | 363.16 MiB/s |
| iozone 1M seq write | 377.64 MiB/s |
| iozone 4k rand read | 35.50 MiB/s |
| iozone 4k rand write | 82.06 MiB/s |
geerlingguy commented 3 years ago

Interesting to note, average CPU usage on the system (while doing nothing but running the benchmark):

Also interesting to note, it seems like iowait time is pegged to one of the four CPU cores. I'm not sure if NVMe storage can be multithreaded, but it's kind of a moot point as it was still going at 350-400 MiB/sec, which is the maximum you can sustain through the single Gen 2 PCIe lane.
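
For anyone who wants to watch that themselves, per-core iowait can be observed while a benchmark runs; a small sketch, assuming the sysstat package for mpstat:

# Show per-core utilization (including %iowait) once per second.
sudo apt install -y sysstat
mpstat -P ALL 1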

geerlingguy commented 3 years ago

This card is mentioned in the TOFU video: https://www.youtube.com/watch?v=m-QSQ24_8LY

But it will be getting its own little dedicated video soon enough ;)

msgilligan commented 3 years ago

(I commented on your YouTube video before realizing that this would be the better place to do it.)

This card looks like a good fit for the Raspberry Pi CM4 I/O board. I saw your video with a full-length NVMe adapter sticking up like a skyscraper. I googled around a little to see if there was an adapter sized for the 2230. Maybe a hacksaw would do the trick. Have you seen any adapters that will work for what I have in mind?

geerlingguy commented 3 years ago

@msgilligan - Good question; I am mostly using a right-angled adapter now (instead of the skyscraper one), so at least it's only about 50mm higher than the IO board, but it still is multi-size (up to 80mm), so it has a lot of wasted space. It would be nice to have a simple PCIe 1x to M.2 30mm slot adapter, but I haven't found one.

msgilligan commented 3 years ago

Do you have any photos and/or links of the right angle adapters installed on the IO board?

I'm tempted by the TOFU, but since I have the IO board on its way, I'm going to try to make do with it.

I just want a mini-dev system with a single fast SSD card and ideally a nice case (perhaps 3D printed).

geerlingguy commented 3 years ago

@msgilligan - This is the one I'm using right now: https://pipci.jeffgeerling.com/cards_m2/mzhou-nvme-m2-ssd-m-key-adapter.html

New video on it and NVMe boot coming out... soon :D (just finished up the script this afternoon!)

msgilligan commented 3 years ago

OK, I just ordered one of those (through your link, of course) too! I'll try them both.

Maybe someone will publish a 3D design for a case...

Wow! I went to Thingiverse a few hours ago to look and there was only a fan bracket and a partial case for the "3" version. But just published in the last few hours is: https://www.thingiverse.com/thing:4799823

geerlingguy commented 3 years ago

@msgilligan - Ha, that's great timing! I'm still working out what I want to do in terms of a better case for longer-term projects. Right now, since I still only have one 'final production' IO board (the other one I have is very early prototype and has some defects), I've not committed to any kind of case.

What I'm hoping to do is use a PCIe riser / ribbon cable (I have a number to try from depending on orientation and whether I want open ended x1 or a full x16) to have a horizontally (parallel) mounted board right on top of the Pi itself, so it would just be the footprint of the CM4 but a little taller.

msgilligan commented 3 years ago

@geerlingguy Yeah, it's time for me to get a 3D printer -- I'm up to three projects that I want to use it for, plus my 17-year-old son wants one. It seems like you're happy with your Ender 3 V2 -- would you recommend it?

geerlingguy commented 3 years ago

@msgilligan - So far it's been a great value; a couple things like auto-bed-leveling are not present but easy enough to add after the fact. The heated glass bed and everything else has been great for a beginner like me.

msgilligan commented 3 years ago

@geerlingguy The printer has arrived and I think RPi CM4 IO Board Bumper Case might be the first thing I print: https://www.thingiverse.com/thing:4805258

msgilligan commented 3 years ago

> @msgilligan - This is the one I'm using right now: https://pipci.jeffgeerling.com/cards_m2/mzhou-nvme-m2-ssd-m-key-adapter.html

This one won't fit in the metal enclosure:

(image)

Jonson26 commented 3 years ago

Chainsaw it.

msgilligan commented 3 years ago

This is the temporary solution:

(image)

geerlingguy commented 3 years ago

Honestly, that metal case should have cutouts where the PCIe card can hang out.

msgilligan commented 3 years ago

> Honestly, that metal case should have cutouts where the PCIe card can hang out.

Does anyone know of any (3D-printable or other) cases that work with the I/O board and an NVMe card?