geerlingguy / raspberry-pi-pcie-devices

Raspberry Pi PCI Express device compatibility database
http://pipci.jeffgeerling.com
GNU General Public License v3.0

Pi 5 HAT: Radxa Penta SATA HAT #615

Open geerlingguy opened 3 months ago

geerlingguy commented 3 months ago

Radxa sells an updated version of their Penta SATA HAT for $45. It includes four SATA drive connectors plus one edge connector for a fifth drive, 12V power inputs (Molex or barrel jack) that power both the drives and the Pi 5 via GPIO, a cable for the fifth drive, an FFC cable to connect the HAT to the Pi 5, and screws for mounting.

radxa-penta-hat

It looks like the SATA controller is a JMB585 (PCIe Gen 3 x2), so it could benefit from running the Pi 5's PCIe lane at Gen 3.0 speeds (setting dtparam=pciex1_gen=3 in /boot/firmware/config.txt). Radxa sent me a unit for testing.
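
For reference, the relevant lines in /boot/firmware/config.txt look like this (a minimal sketch; per the usage notes further down, dtparam=pciex1 was also needed before the HAT was recognized at all):

    # /boot/firmware/config.txt
    # Enable the external PCIe connector (the HAT wasn't recognized without this)
    dtparam=pciex1
    # Run the lane at PCIe Gen 3.0 speeds (the default is Gen 2.0)
    dtparam=pciex1_gen=3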

geerlingguy commented 3 months ago

It's on the site now: https://pipci.jeffgeerling.com/hats/radxa-penta-sata-hat.html

I'll be testing and benchmarking soon!

geerlingguy commented 3 months ago

Some usage notes:

  • I had to add dtparam=pciex1 to /boot/firmware/config.txt to get the HAT to be recognized
  • I could also run it at PCIe Gen 3.0 speeds with dtparam=pciex1_gen=3
  • To get the HAT to fit on top of the Pi 5 with an active cooler, I had to use needle-nose pliers to break off the tops of the three heat sink fins in the corner closest to the Pi's USB-C port. Otherwise the barrel jack would hit the tops of those fins and the HAT would not make full contact with the GPIO pins
  • I could get 800+ MB/sec at Gen 3.0 speeds with an array of four Samsung 8TB QVO SSDs
  • I could get 74 MB/sec writing to a RAIDZ1 array over Samba (using OMV)
  • I could get 97 MB/sec writing to a RAID 0 array over Samba (using bare Linux)
  • I could get 122 MB/sec reading from either array over Samba on the Pi's built-in 1 Gbps network interface
  • I could get 240 MB/sec reading from either array over Samba on a HatNET! 2.5G adapter from Pineberry Pi (this was plugged into a HatBRICK! Commander PCIe Gen 2.0 switch, with one port to the 2.5G HAT and one to the Radxa Penta SATA HAT)
  • Idle power consumption with just the Penta SATA HAT was 6W
  • Idle power consumption including the PCIe switch and 2.5G NIC was 8W
  • Power consumption during disk read/write operations over the network was between 8 and 16W
  • Peak power consumption, while ZFS was doing some sort of cleanup or compression work, was 24W

geerlingguy commented 3 months ago

One concern could be heating—the JMB585 SATA controller chip hit peaks of 60+°C in my testing:

radxa-penta-sata-hat-22

There is an official fan/OLED board, and that seems like it would be a wise choice for this build. It appears to require the matching case, which has been announced but isn't available anywhere right now. See: https://forum.radxa.com/t/penta-sata-hat-is-now-available/20378

geerlingguy commented 3 months ago

And here's an illustration of the three heatsink fins I had to break off to get the HAT to fit:

radxa-penta-sata-hat-13

geerlingguy commented 3 months ago

I've also been monitoring IRQs and CPU affinity while doing network copies (the writes, specifically), and nothing really jumps out to suggest a bottleneck there (I'm reminded of this old Raspberry Pi Linux issue):

Screenshot 2024-04-03 at 10 42 28 AM

This was in the middle of a 50 GB folder copy to a ZFS array. It is averaging 70 MB/sec or so, which is a fair bit less than line speed over the gigabit connection :(

@ThomasKaiser had suggested over in the Radxa forum there could be some affinity issues with networking on the Pi 5, but I don't see that via atop at least...

geerlingguy commented 3 months ago

Monitoring the CPU frequency with vcgencmd measure_clock arm, I do see it dipping down now and then, but mostly staying stable at 2.4 GHz (frequency(0)=2400033792). I will try the performance governor and see if that gives any more consistent write speeds over the network.

I rebooted with force_turbo=1 in /boot/firmware/config.txt, and performed another copy (confirming the frequency was pegged at 2.4 GHz the whole time)... no difference. Still averaging around 70 MB/sec.
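
For anyone following along, the monitoring/governor commands are along these lines (a sketch; the cpufreq sysfs path is the standard Linux one, and the governor change doesn't persist across reboots):

    # Watch the ARM core clock once per second
    watch -n 1 vcgencmd measure_clock arm

    # Switch every core to the 'performance' governor
    for gov in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
        echo performance | sudo tee "$gov"
    done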

Here's htop as well:

Screenshot 2024-04-03 at 10 58 04 AM

And btop since it's pretty and shows similar data to atop in a more pleasing layout:

Screenshot 2024-04-03 at 11 01 54 AM
geerlingguy commented 3 months ago

I also tried NFS instead of Samba, by enabling it in OMV, creating a share 'shared', and connecting via Finder at nfs://10.0.2.214/export/shared (I had to check on the Pi what the exports were, using showmount -e localhost).
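
For reference, OMV publishes its NFS shares under /export, so a hand-written /etc/exports entry would look something like this (illustrative only; OMV generates and manages the real entries and options itself):

    # /etc/exports (illustrative)
    /export/shared 10.0.2.0/24(rw,sync,no_subtree_check)

    # After editing by hand, re-export with:
    #   sudo exportfs -ra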

The copy was more stable around 82 MB/sec, but still no sign of a clear bottleneck in atop.

Screenshot 2024-04-03 at 11 21 47 AM

Unlike Samba, it looked like the nfsd process was pinned to CPU0, and atop showed the IRQ affinity was all on core 0. That core still seemed to have plenty of headroom: IRQ % never topped 20%, core 0 usage stayed under 25%, and the CPU as a whole never went above 50% during the copy:

Screenshot 2024-04-03 at 11 20 05 AM

I'm also going to try enabling compression on the ZFS pool, since it seems like I have plenty of CPU on the Pi to handle it, and compression can actually speed up writes through to the disk (though I don't think that's the bottleneck at all... it's just something quick and easy to test).
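
Enabling it is a one-liner (assuming the pool/dataset naming from the iozone runs below; lz4 is the usual low-overhead choice, and it only applies to newly written data):

    # Enable lz4 compression on the dataset
    sudo zfs set compression=lz4 tank/shared
    # Confirm it took
    zfs get compression tank/shared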

Result: ZFS compression seems to make no difference. There's some more ZFS process CPU consumption, but overall the speed averages around the same 70 MB/sec...

Screenshot 2024-04-03 at 11 40 55 AM
ThomasKaiser commented 3 months ago

What about

echo 1 >/sys/devices/system/cpu/cpufreq/ondemand/io_is_busy
echo default > /sys/module/pcie_aspm/parameters/policy
geerlingguy commented 3 months ago
pi@pi-nas:~ $ sudo su
root@pi-nas:/home/pi# echo 1 >/sys/devices/system/cpu/cpufreq/ondemand/io_is_busy
root@pi-nas:/home/pi# echo default > /sys/module/pcie_aspm/parameters/policy
root@pi-nas:/home/pi# cat /sys/devices/system/cpu/cpufreq/ondemand/io_is_busy
1
root@pi-nas:/home/pi# cat /sys/module/pcie_aspm/parameters/policy
[default] performance powersave powersupersave 

Still seeing the same sporadic performance:

Screenshot 2024-04-03 at 1 08 01 PM

(force_turbo is still set on, and clocks are still measuring at 2.4 GHz.)

geerlingguy commented 3 months ago

More stats on the share from the macOS client:

$ smbutil statshares -a

==================================================================================================
SHARE                         ATTRIBUTE TYPE                VALUE
==================================================================================================
--------------------------------------------------------------------------------------------------
shared                        
                              SERVER_NAME                   10.0.2.214
                              USER_ID                       501
                              SMB_NEGOTIATE                 SMBV_NEG_SMB1_ENABLED
                              SMB_NEGOTIATE                 SMBV_NEG_SMB2_ENABLED
                              SMB_NEGOTIATE                 SMBV_NEG_SMB3_ENABLED
                              SMB_VERSION                   SMB_3.1.1
                              SMB_ENCRYPT_ALGORITHMS        AES_128_CCM_ENABLED
                              SMB_ENCRYPT_ALGORITHMS        AES_128_GCM_ENABLED
                              SMB_ENCRYPT_ALGORITHMS        AES_256_CCM_ENABLED
                              SMB_ENCRYPT_ALGORITHMS        AES_256_GCM_ENABLED
                              SMB_CURR_ENCRYPT_ALGORITHM    OFF
                              SMB_SIGN_ALGORITHMS           AES_128_CMAC_ENABLED
                              SMB_SIGN_ALGORITHMS           AES_128_GMAC_ENABLED
                              SMB_CURR_SIGN_ALGORITHM       AES_128_GMAC
                              SMB_SHARE_TYPE                DISK
                              SIGNING_SUPPORTED             TRUE
                              EXTENDED_SECURITY_SUPPORTED   TRUE
                              LARGE_FILE_SUPPORTED          TRUE
                              FILE_IDS_SUPPORTED            TRUE
                              DFS_SUPPORTED                 TRUE
                              FILE_LEASING_SUPPORTED        TRUE
                              MULTI_CREDIT_SUPPORTED        TRUE
                              MULTI_CHANNEL_SUPPORTED       TRUE
                              SESSION_RECONNECT_TIME        2024-04-03 13:09:10
                              SESSION_RECONNECT_COUNT       1

And Samba version on the Pi:

root@pi-nas:/home/pi# /usr/sbin/smbd --version
Version 4.17.12-Debian
ThomasKaiser commented 3 months ago

Do you only do Finder copies (AKA 'network + storage combined', plus various unknown 'optimization strategies'), or have you already tested network and storage individually? A quick iperf3 / iperf3 -R run between the Mac and the RPi, plus iozone on your array, should be enough.

And for Samba performance I came up with these settings when I wrote the generic 'OMV on SBC' install routine over half a decade ago: https://github.com/armbian/build/blob/e83d1a0eabcc11815945453d58e1b9f4e201de43/config/templates/customize-image.sh.template#L122
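
Roughly, it's a handful of [global] tunables along these lines (illustrative values only, not a verbatim copy; see the link for the real settings):

    [global]
    # Avoid Nagle delays on small SMB packets
    socket options = TCP_NODELAY
    # Let smbd use sendfile() for reads
    use sendfile = yes
    # Hand larger writes to recvfile/splice where supported
    min receivefile size = 16384
    # Cache getwd() results
    getwd cache = yes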

geerlingguy commented 3 months ago

I've tested iperf3 on this setup a few times: on the 1 Gbps port I get 940 Mbps up and 940 Mbps down (using --reverse), and iozone with a 10 GB file size at a 1M block size gets me 1.5 GB/sec random read and 1.5 GB/sec random write (obviously inflated by ZFS caching up front).

I set it to 50 GB to try to bypass more of the cached speed (since this is an 8 GB RAM Pi 5):

    Iozone: Performance Test of File I/O
            Version $Revision: 3.492 $
        Compiled for 64 bit mode.
        Build: linux-arm 

    Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
                 Al Slater, Scott Rhine, Mike Wisner, Ken Goss
                 Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                 Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
                 Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
                 Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
                 Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer,
                 Vangel Bojaxhi, Ben England, Vikentsi Lapa,
                 Alexey Skidanov, Sudhir Kumar.

    Run began: Wed Apr  3 13:27:54 2024

    Include fsync in write timing
    O_DIRECT feature enabled
    Auto Mode
    File size set to 51200000 kB
    Record Size 1024 kB
    Command line used: ./iozone -e -I -a -s 50000M -r 1024k -i 0 -i 2 -f /tank/shared/iozone
    Output is in kBytes/sec
    Time Resolution = 0.000001 seconds.
    Processor cache size set to 1024 kBytes.
    Processor cache line size set to 32 bytes.
    File stride size set to 17 * record size.
                                                              random    random     bkwd    record    stride                                    
              kB  reclen    write  rewrite    read    reread    read     write     read   rewrite      read   fwrite frewrite    fread  freread
        51200000    1024  1667845  1719853                    1492992  1479166                                                                
geerlingguy commented 3 months ago

Testing from Windows 11 on the same network, reads maxed out at 110 MB/sec, just like on the Mac.

smb copy down 110mb-ps

Writes... are getting a consistent 108 MB/sec. (It did get a little more up-and-down around the halfway point, where the below screenshot was taken, but still averages above 105 MB/sec.)

smb write 108 mb-ps

Screenshot 2024-04-03 at 1 45 29 PM

Now I'm shaking my fist strongly at my Mac... why does Apple have to hate GPLv3 so much!? I'll try to find a way to see what's going on with macOS Finder. I've heard from @jrasamba that macOS might try using packet signing (see article), which could definitely result in different performance characteristics. Not sure about Windows 11's defaults.

Maybe I have to ditch using my Mac as the 'real world performance' test bed... with other networking stuff it's not an issue. And I know Finder's terrible... I just didn't think it was that terrible. :P

(@jrasamba also suggested watching this video on io_uring with some good general performance tips.)

ThomasKaiser commented 3 months ago

I've heard from @jrasamba that macOS might try using packet signing

You can check with smbstatus on the RPi (should be SMB3_11 and partial(AES-128-GMAC) as protocol revision and signing status with recent macOS versions) or on macOS with smbutil statshares -m /path/to/volume.

As for GPL or not: IIRC Apple always used an SMB client derived from *BSD. License issues only came into play for the SMB server component, when Apple replaced Samba with their own smbx. But they had another good reason: starting with 10.8 or 10.9, we were able to transfer Mac files flawlessly between Macs via SMB, since all the HFS+ attributes were properly mapped, unlike with Samba.

Tomorrow I'm about to set up a quick-and-dirty local Netatalk instance on an RPi 5 to restore a TM backup for a colleague, onto a MacBook that will be shipped to her. But since this sounds like fun, I might try the exercise with Samba instead and see whether the Samba tunables developed over half a decade ago are still important or not. No idea whether spare time allows...

geerlingguy commented 3 months ago

@ThomasKaiser - see above (https://github.com/geerlingguy/raspberry-pi-pcie-devices/issues/615#issuecomment-2035286025) — SMB_CURR_SIGN_ALGORITHM AES_128_GMAC does that mean it's enabled?

ThomasKaiser commented 3 months ago

SMB_CURR_SIGN_ALGORITHM AES_128_GMAC does that mean it's enabled?

Yes. Sorry, I hadn't seen the whole comment.

geerlingguy commented 3 months ago

I tried adding server signing = No to the smb.conf on the Pi (and restarting it), and also tried disabling signing on the macOS side:

printf "[default]\nsigning_required=no\n" | sudo tee /etc/nsmb.conf >/dev/null

It doesn't seem to make a difference, either in the file copy speed or in the smbutil statshares -a output... I'm not sure whether that setting is supposed to disable the signing, or whether the reporting is even accurate.

I also tried setting delayed_ack=0 on the Mac as suggested here:

$ sudo sysctl -w net.inet.tcp.delayed_ack=0
Password:
net.inet.tcp.delayed_ack: 3 -> 0

I unmounted and re-mounted the share, and I'm still seeing the same performance. (So I set it back to 3.)

I'm going to reboot the Pi and Mac entirely and try again. (I had just unmounted the share, restarted smbd on the Pi, and re-mounted the share.)

geerlingguy commented 3 months ago

Rebooting changed nothing, so I had a gander at the smb.conf documentation and found the client signing parameter, which might also need to be disabled:

client signing = disabled

I re-mounted the share, but it's still showing AES_128_GMAC as the current algorithm... however, I saw in this Reddit thread that the key indicator may be SIGNING_ON, which is not present in my output; that seems to indicate signing isn't actually enabled, and isn't the issue at all.
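
For reference, the signing-related settings I tried live in smb.conf's [global] section and look like this (note: per the smb.conf docs, client signing only governs Samba's own client tools like smbclient, not what a macOS client negotiates, so that one is probably a no-op here):

    [global]
    # Server side: don't offer/require SMB signing
    server signing = disabled
    # Only affects Samba's *client* tools, not macOS clients
    client signing = disabled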

geerlingguy commented 3 months ago

One last little nugget is I was debugging SMB via debug logging (easy enough to enable via OMV's UI), and I noticed there are actually two log files that are being written to when I'm working on my Mac:

-rw-r--r--  1 root root 229K Apr  3 14:51 log.10.0.2.15
...
-rw-r--r--  1 root root 431K Apr  3 14:54 log.mac-studio
-rw-r--r--  1 root root 1.1M Apr  3 14:50 log.mac-studio.old

I wonder if there's any possibility of SMB doing some kind of internal thrashing when it sees my Mac as both IP 10.0.2.15 and local hostname mac-studio?

ThomasKaiser commented 3 months ago

The net.inet.tcp.delayed_ack reference is the 'Internet at work': outdated stuff being copy&pasted over and over again :)

Signing could really be the culprit (I just searched through my OMV 'career'). Since I just replaced an M1 Pro MBP with an M3 Air, I checked the defaults (or what I believe the defaults are):

tk@mac-tk ~ % cat /etc/nsmb.conf 
[default]
signing_required=no

Not required doesn't mean disabled. Unfortunately I've only been on macOS 14 for a couple of days (being the lazy guy who tries to skip every other macOS release) and am not into all the details yet...

geerlingguy commented 3 months ago

Note that on my Mac, I didn't have anything in place in /etc/nsmb.conf (I had to create the file). And regarding delayed_ack, once I get through anything that makes sense, I enjoy throwing things at the wall and seeing what sticks. And if it doesn't, I can quickly revert ;)

Even with client signing = disabled, nothing changed in the mount, and I don't see SIGNING_ON TRUE, so I would assume it's not on (searching around, it looks like if signing is enabled for a share it will show up like that, and not just as SIGNING_SUPPORTED TRUE).

ThomasKaiser commented 3 months ago

I was debugging SMB via debug logging

This can and will harm SMB performance (bitten by this several times). But I guess you also tried it with log settings set to info and 'performance' was the same?

geerlingguy commented 3 months ago

This can and will harm SMB performance (bitten by this several times). But I guess you also tried it with log settings set to info and 'performance' was the same?

I only had it set to debug for about 3 minutes while I was replaying the copy, to get a snapshot of the log. Then set it right back to 'None' (which is the default in OMV). None of the performance data in this issue that I've posted was taken at any time when any smbd logging was enabled.

geerlingguy commented 3 months ago

Shakes fist at Apple:

Screenshot 2024-04-03 at 3 18 16 PM

If I just use Transmit to do an SFTP transfer (file transfer via SSH), I get a solid 115 MB/sec write speed. Going to test a file copy via Terminal straight to the SMB share next, to verify it's not some idiotic issue with Finder itself... stranger things have happened.
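
For the record, the Terminal test just means copying into the share's mount point under /Volumes; something like this (the source folder is a placeholder for whichever multi-GB project I'm copying):

    # macOS mounts Finder-connected SMB shares under /Volumes
    time cp -R ~/some-project-folder /Volumes/shared/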

ThomasKaiser commented 3 months ago

to verify it's not some idiotic issue with Finder itself

Maybe Apple's most idiotic software piece ever :)

Back in the days when network/storage testing was a huge part of my day job, I always used Helios LanTest: being deliberately limited in some ways (explained here), it clearly shows the performance differences/increases you're aiming for when debugging settings. Windows Explorer and Finder, by contrast, do a lot under the hood (parallelism, automagically tuned settings like block sizes) that masks basic network setting mismatches.

geerlingguy commented 3 months ago

Just using cp in the Terminal was slower (it started out at 82 MB/sec but dropped after a couple of minutes; nothing on the Pi side indicated any issues or caches filling up), but at least it was highly consistent in its transfer rate:

Screenshot 2024-04-03 at 3 21 58 PM

And rsync was also highly consistent, but slower still, at around 27 MB/sec (using -avz; the -z compression flag alone likely costs a lot of CPU on a fast local network):

Screenshot 2024-04-03 at 3 23 42 PM
geerlingguy commented 3 months ago

And if I use the 2.5 Gbps connection on the Pi 5 (through the HatNET! 2.5G), I can get a very consistent 150 MB/sec writing from my PC through to the four SATA SSDs:

2 5g copy 150 mb-ps

And it looks like at that speed, using a PCIe switch that's rated at Gen 2.0 limits the total throughput to 150 MB/sec because it just doesn't have the capacity for anything more:

Screenshot 2024-04-03 at 4 12 38 PM

I'd love to get a PCIe Gen 3 switch, but those are hard to find anywhere... and expensive when you do.

Read speed is the same as on macOS, around 225 MB/sec:

2 5g copy 250 mb-ps read

I'm not sure why writes are so flaky on macOS, while reads (from the Pi to the Mac) are consistent and on par with the Windows machine...

geerlingguy commented 3 months ago

More thoughts: macOS Finder is still bad at network file copies

geerlingguy commented 3 months ago

A few people in the video comments (https://www.youtube.com/watch?v=l30sADfDiM8) suggested testing a USB 3 2.5G network adapter instead of the PCIe switch + 2.5G HAT... I would like to try that and see if I can get maximum write performance on here.

So here's my new setup:

DSC05482

...however, it seems like even though dmesg shows r8152 SuperSpeed USB device Realtek USB 10/100/1G/2.5G LAN being recognized and initialized (load rtl8156a-2 v2 04/27/23 successfully), and I see the activity light blinking on the dongle, I am not getting a network connection (ip a shows end0, eth0, and wlan0...).

geerlingguy commented 3 months ago

Ah, because I have OMV installed, I had to run sudo omv-firstaid and enable the interface. Somehow the new adapter grabbed eth0 from the Pi's internal port, which seems to be why the networking stack was all confused.
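
One way to stop the interface names shuffling in the future would be a systemd .link file matching the dongle's MAC address. A sketch (untested with OMV in the mix; the MAC is a placeholder):

    # /etc/systemd/network/10-usb-25g.link
    [Match]
    # Placeholder MAC: use the dongle's real address
    MACAddress=aa:bb:cc:dd:ee:ff

    [Link]
    Name=eth25g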

I ran through the firstaid wizard, and now I'm getting an IP address and connection on the Plugable USB 2.5G adapter.

pi@pi-nas:~ $ ethtool eth0
Settings for eth0:
    Supported ports: [ TP    MII ]
    Supported link modes:   10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Half 1000baseT/Full
                            2500baseT/Full
    Supported pause frame use: No
    Supports auto-negotiation: Yes
    Supported FEC modes: Not reported
    Advertised link modes:  10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Full
                            2500baseT/Full
    Advertised pause frame use: No
    Advertised auto-negotiation: Yes
    Advertised FEC modes: Not reported
    Link partner advertised link modes:  10baseT/Half 10baseT/Full
                                         100baseT/Half 100baseT/Full
                                         1000baseT/Full
                                         2500baseT/Full
    Link partner advertised pause frame use: Symmetric Receive-only
    Link partner advertised auto-negotiation: Yes
    Link partner advertised FEC modes: Not reported
    Speed: 2500Mb/s
    Duplex: Full
    Auto-negotiation: on
    Port: MII
    PHYAD: 32
    Transceiver: internal
netlink error: Operation not permitted
        Current message level: 0x00007fff (32767)
                               drv probe link timer ifdown ifup rx_err tx_err tx_queued intr tx_done rx_status pktdata hw wol
    Link detected: yes

Testing with iperf3:

pi@pi-nas:~ $ iperf3 --bidir -c 10.0.2.15
Connecting to host 10.0.2.15, port 5201
[  5] local 10.0.2.218 port 55412 connected to 10.0.2.15 port 5201
[  7] local 10.0.2.218 port 55414 connected to 10.0.2.15 port 5201
[ ID][Role] Interval           Transfer     Bitrate         Retr  Cwnd
[  5][TX-C]   0.00-1.00   sec   218 MBytes  1.83 Gbits/sec    0    587 KBytes       
[  7][RX-C]   0.00-1.00   sec  45.8 MBytes   384 Mbits/sec                  
[  5][TX-C]   1.00-2.00   sec   214 MBytes  1.80 Gbits/sec    0    587 KBytes       
[  7][RX-C]   1.00-2.00   sec  31.5 MBytes   264 Mbits/sec                  
[  5][TX-C]   2.00-3.00   sec   215 MBytes  1.80 Gbits/sec    0    587 KBytes       
[  7][RX-C]   2.00-3.00   sec  35.9 MBytes   301 Mbits/sec                  
[  5][TX-C]   3.00-4.00   sec   215 MBytes  1.80 Gbits/sec    0    587 KBytes       
[  7][RX-C]   3.00-4.00   sec  34.2 MBytes   287 Mbits/sec                  
[  5][TX-C]   4.00-5.00   sec   213 MBytes  1.79 Gbits/sec    0    587 KBytes       
[  7][RX-C]   4.00-5.00   sec  30.4 MBytes   255 Mbits/sec                  
[  5][TX-C]   5.00-6.00   sec   215 MBytes  1.80 Gbits/sec    0    587 KBytes       
[  7][RX-C]   5.00-6.00   sec  33.4 MBytes   280 Mbits/sec                  
[  5][TX-C]   6.00-7.00   sec   214 MBytes  1.80 Gbits/sec    0    587 KBytes       
[  7][RX-C]   6.00-7.00   sec  32.5 MBytes   272 Mbits/sec                  
[  5][TX-C]   7.00-8.00   sec   215 MBytes  1.81 Gbits/sec    0    587 KBytes       
[  7][RX-C]   7.00-8.00   sec  35.3 MBytes   296 Mbits/sec                  
[  5][TX-C]   8.00-9.00   sec   214 MBytes  1.80 Gbits/sec    0    587 KBytes       
[  7][RX-C]   8.00-9.00   sec  32.5 MBytes   272 Mbits/sec                  
[  5][TX-C]   9.00-10.00  sec   215 MBytes  1.80 Gbits/sec    0    587 KBytes       
[  7][RX-C]   9.00-10.00  sec  38.0 MBytes   319 Mbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID][Role] Interval           Transfer     Bitrate         Retr
[  5][TX-C]   0.00-10.00  sec  2.10 GBytes  1.80 Gbits/sec    0             sender
[  5][TX-C]   0.00-10.00  sec  2.10 GBytes  1.80 Gbits/sec                  receiver
[  7][RX-C]   0.00-10.00  sec   349 MBytes   293 Mbits/sec                  sender
[  7][RX-C]   0.00-10.00  sec   349 MBytes   293 Mbits/sec                  receiver

iperf Done.

And here's a file copy to the Pi 5 over the 2.5G connection from Windows 11 (average of 270 MB/sec):

2 5g copy 270 mb-ps write wow

And here's a file copy from the Pi 5 over the 2.5G connection to Windows 11 (average of 200 MB/sec):

2 5g copy 200mb-ps read wow

It seemed like the write was not really bottlenecked, but the read was bottlenecked on disk I/O, of all things, according to atop. For some reason the drives were pegged at 115% utilization, and I saw ksoftirqd rising in the task list. Maybe an issue where all the I/O is going through one CPU core? I noticed the IRQs were pegged to core 0.

ThomasKaiser commented 3 months ago

Maybe an issue where all the IO is going through one CPU core? I noticed the IRQs were pegged to core0.

Hopefully soon to be resolved: https://github.com/raspberrypi/linux/issues/6077

And then someone needs to take the time/effort to develop sane IRQ affinity settings (like I did for Armbian ages ago, for the most part).
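
The mechanics themselves are just sysfs writes; it's choosing sane defaults per board that takes the effort. A sketch (the IRQ number is an example and has to be looked up per device):

    # Find the IRQ numbers for the NIC / SATA controller
    grep -iE 'eth|pcie|ahci' /proc/interrupts

    # Example: let IRQ 38 run on cores 1-3 instead of core 0
    echo 1-3 | sudo tee /proc/irq/38/smp_affinity_list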

justinclift commented 3 months ago

For some reason the drives were pegged and at 115% utilization

Maybe some kind of parity calculation?

belag commented 3 months ago

It seems to be out of stock everywhere? Any ideas who might have them for sale?

geerlingguy commented 3 months ago

I was told Arace had limited stock and is out, hopefully they will get a new shipment in soon...

geerlingguy commented 3 months ago

Someone suggested also trying one of the video editing benchmark tools, in this case AJA System Test Lite, run with 5120x2700 5K RED, 16 GB, 10-bit RGB (which uses a 52.75 MB IO size), to see if that will max out the write speed. I have also tested with Blackmagic Disk Speed Test in the past. This could show whether different media/copy types (besides a straight macOS Finder copy) behave differently.

ThomasKaiser commented 3 months ago

Could see if different media/copy types (besides a straight macOS Finder copy) behave differently.

The 'problem' with Finder is that it implements hidden optimization strategies (third time for the same link in this issue: https://www.helios.de/web/EN/support/TI/157.html). As such, testing with LanTest using the 'Backbone networks, e.g. 40 Gigabit Ethernet' preset gives more reliable numbers.

geerlingguy commented 3 months ago

@ThomasKaiser - I understand that, but that doc seems to indicate Finder should be more optimized for the types of copies I'm performing, whereas my experience suggests something is seriously wrong with the SMB implementation in recent macOS releases... or with some server/client negotiation. It's crazy (to me) that iperf, SFTP, and other more direct methods hit line speed with no issue, while Finder copies and cp/rsync/etc. over the SMB mount are so shaky and slow.

In this case, I'm actually less interested in the theoretical, and more interested in what's causing the inconsistency with real world use cases (I do a lot of project copying, and faster total copy time for 30-90 GB folders is better for me).

justinclift commented 3 months ago

To investigate the Finder problem from another angle, maybe something like macOS's equivalent to strace?

https://www.shogan.co.uk/devops/dtrace-dtruss-an-alternative-to-strace-on-macos/

belag commented 3 months ago

I was told Arace had limited stock and is out, hopefully they will get a new shipment in soon...

Thank you!

teodoryantcheff commented 3 months ago

I was told Arace had limited stock and is out, hopefully they will get a new shipment in soon...

Thank you, @geerlingguy !

jessepeterson commented 3 months ago

This is exciting. It looks like the JMB585 supports SATA port multipliers. Do you have any to test with? 20 drive arrays seem possible.

I could imagine a mini-ITX NAS case with 5.25" drive cages/backplanes all wired up. You'd just need a power supply with a physical switch and molex+SATA power connectors. And some sort of mini-ITX Pi mounting bracket, of course.

dragonfly-net commented 2 months ago

This is exciting. It looks like the JMB585 supports SATA port multipliers. Do you have any to test with? 20 drive arrays seem possible.

I could imagine a mini-ITX NAS case with 5.25" drive cages/backplanes all wired up. You'd just need a power supply with a physical switch and molex+SATA power connectors. And some sort of mini-ITX Pi mounting bracket, of course.

Maybe check with an eSATA port? But I think the speed will be low... maybe OK for magnetic HDDs, but not SSDs.

Ramog commented 2 months ago

I was told Arace had limited stock and is out, hopefully they will get a new shipment in soon...

Sad. I hope this happens soon; this seems like the perfect project for me. I've already got a Pi 5, so the Penta SATA HAT would be next. Even just as a device for copying and moving files between disks and whatnot, it would be incredibly useful.

teodoryantcheff commented 2 months ago

Arace have it in stock now. I just ordered two from https://arace.tech/products/radxa-penta-sata-hat-up-to-5x-sata-disks-hat-for-raspberry-pi-5

robson-paproski commented 2 months ago

Question: is this HAT compatible with the Orange Pi 5?

Riverside96 commented 2 months ago

@geerlingguy have you experimented with hibernation at all? I've ordered one regardless, but I don't see any mention of SATA 3.3 hibernation support in the documentation. My use case would be 3 HDDs with a small mirrored directory and periodic backups; keeping them running would not be an option for me. I can't seem to get an answer on the Discord server, and I'd like to order the drives.

ThomasKaiser commented 2 months ago

@geerlingguy since we were talking about Finder weirdness, and I'm currently testing SMB multichannel between a MacBook and a Rock 5 ITX... I just did a quick test with three files, 2.3 GB each, on a Samba 4.13 share with server multi channel support = yes:

Samba -> Mac: a constant 500+ MB/s:

multichannel-read

Mac -> Samba: very flaky numbers, with short bursts at 450+ MB/s but mostly nothing, and Finder waiting for who knows what:

multichannel-write

But note that the network setup is somewhat broken in the direction of the Rock 5 ITX anyway, so my Finder investigations will need to be revisited once that is resolved.
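
For reference, the server side of multichannel is just the single smb.conf switch mentioned above; the client also needs more than one usable link (or an RSS-capable NIC) before it will actually fan out:

    [global]
    # Advertise SMB3 multichannel to clients
    server multi channel support = yes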

geerlingguy commented 2 months ago

@ThomasKaiser - Thanks for posting that; that is definitely my experience (though usually without that much of a blip where there's no writing). Definitely something weird, and watching the Console log on the Mac is almost useless :P

ThomasKaiser commented 2 months ago

And one last word about macOS Finder: I seem to have resolved the storage problems by creating a FrankenRAID (mdraid-0 out of 4 SATA SSDs and one really crappy NVMe SSD), and with SMB multichannel I'm now getting rather consistent 600 MB/s in Finder in both directions:

rock5-itx-finder-copy-multichannel

Full story: https://github.com/ThomasKaiser/Knowledge/blob/master/articles/Quick_Preview_of_ROCK_5_ITX.md#smb-multichannel

So in case you're revisiting the issues you ran into, I'd strongly recommend leaving an iostat 10 running in the background to check %iowait. Yesterday, when writing to the RK3588 thingy with only the crappy NVMe SSD as the storage device, %iowait went up to 10% or even more in the TX direction. With the FrankenRAID everything is fine.
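
That is, something like this in a spare terminal (-x adds per-device utilisation next to the overall %iowait):

    # From the sysstat package; report every 10 seconds
    iostat -x 10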

cmonty14 commented 2 months ago

Some usage notes:

  • I had to add dtparam=pciex1 to /boot/firmware/config.txt to get the HAT to be recognized
  • I could also run it at PCIe Gen 3.0 speeds with dtparam=pciex1_gen=3
  • To get the HAT to fit on top of the Pi 5 with an active cooler, I had to use needle-nose pliers to break off the tops of the three heat sink fins in the corner closest to the Pi's USB-C port. Otherwise the barrel jack would hit the tops of those fins and the HAT would not make full contact with the GPIO pins
  • I could get 800+ MB/sec at Gen 3.0 speeds with an array of four Samsung 8TB QVO SSDs
  • I could get 74 MB/sec writing to a RAIDZ1 array over Samba (using OMV)
  • I could get 97 MB/sec writing to a RAID 0 array over Samba (using bare Linux)
  • I could get 122 MB/sec reading from either array over Samba on the Pi's built-in 1 Gbps network interface
  • I could get 240 MB/sec reading from either array over Samba on a HatNET! 2.5G adapter from Pineberry Pi (this was plugged into a HatBRICK! Commander PCIe Gen 2.0 switch, with one port to the 2.5G HAT and one to the Radxa Penta SATA HAT)
  • Idle power consumption with just the Penta SATA HAT was 6W
  • Idle power consumption including the PCIe switch and 2.5G NIC was 8W
  • Power consumption during disk read/write operations over the network was between 8 and 16W
  • Peak power consumption, while ZFS was doing some sort of cleanup or compression work, was 24W

Hi Jeff, could you please share some information about the tooling you use for I/O benchmarking? I think I've identified fio, but I assume you used some kind of "wrapper script" to run a series of qualified benchmarks.

Regards, Thomas