geerlingguy opened 3 months ago
Could you please share some information about the tool you're using for I/O benchmarking?
He's using https://raw.githubusercontent.com/geerlingguy/pi-cluster/master/benchmarks/disk-benchmark.sh, called as explained in any of his sbc-review issues, e.g. this one.
There are at least three problems with this script, one of them a major one:

- Sequential read performance is measured with `fio` and 4 jobs in parallel while sequential write performance is measured with `iozone` in a different fashion, and as such both numbers don't match (just tested: when `fio` reports 85 MB/s sequential reads with 4 concurrent jobs, `iozone` will measure just ~75 MB/s with a single job). `fio` does allow for non-destructive write testing, also with 4 concurrent jobs (which BTW is a synthetic benchmark scenario not matching real-world situations of SBC users); no idea why Jeff doesn't switch to either `fio` for writes as well (creating another unrealistic number) or to `iozone` for both numbers (all that's needed is another `-i 1` added to the command line)
- `iozone` tests will not show real performance on many devices
- `disk-benchmark.sh` is in reality `disk-settings-benchmark.sh` since it trusts whatever (stupid) settings the OS image is running with. To talk about disk performance, a switch to the `performance` governor would be needed prior to execution [1]
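For illustration, an `iozone` invocation along these lines would cover sequential writes and reads in one consistent run. The flags shown are standard `iozone` options; the exact file size and record sizes used by the script are an assumption here, not taken from it:

```
# -e: include flush times, -I: use O_DIRECT to bypass the page cache
# -i 0 = write/rewrite, -i 1 = read/reread, -i 2 = random read/write
# -s/-r values below are illustrative only
iozone -e -I -a -s 100M -r 4k -r 1024k -i 0 -i 1 -i 2
```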
Quick test on a Rock 5 ITX with a 256 GB EVO Plus A2 SD card comparing three different settings:
`performance` (this represents 'storage performance w/o settings involved'):

```
READ: bw=87.2MiB/s (91.4MB/s), 87.2MiB/s-87.2MiB/s (91.4MB/s-91.4MB/s), io=999MiB (1048MB), run=11459-11459msec

                                                  random    random
          kB  reclen    write  rewrite    read    reread      read     write
      102400       4     2848     2924                       12238      2971
      102400    1024    62283    62087                       77176     61358
```
In contrast, Radxa's defaults since 2022 and Armbian defaults until 2024: `ondemand` with `io_is_busy=1`:

```
READ: bw=81.4MiB/s (85.4MB/s), 81.4MiB/s-81.4MiB/s (85.4MB/s-85.4MB/s), io=935MiB (980MB), run=11482-11482msec

                                                  random    random
          kB  reclen    write  rewrite    read    reread      read     write
      102400       4     2838     2940                       11663      2921
      102400    1024    60790    62549                       77639     60492
```
We see small drops in performance everywhere and also a bit of results variation, since 2940 KB/s with `ondemand` compared to 2924 KB/s with `performance` can't be the result of settings: no other governor can 'outperform' `performance`.
Retesting with `schedutil`, since it's the new Armbian default from 2024 on and also what many SBC vendors might be using, since for their OS images they usually don't think a single second about kernel config but just ship with the Android kernel the SoC vendor has thrown at them:

```
READ: bw=85.1MiB/s (89.3MB/s), 85.1MiB/s-85.1MiB/s (89.3MB/s-89.3MB/s), io=978MiB (1026MB), run=11490-11490msec

                                                  random    random
          kB  reclen    write  rewrite    read    reread      read     write
      102400       4     2062     2193                        8973      2165
      102400    1024    54671    53655                       61013     54159
```
Compared to `ondemand` with the respective tweaks, the important 4K performance dropped by 25%; with larger block sizes it's not that drastic, and the `fio` test with the unrealistic 4 concurrent read jobs even improves (but since we haven't measured at least 3 times we have no idea whether these different numbers are due to different settings or, more probably, 'results variation'. Running a benchmark only once is almost always wrong: it has to be repeated at least three times, then the standard deviation has to be calculated, and if it's too high, either more measurements are needed or the results go into the trash).
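As a sketch of that 'repeat and check deviation' rule, the mean and standard deviation of a few runs can be computed with plain `awk` (the three numbers below are made up, standing in for three runs of the same test):

```shell
# Three hypothetical MB/s results from repeating the same benchmark
runs="85.2 84.9 85.5"

# Mean and (population) standard deviation via awk
echo "$runs" | awk '{
  for (i = 1; i <= NF; i++) { sum += $i; sumsq += $i * $i }
  mean = sum / NF
  sd = sqrt(sumsq / NF - mean * mean)
  printf "mean=%.2f sd=%.2f\n", mean, sd
}'
# prints: mean=85.20 sd=0.24
```

If `sd` is a large fraction of `mean`, more runs are needed before quoting a number.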
But what these synthetic benchmarks don't tell anyway: real-world storage performance, which is easily halved by the switch to `schedutil`, since unlike benchmarks with continuous storage access, where the cpufreq driver has a chance to ramp up clockspeeds, in real-world situations the clockspeeds will remain low when only short I/O accesses happen. That's what you get when you switch a central setting without any evaluation and obviously 'just for fun' :)
At least it should be obvious that `disk-benchmark.sh` in its current form is not able to report disk performance, only 'disk performance tampered with by some default settings'.
One might argue that using 'OS defaults' would be the right thing since that's what vendors ship and users have to live with, but as someone who only does 'active reviews' (not just reporting numbers but improving numbers) I can't disagree more, since the best idea is to run the test in both modes, OS image defaults vs. `performance`, and then point the OS image makers at the difference and hint at how to fix it (worked all the time, just not with the Banana Pi and Armbian guys).
[1]

```shell
for Cluster in /sys/devices/system/cpu/cpufreq/policy* ; do
    [[ -e "${Cluster}" ]] || break
    echo performance >"${Cluster}/scaling_governor"
done
```
@ThomasKaiser - To properly benchmark storage solutions, you need to do a lot more than I think either of us do in a typical review cycle for a storage device.
In my case, when it actually matters, I will test across different OSes with 100+ GB files, with folders with 5,000+ small files, and average the eventual total time for the copy back and forth.
The disk-benchmark.sh script is a quick way to get a 'with the default OS image, in ideal circumstances, with smaller files, here's the kind of performance one can expect' number. There are huge differences depending on whether you use ext3/ext4, ZFS, Btrfs, Debian, Ubuntu, a board vendor's custom distro, or the `performance` vs. `ondemand` governors (which can change behavior even depending on the distro / image you might be using). It's a fool's game making definitive statements based on any single benchmark, which is why I only use the `disk-benchmark.sh` script for a quick "here's what this board does" thing.
And I do think it's useful to not sit there tweaking and tuning the default distro image for best performance, because I want my tests to reflect what most users will see. If they buy a Raspberry Pi, they will go to the docs and see they should flash Pi OS to the card using Imager.
The docs don't mention setting the `performance` governor, so I don't do that in my "official" benchmarks. I follow the vendor guides as closely as possible, and if their own images are poorly optimized, that's not a 'me' problem. And I'm happy to re-run the entire gauntlet if a vendor reaches out, like Turing Pi did with the RK1.
@geerlingguy That doesn't change anything wrt the different testing methodology for sequential reads and writes. In case you accept https://github.com/geerlingguy/pi-cluster/pull/12 this will become obvious with future tests, and then you might decide to adjust your reporting or not :)
True; honestly my main concern is to have a few different tests since I know many people just throw hdparm at it and call it a day. I like `fio` and `iozone` a lot better, though I have yet to find a way to test all aspects of ZFS filesystems in a way that ZFS caching doesn't interfere (I wish there were a way to tell ZFS 'fill all caches, then run the test', instead of having to copy across tens of GB of files before starting to get more useful data).
> True; honestly my main concern is to have a few different tests since I know many people just throw hdparm at it and call it a day.
Correct, that's the garbage the majority of 'Linux tech youtubers' rely on. Ignoring (not knowing) that `hdparm` uses a 128K block size, which was huge when it was hardcoded (last century) but is a joke today.
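For contrast, a large-block read test on the same device could look like this (a sketch only: `/dev/sda` is a placeholder, and the `fio` flags shown are standard options, not taken from any particular script):

```
# hdparm's timed read test uses a fixed 128K block size internally
hdparm -t /dev/sda

# A read-only fio run with a modern 1M block size for comparison
fio --name=seqread --filename=/dev/sda --rw=read --bs=1M \
    --direct=1 --runtime=10 --time_based --readonly
```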
> I like `fio` and `iozone` a lot better
Both are storage performance tests, unlike `hdparm` (whose benchmarking capabilities were a tool for kernel developers 30 years ago, when only spinning rust existed, attached via dog-slow interfaces).
> though I have yet to find a way to test all aspects of ZFS filesystems in a way that ZFS caching doesn't interfere
Simple solution: avoid ZFS for benchmarks and try to educate your target audience about the ZFS benefits (spoiler alert: they don't want this content ;) )
re: ZFS: Avoiding it is impossible if you want to show people what kind of performance you get on a modern NAS, since it seems like half the homelab world is focused on ZFS, and the other half a split between old school RAID (mdadm), Btrfs, and all the weird unholy things proprietary vendors cobble together (like Unraid).
Also, if you don't mention ZFS when talking about storage, you end up with so many random comments about 'why not ZFS', it's the modern homelab equivalent to 'btw I use Arch' or 'why don't you use [vim|emacs|nano]?' :D
Unavoidable, unfortunately!
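For anyone who does want to benchmark through ZFS anyway, one partial workaround (a sketch; `tank/bench` is a hypothetical dataset name, and this wasn't tested in this thread) is to restrict ARC caching on a scratch dataset so reads actually hit the disks:

```
# Create a throwaway dataset and tell the ARC to cache only metadata for it,
# so benchmark reads come from disk instead of RAM ('primarycache' property)
zfs create tank/bench
zfs set primarycache=metadata tank/bench
# ... run the benchmark against /tank/bench, then clean up:
zfs destroy tank/bench
```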
Anyway, I plan on deploying this HAT as a replica target for my main ZFS array... we'll see how that works out! Still looking to find a case for it. Too lazy to CAD my own heh
If you still have this on-hand and would be willing to make a few measurements... How thick of a 2.5" drive can be mounted directly to the hat? Modern 2.5" SSDs are typically 7mm thick, while (high-capacity) 2.5" HDDs (lower speed but much cheaper per-TB) usually come in at 15mm thick. Are those too fat to stack all 4 slots?
@axiopaladin

> And `rsync` was also highly consistent, but slower still, at like 27 MB/sec (using `-avz`):
Just as with Finder, rsync in stock macOS is kind of old and kind of trash:
```
% /usr/bin/rsync --version
rsync version 2.6.9 protocol version 29
Copyright (C) 1996-2006 by Andrew Tridgell, Wayne Davison, and others.
```
It's old enough to vote! It transfers a random 3.5G folder off my Desktop in 2 minutes and 38 seconds:
```
/usr/bin/rsync -avz source_folder /Volumes/home/test1  105.90s user 4.97s system 70% cpu 2:38.20 total
```
Homebrew provides a much newer version:
```
% /opt/homebrew/bin/rsync --version
rsync version 3.2.7 protocol version 31
Copyright (C) 1996-2022 by Andrew Tridgell, Wayne Davison, and others.
```
Which transfers the same folder off my Desktop in 1 minute and 23 seconds:
```
/opt/homebrew/bin/rsync -avz source_folder /Volumes/home/test2  18.36s user 8.67s system 32% cpu 1:23.53 total
```
I mean, that's still only like half of 1G line speed here, but better than 27 MB/s. I think I ended up having to use something like Get Backup Pro to really get the most performance out of my longer synchronization tasks - it's only another 10 seconds faster in this 3.5G test, but that compounds over hundreds of gigs of data.
> `/usr/bin/rsync -avz source_folder /Volumes/home/test1  105.90s user 4.97s system 70% cpu 2:38.20 total`
>
> vs.
>
> `/opt/homebrew/bin/rsync -avz source_folder /Volumes/home/test2  18.36s user 8.67s system 32% cpu 1:23.53 total`
Almost twice as fast with less than half the CPU utilization hints at the block sizes chosen by the 1st `rsync` run being a lot smaller than in the 2nd run.
I personally find it hard to 'measure' with tools that may adjust block size depending on defaults/algorithms that have changed over time. And while `rsync` has a `-B`/`--block-size=BLOCKSIZE` option to force a higher block size, the upper limit is still 128K, which is way too small to saturate modern networks.
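For illustration, forcing that maximum delta-transfer block size looks like this (the paths are placeholders, mirroring the earlier tests; the 128K ceiling is the limit stated above):

```
# --block-size tops out at 128K (131072 bytes)
rsync -avz --block-size=131072 source_folder /Volumes/home/test3
```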
Does anyone know if you can boot from an SSD on the hat? I'm planning to buy one but I don't want to use a MicroSD or USB. Regards
@lolren - Right now, no.
Also Michael Klements did a build with the Penta SATA HAT, and released four variants of his 3D printable case, which is pretty nice looking!
@geerlingguy can you verify something I'm seeing?
When plugged in with the 12v DC barrel jack -- The external ATX power supply pins are live.
@celly - Yes, it looks like there's no backfeed protection on the board (tested with my multimeter just now, also confirmed live 12V on the 12V molex pin), so the +12v DC just passes through from the molex to the barrel plug. I would recommend against plugging two power supplies into the board at the same time!
Yeah, Radxa specifically calls out that you shouldn't plug the Pi and the HAT in at the same time on their Raspberry Pi page for the product, now. I can't say if that was always there or not, though.
@geerlingguy thanks for checking. Wanted to make sure I didn't have a bad board before I plugged drives in.
@pfriedel Yeah, that makes sense. But I'd have figured they would have a diode or something on it to protect those pins from being live. Since, if it is plugged in, even if you think the Pi is off, those pins are still live, as they seem to be connected directly to the 12V input.
On a plus side, I may take advantage of this, and use that to power a 12v fan.
Also, sorry to hijack, but this seems to be the best place for info on this right now, so one note about cases. In case anyone finds this and is looking for a case for it, the RADXA official case is not ready for primetime. I have spent the last 3 days fighting it, and each part leads to more headaches and disappointment.
The three major issues: there isn't any clearance for the PCIe cable, so to get the Pi and HAT into the case you'll damage the cable; the drive holder is clever, but doesn't allow for any airflow; and finally, it's designed for the top fan to come from their fan / OLED board, which isn't available.
Not to mention, unless you are very experienced with your 3D printer it is a very tough print, with lots of press fits with zero room for error, and also tabs and screw holes that are not meant for PLA. It really is more meant for mass production and not hobby printing.
If you need a case, I'd start with Michael Klements one from this comment until someone has a chance to tweak the official one.
Yeah, for what it's worth I think there are 3 options for connecting a fan:
And boy howdy, you want a fan if you put this in a case for the Pi if nothing else. I stuck a random heatsink from a Pi4 heatsink kit onto the SATA controller chip after chopping off two of the five fins and it just barely clears two 9.5mm drives. Is it doing any good? Who knows, but it probably isn't hurting. On the other hand, I don't think this is a new chip for Radxa, and people have been building NASes off of their hardware for a while, so maybe it just runs warm normally.
Oh, it should also be noted that 15mm drives do not fit. 9.5 is fine if tight. I don't have any 12.5mm drives to test if they fit the spacing or not, unfortunately.
Update: That header is definitely JST PHD 2x5, works like a charm, although the PWM fan I have is about as noisy at 50% as my Noctua 4010 5v is at 100%. I should have figured. And the 5v PWM 4010 is on backorder everywhere. Maybe the 4020 will fit...
@pfriedel That is all great information. Thank you so much for taking the time.
Probing the JST and the 10-pin looking for a solid 5v connection is what started all of this for me. The weird fan connector threw me for a loop -- nothing I had fit it. I saw 5v, but not sure how much power I could safely pull through it as I want to use a larger fan.
The 10-pin is interesting since it has I2C and GPIO pass thru on it along with 5v, but I didn't want to tap into that yet, since adding a cheap OLED display is too tempting.
Once I saw that there was 12v on the molex, I decided to go with a 12v 80mm ultra quiet noctua fan that can be run at 1000rpm, and a female to 3-pin molex connector. It is stupid. But it'll be the best type of stupid -- quiet stupid.
The case you designed is awesome -- really great job. I wish you had sent me that case a few hours ago before I decided to design my own.. 🤣 The one I just finished is a bit more "Server" as I decided to not expose the HDMI ports in exchange for a wider case with a larger fan. If it works, I'll share it in the next few days after the fan gets here.
Update: The 12v molex works like a dream for the fan. I published my case using it here -- https://cults3d.com/en/3d-model/gadget/pi-5-nas-tower-for-radxa-hat-with-option-noctua-fan
Radxa sells an updated version of their Penta SATA HAT for $45, and it includes four SATA drive connectors, plus one edge connector for a 5th drive, 12V power inputs (molex or barrel jack) to power both the drives and the Pi 5 via GPIO, a cable for the 5th drive, an FFC cable to connect the HAT to the Pi 5, and screws for the mounting.
It looks like the SATA controller is a JMB585 PCIe Gen 3 x2 SATA controller, so it could benefit from running the Pi 5's PCIe lane at Gen 3.0 speed (setting `dtparam=pciex1_gen=3` in `/boot/firmware/config.txt`). Radxa sent me a unit for testing.
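For reference, the change amounts to adding one line to the config (Raspberry Pi OS Bookworm keeps the file at `/boot/firmware/config.txt`; older releases used `/boot/config.txt`):

```
# /boot/firmware/config.txt
# Run the Pi 5's external PCIe connector at Gen 3.0 instead of the default Gen 2.0
dtparam=pciex1_gen=3
```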