@vebmaster - The one I used in the video was provided by Radxa; it should hopefully be available soon on their website.
@ThomasKaiser - For the benchmark monitoring, I ran the tests in three different conditions: 1. monitoring with `atop` at 2s intervals, 2. monitoring with `top` at 2s intervals, and 3. not running any tool to monitor resource usage at all during the benchmark. I do that third test because even though it should be minimal, injecting the calls to get monitoring data could conceivably impact performance (no matter how minimally).
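For anyone reproducing that setup, a minimal sketch of the kind of invocations involved (the 2-second interval comes from the description above; everything else is generic):

```
# in a second terminal, sample system state every 2 seconds while the benchmark runs
atop 2

# or, with plain top, also at a 2-second delay
top -d 2
```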
Nothing else was running during the benchmarks (I actually ran them all again without OMV installed at all) besides what is preinstalled in the lite Pi OS 64-bit image.
For the network file copy I used this rsync command, and I also set up a separate comparison (that I didn't fully document in this issue) where I used `rsync` to copy the folder and compared it against a Finder copy. The final result between rsync and Finder was within about 1s of each other (which surprised me... it felt like rsync was slower, just watching my network graph in iStat Menus). I repeated that test twice.
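The exact command isn't reproduced here; the following is only an illustrative sketch of that kind of copy to an SMB share mounted on macOS (the paths and share name are placeholders, not the command actually used):

```
# copy a test folder onto the SMB share mounted under /Volumes, showing progress
rsync -aP ~/benchmarks/testfolder/ /Volumes/taco-share/testfolder/
```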
I haven't done any more advanced checking of the SMB connection details, but it seems other people in the Pi community have noticed similar issues with Samba file copies not being as fast as they were a year or two ago.
I can confirm that the Samba performance dropped with the kernel upgrade from 5.4 to 5.10. I did lots of testing; going back to 5.4 always improved the performance.
@geerlingguy apologies for asking in the wrong issue, then deleting the comment here while you were already answering.
AFAIK neither `atop` nor `top` shows actual CPU clockspeeds? Do you always set the cpufreq governor manually to `performance` prior to testing, or do you trust the clockspeeds to be fully ramped up anyway?
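For reference, one common way to pin the clocks before a run on Raspberry Pi OS would look roughly like this (standard cpufreq sysfs paths; `vcgencmd` is Pi-specific):

```
# force all cores onto the performance governor for the duration of the test
echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

# verify the clock is actually ramped up while the benchmark is running
vcgencmd measure_clock arm
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
```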
And as already mentioned, block sizes matter in a world where network stacks implement auto-tuning of settings. You would either need a network sniffer or a tool that has control over the block size to know what you actually test (Helios LanTest, for example).
BTW: I'm in the wrong issue since my concerns are not about 'RPi samba performance degradation' but about the measured differences between the RTD1296 thing and the Taco. Without checking details like the negotiated block size (or smbd tunables) it's hard to interpret the numbers or attribute them to 'hardware' while they're in reality just different settings.
@ThomasKaiser heh, we just keep going back and forth! I've moved my comment body over to https://github.com/geerlingguy/raspberry-pi-pcie-devices/issues/162, let's keep the discussion there since it has more of the detailed benchmarking in that issue :)
@iandk just glanced through the referenced rpi issue. The drop in SMB write performance after some seconds looks like 'buffer full' situations.
In case you can still reproduce it this way with kernels more recent than 5.4 in combination with Samba, I would run `vmstat 2` (or maybe `vmstat 1` if the performance drop happens too fast) in a terminal on the RPi and start an SMB write. The `free`, `buff` and `cache` columns are interesting, as is `wa` on the right (that's %iowait).
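Something along these lines, run in a second SSH session while the copy is in progress:

```
# run on the RPi while the SMB write is going on
vmstat 2
# watch the 'free', 'buff' and 'cache' columns under memory,
# and 'wa' (%iowait) at the far right under cpu
```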
I think the current state is that it's always stuck at around 70-80 MB/s write, instead of the stable 112 MB/s with kernel 5.4.
Well, I fired up my RPi 4B (early model with just 1 GB) to check for this (5.10 kernel). No Windows involved, but with macOS 12 as the SMB client I get ~100 MB/s in both directions, which is perfectly fine to me...
You should remove all of those cache settings: no send/receive file, no caching. With 5.4 I was able to get a sustained 112 MB/s in both directions.
> You should remove all of those cache settings: no send/receive file, no caching.
Why exactly should I do that?
The Pi 4 was (with kernel 5.4) able to saturate the 1 Gbit/s port without a problem; you don't need a cache. This will just result in performance drops over time as you transfer larger files.
Additionally, lots of users have exactly the same issue; the current performance is quite a bit slower than a few months ago. Even one of the devs could confirm the problem, see my GitHub issue.
I still think that this is related to the kernel update 5.4 -> 5.10.
> Additionally, lots of users have exactly the same issue; the current performance is quite a bit slower than a few months ago.
Yeah, I heard this. Jeff also said the same. And everybody refuses to do active benchmarking and look into the settings. You said above:

> it's always stuck at around 70-80 MB/s

I gave it a try and have to report: nope, it's not. I truly believe that you had more before and that 70-80 MB/s is pathetic, especially given how Windows Explorer has accelerated SMB transfers since Windows 7. But what's the point in using bad settings and not actively benchmarking stuff to get a clue what has changed? :)
As for "remove all of those cache settings, no send/ receive file, no caching" and "you don't need a cache. This will just result in performance drops over time, as you transfer larger files." I don't understand what you mean (but it doesn't matter that much anyway since I'm fine with the performance I get with '5.10' or as I would call it 'appropriate settings'). There's the write cache size = 524288
setting which man smb.conf
helps to explain. Nothing will slow down with 'large files' with this sort of cache 512K in size.
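For context, that parameter lives in the share definition; a minimal sketch (the share name and path are placeholders, only the cache size value comes from the discussion above, and newer Samba releases may no longer support the option at all):

```
# /etc/samba/smb.conf excerpt -- share name and path are placeholders
[nas]
   path = /mnt/storage
   read only = no
   # 512 KiB per-file write cache inside smbd
   write cache size = 524288
```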
I did lots of testing and benchmarking a few months ago. Tested different clients, drives, Samba versions, kernel versions, USB enclosures... etc.
But again, there's no reason to use a cache, as it was always running perfectly fine and hitting the Gbit limits without one.
The `rsync` that ships with macOS is absurdly slow (it is a bit older than the latest, too). Finder copies were going like 4x faster than CLI copies with `rsync`.
Have you had a look with e.g. `tshark` at which chunk size data has been transmitted with either version? IIRC older `rsync` versions used something as low as 700 bytes, while newer ones go up to 128K or something like this...
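A rough way to look at that with `tshark` (the interface name and the SMB port are assumptions; rsync over SSH would use port 22 instead):

```
# print the TCP payload size of every segment on the SMB connection
sudo tshark -i eth0 -f 'tcp port 445' -T fields -e tcp.len
```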
I have a Radxa Taco and I would like to put it in a case. Since no official case exists yet, my idea is to buy a small PC case made for mATX or ITX mobo NAS, and use extender cables to connect the drives from the bays to the Radxa Taco.
Do the two mounting holes in the Radxa Taco board line up with the standoffs of a case made for mATX or ITX, so that I can screw the board in place inside such a case?
@geerlingguy Thanks for the videos about the Radxa Taco!
I have mine fitted with a CM4 and it is running very stably so far. However, I have had problems with the screws supplied for the CPU cooler being too short, and I can't get the RTC to work.
Has anyone gotten the RTC chip to work? There is also a question from another user in the forum that has been unanswered since February, and I have not been able to address the chip either.
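In case it helps someone else debug this, a generic starting point (the bus number, chip name and overlay below are guesses, not taken from the Taco documentation):

```
# see whether the RTC answers on the I2C bus at all (bus 1 is only a guess)
sudo i2cdetect -y 1

# if a supported chip shows up, an overlay in /boot/config.txt can bind the
# kernel driver -- 'ds1307' here is purely an example chip name
dtoverlay=i2c-rtc,ds1307
```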
I'm late to this discussion, but will we ever be able to boot from M.2 NVMe on the Radxa Taco?
See original issue: https://github.com/geerlingguy/raspberry-pi-pcie-devices/issues/202
I have a Taco (well, the Penta main board that goes inside) and would like to do some testing on it; run some benchmarks, test compiling ZFS, etc.
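For the ZFS-compile part, a rough sketch of the usual out-of-tree build on Raspberry Pi OS (the package list is the generic Debian one, not something specified in this thread):

```
# build prerequisites plus the Pi kernel headers
sudo apt install build-essential autoconf automake libtool gawk \
  uuid-dev libblkid-dev libssl-dev zlib1g-dev libudev-dev libaio-dev \
  libattr1-dev libelf-dev python3 python3-dev raspberrypi-kernel-headers

# fetch and build OpenZFS from source
git clone https://github.com/openzfs/zfs.git
cd zfs
sh autogen.sh
./configure
make -j"$(nproc)"
```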
Things to test: