Thank you @geerlingguy. Small typo in the company name. Should be "Axzez". :) I'm sure that won't be the last time that happens, lol.
I'm here to answer questions if there are any. I have tried to post our board to reddit /r/raspberry_pi, /r/homelab and /r/DataHoarder, but our Axzez account is too new, so I'm just waiting to help over there.
@shanzez - Oops! Updated the name :)
@shanzez have you done some thermal measurements of the JMB585's surface when the controller is really busy for a long time? E.g. resilvering a RAIDz?
@geerlingguy I would suggest testing with 'checksummed' filesystems like ZFS/btrfs since data corruption can be spotted this way (a scrub will tell). Having seen overheating SATA port multipliers and controllers result in silent data corruption, this is at least something that should be checked for if the controller is missing a heatsink and is partially covered by the CM4...
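For example, assuming a ZFS pool named tank or a btrfs filesystem mounted at /mnt (names are placeholders):

```
# ZFS: scrub, then look for checksum errors in the status output
zpool scrub tank
zpool status -v tank

# btrfs: same idea; -B waits for the scrub to finish
btrfs scrub start -B /mnt
btrfs scrub status /mnt
```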
@ThomasKaiser - Heh... maybe apply thermal compound between the JMB585 and the bottom of the Pi. Boom, Pi's your heatsink!
@geerlingguy Axzez revealed on cnx-software.com in the meantime that heat is dissipated into the ground plane, and that a thermal pad or something similar could further carry heat away from the bottom side of the PCB. Which actually is really good news :)
@shanzez have you done some thermal measurements of the JMB585's surface when the controller is really busy for a long time? E.g. resilvering a RAIDz?
Thank you for the suggestion @ThomasKaiser. We have not done this. We have been monitoring IC temps for months and have performed some stress testing. If you have any recommended/documented processes for this, please feel free to share. I appreciate your feedback.
@shanzez I talked to the colleague having reported about JMB585 overheating issues a while ago. It occurred in a rather densely packed server with a 2280 JMB585 card in an M.2 slot without any air circulation around it. So four times the PCIe bandwidth (Gen3 x2 vs. Gen2 x1 here), a beefier CPU, and a RAIDz2 array that showed scrub errors every month until a heatsink was mounted on the controller and the internal fan's airflow redirected.
So most probably the single Gen2 lane already bottlenecks the JMB585 sufficiently, and the CPU implementation does as well [1].
But to test for corner cases it might help to create a btrfs raid0 array over 5 SSDs (mkfs.btrfs -d raid0 /dev/sd?), then run iozone or fio sequential traffic tests continuously for a day or two, while also filling the array with random data until only the storage benchmarks have enough room left to finish. And then run a btrfs scrub to check for corrupted data.
Given the array is mounted at /mnt, to create for example 500 files 1 GB in size filled with random data, this would be sufficient:

```
for i in {1..500} ; do dd if=/dev/urandom of=$(mktemp /mnt/junk.XXXXXX) count=100 bs=10M ; done
```

(/dev/(u)random on the Pi 4 is rather slow)
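Pulling those steps together, a minimal sketch of one test cycle (device names and fio parameters are assumptions to adapt, not a prescription):

```
# WARNING: destroys all data on the listed devices
mkfs.btrfs -f -d raid0 /dev/sd{a,b,c,d,e}
mkdir -p /mnt && mount /dev/sda /mnt

# sustained sequential read/write load for ~24 hours
fio --name=seqload --directory=/mnt --rw=readwrite --bs=1M --size=8G \
    --numjobs=4 --time_based --runtime=86400 --group_reporting

# fill the remaining space with random data as above, then check for corruption
btrfs scrub start -B /mnt
btrfs scrub status /mnt
```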
[1] This is a Pi 4 @ 1.8 GHz:
```
raid6: using algorithm neonx4 gen() 5340 MB/s
raid6: .... xor() 4005 MB/s, rmw enabled
raid6: using neon recovery algorithm
xor: measuring software checksum speed
   8regs      : 7588 MB/sec
   32regs     : 8734 MB/sec
   arm64_neon : 7372 MB/sec
xor: using function: 32regs (8734 MB/sec)
```
And this is the server above, able to make use of ultra-wide vector extensions, though we skip 'classic'/anachronistic (md)raid and also rely on the chipset's QuickAssist capabilities for accelerated ZFS/RAIDz functions:
```
raid6: using algorithm avx512x4 gen() 34962 MB/s
raid6: .... xor() 20023 MB/s, rmw enabled
raid6: using avx512x2 recovery algorithm
xor: automatically using best checksumming function avx
```
@shanzez out of curiosity: why did you choose the JMB585 and not an ASM1166 to get one more SATA port? Performance/limitations are the same but this does not matter with CM4 anyway due to the single Gen2 lane bottleneck.
@shanzez I talked to the colleague having reported about JMB585 overheating issues a while ago. It occurred in a rather densely packed server with a 2280 JMB585 card in an M.2 slot without any air circulation around it. So four times the PCIe bandwidth (Gen3 x2 vs. Gen2 x1 here), a beefier CPU, and a RAIDz2 array that showed scrub errors every month until a heatsink was mounted on the controller and the internal fan's airflow redirected.
Thank you very much @ThomasKaiser. I agree with your assumption that the single Gen2 lane does throttle the JMB585 enough. I've been stress testing two RAID setups (2 drives each) in parallel on one system and the temp isn't moving a bit. I'll see what I can do with your suggestions and my business partner who is more informed in this area than I am, and we'll report back.
As far as the JMB585, when we started this project we were in the middle of the chip shortage. Parts were scarce. We focused on parts we could find so we'd actually be able to produce the board and sell it. Of course the chip shortage still affects prices, but at least we can find everything. We also knew that the JMB585 already had success (we follow Jeff G. of course), so we knew we could get the parts and we could get it working with the CM4. I don't know if my business partner looked at the ASM1166 IC, but I'll ask. We were also satisfied that 5 SATA ports would be a good fit for our plans, which are more than just this board.
I agree with your assumption that the single Gen2 lane does throttle the JMB585 enough
But that's just an assumption, and once either the Radxa or Pine64 folks come up with an RK3568-based CM variant we're talking about doubled data throughput (PCIe Gen3 uses 8 GT/s vs. 5 GT/s with Gen2, but due to the much more efficient 128b/130b encoding with Gen3 the real bandwidth is almost twice as high).
Wrt the ASM1166 I just searched this repo and... surprisingly nothing. So no CM4 experiences yet, but "PCIe on ARM" is nothing new, and the only thing that might block this chip is missing AHCI/SATA kernel support. Jeff, those cards should be between 35-55 bucks on Ali or Amazon warehouse, so once your time and patreons permit... :)
more than just this board
Talking about future products, I would really love to see full power control by GPIO pins since this would allow for staggered spin-up of connected disks fully controlled by the CM. Right now five 3.5" disks ask for an oversized ATX PSU able to provide stable 12V at 100W since all platters spin up at the same time. Those ATX PSUs are pretty inefficient in low-load conditions, AKA 'everything that happens after the disks have spun up'.
With staggered spin-up and more sophisticated DC-DC circuitry it might be possible to power the whole setup with a 90W laptop power brick. But it's challenging, since if all drives are allowed to enter standby/sleep state and then wake up simultaneously, it's brownout time. OK, that's nothing for a 'consumer product' but more for an appliance with carefully chosen low-idle-consumption HDDs that are blocked from entering sleep/standby :)
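Rough numbers to illustrate, assuming ~28W per disk during spin-up and ~8W per already-spinning disk (typical 3.5" figures; exact values vary by model):

```
simultaneous spin-up: 5 × ~28 W             ≈ 140 W peak → oversized ATX PSU needed
staggered spin-up:    1 × ~28 W + 4 × ~8 W  ≈  60 W peak → a 90 W brick has headroom
```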
We focused on parts we could find so we'd actually be able to produce the board and sell it.
I understand and like this approach; at the same time it seems to contradict
We were also satisfied that 5 SATA ports would be a good fit for our plans, which are more than just this board.
The IO bandwidth of each Raspberry Pi SoC generation has multiplied by a factor of 5-10, therefore, as @ThomasKaiser pointed out, each new Compute Module will challenge that assumption. From my view the CM4 form factor established a kind of new "processor socket specification" for ARM-based computing. A modular IO board using established connection standards (i.e. PCIe) would be the answer we are looking for. The Seaberry Pi CM4 Carrier Board follows this approach but went a bit over the edge with the resulting price. Something in the 150-170 $/€ range would be perfect.
@shanzez could you please publish the physical board size? It's not mentioned in the product data sheet.
each new Compute Module will challenge that assumption
We'll see. The JMB585 and the ASM1166 are both Gen3 x2 devices, while the CM4 connector exposes just a single PCIe lane and existing designs only care about Gen2 speeds. Then there's link training as an essential part of the PCIe specs, so maybe any module wanting to drive the bus at 8 GT/s will simply end up falling back to Gen2 speed anyway (this will result in a boot delay, which will then be 'fixed' with a device-tree node limiting the speed to Gen2 in the first place, and maybe even the link width to x1 on an RK3568-based CM, to avoid probing delays).
Anyway: I would still suggest testing for data corruption under constant load since it's easy/quick to set up and only requires some time letting the device run unattended, to finally check with a scrub afterwards. This could also be done with a classical/anachronistic mdraid RAID5/6 (mdraid-1 won't do since its implementation is too dumb), since parity RAID also allows checking for data corruption (echo check > /sys/block/mdX/md/sync_action and then watch -n 600 cat /proc/mdstat). But I fear the BCM2711 is simply too weak to generate an appropriate bandwidth this way.
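Spelled out, a minimal md-based corruption check might look like this (md0 is a placeholder for the actual array):

```
# trigger a full verification pass on an existing RAID5/6 array
echo check > /sys/block/md0/md/sync_action

# check progress every 10 minutes
watch -n 600 cat /proc/mdstat

# after the pass completes: a non-zero count means inconsistent parity was found
cat /sys/block/md0/md/mismatch_cnt
```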
IO bandwidth of each SOC generation has multiplied by a factor of 5 - 10
You're talking about RPi 1/2/3 compared to RPi 4[00]? In general that's not necessarily true. Older ARM SoCs are missing from my list here, but ARM SoCs made for the NAS use case, e.g. from Marvell, had multiple GbE and SATA ports already a decade ago, and a multi-purpose ARM SoC designed in 2015 like the RK3399 has Gen2 x4 and is actually able to saturate all 4 lanes at the same time (we measured 1.4+ GB/sec with NVMe SSDs).
staggered spin-up of connected disks fully controlled by the CM
For non-RAID setups, or if you are not using all SATA ports in RAID, you should be able to use hdparm to set the drive to power up in standby. Then, when a drive is first accessed, it should spin up automatically. You might also be able to use hdparm and script the RAID start-up so that you could stagger the drives spinning up. Because we boot early from either USB or eMMC, we should be able to control the hard drives and bring them up however we want.
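A rough sketch of that idea, assuming five drives at /dev/sda through /dev/sde (purely illustrative — and note the caveat in the next reply about the kernel spinning everything up at boot anyway):

```
#!/bin/bash
# Staggered spin-up sketch: drives previously put into standby (hdparm -y),
# or configured for power-up-in-standby (hdparm -s1, use with care), are
# touched one at a time so only one platter stack spins up at once.
for disk in /dev/sd{a,b,c,d,e}; do
  dd if="$disk" of=/dev/null bs=512 count=1 2>/dev/null  # first read forces spin-up
  sleep 10                                               # let it reach full speed
done
# ...then assemble/start the RAID, e.g.: mdadm --assemble --scan
```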
Right now five 3.5" disks ask for an oversized ATX PSU able to provide stable 12V at 100W since all platters spin up at the same time.
We only support 20/24-pin ATX PSUs.
@shanzez could you please publish the physical board size? It's not mentioned in the product data sheet.
The dimensions for the Interceptor Carrier Board are 100mm L x 110mm W. FYI, we have a Mini ITX board adapter on the way. It is in the mail to us now and could arrive any time. I have already seen the pictures of it. This should allow our board to be installed in any case that takes a Mini ITX motherboard. Once I get it and test it out, it'll go up on our store.
you should be able to use hdparm to set the drive to power up in standby
Nope, since the kernel will spin all disks up anyway; without driver patching, the only way to get staggered spin-up in such a setup is power control or cutting/connecting the SATA data lines (then features like PUIS – power-up in standby – might work).
you should be able to use hdparm to set the drive to power up in standby
Nope, since the kernel will spin all disks up anyway; without driver patching, the only way to get staggered spin-up in such a setup is power control or cutting/connecting the SATA data lines (then features like PUIS – power-up in standby – might work).
Yep, you are correct. We just tested this. We've added the driver patch to our to-do list.
@shanzez - Regarding
We only support 20/24-pin ATX PSUs.
Note that many people, I think, would probably opt for a PicoPSU (here's the one I'm using) and couple it with a 12V barrel-plug power supply. I've been using an 8A (96W) PSU with mine on the Taco, and that was enough to spin up four 3.5" Seagate HDDs, but just barely.
You can put through something like 10 or 12A in most PicoPSUs, though once you start reaching those levels, it might be worth considering a full-on SFF PSU (which would also provide plenty of juice for other peripherals, potentially).
We've added the driver patch to our to-do list.
Will still only work with some HDDs :)
might be worth considering a full-on SFF PSU
And then energy (in)efficiency will be on par with a standard PC that also uses an oversized PSU. Avoiding a PSU capable of providing far more juice than needed is one of the most important things when lowering consumption, and I personally think there's not much reason to use something as limited as an RPi or CM4 if overall consumption doesn't differ significantly from more powerful platforms.
And that's my whole point with staggered spin-up: without this feature, a bunch of spinning rust will always force the PSU to be overkill (and as such inefficient). Even 'nasty' SATA disks like older Seagate Barracudas will usually not exceed 10W in worst-case scenarios like resilvering an older RAIDz, but when they all spin up at the same time you need 25-30W per disk (20-25W on the 12V rail alone).
But maybe it's just me (being sort of an energy-efficiency fetishist – at least this desire to drive a bunch of spinning rust more energy-efficiently is what started my ARM/SBC journey almost a decade ago).
@ThomasKaiser - It really depends on the PSU you're using. Measuring the (admittedly expensive) Corsair SF600 vs the Pico PSU and 8A adapter I was using, the Corsair actually used less power at idle (10-20W) and during spinup (50-90W) when testing with 4 HDDs and a CM4. (It's also nice that it doesn't spin the fan on the PSU if not needed—they make some good little units!).
For many PC PSUs, though, especially when you're grabbing used ones or from a bargain bin, the efficiency is off-the-charts bad. But that's not too different from cheap junk laptop bricks ;)
@geerlingguy may I suggest a reality check to bring us back on the same page? A Corsair SF600 is available as 80+ Platinum for a minimum of ~100€, or as 80+ Gold for ~112€. How many people would be willing to buy this PSU for a 100€ board? I'd guess a maximum of 3%, likely less than 1%.
Energy efficiency is the main reason for going the ARM route for most buyers (at least in Europe).
@mi-hol - My point isn't so much that an expensive power supply can be more efficient, but only that efficiency isn't directly correlated to how much wattage a power supply can supply, or the type of power supply in general.
Yes, generally speaking, the more electronics, the more consumption—however there are many PSUs in the 20-100W range that are either horribly inefficient or extremely efficient (and same can be said for all ranges), to the point where I've seen little wall warts that burn up half their energy use as heat, and other large power supplies that waste less power than a little wall wart uses in idle draw...
Just mentioning that the rule of thumb that you can't use beefier supplies if you want energy efficiency (and sometimes price efficiency... you use what you have sometimes) is not true in many cases.
Measuring the (admittedly expensive) Corsair SF600 vs the Pico PSU and 8A adapter I was using, the Corsair actually used less power at idle (10-20W) and during spinup (50-90W)
Simple conclusion: your Pico PSU is junk :)
The SF600 (like all those PSUs rated for +300W out there) is only efficient with mid-range and high loads but not with low loads:
At 40W-50W the SF600, even if Platinum-rated, has only ~80% efficiency, so you need to feed the PSU 48W-60W to get 40W-50W out for the CM4, controllers and spinning rust. That's 8W-10W wasted 'by principle': it's inevitable physics, or simply the result of using an oversized PSU that is inefficient in low-load conditions by design.
With staggered spin-up, five HDDs could be fed by an 80W PSU, so with some safety headroom and choosing e.g. a Meanwell LRS-100-12 (12V/8.5A) with an efficiency close to 90% at mid-range loads, we're at half the watts wasted in the same 40W-50W operating range. At a fraction of the cost of an oversized ATX PSU.
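The arithmetic behind "half the watts wasted", taking 45W as the midpoint of that range:

```
SF600 @ ~80% efficiency:      45 W out → 45 / 0.80 ≈ 56 W in → ~11 W wasted
LRS-100-12 @ ~90% efficiency: 45 W out → 45 / 0.90 = 50 W in →  ~5 W wasted
```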
Of course all of this doesn't matter once junk PSUs are in use, and as such I'll stay silent on this topic from now on :)
Simple conclusion: your Pico PSU is junk :)
I won't contest that; I'm just trying to make the point that principles and common sense should take a back seat to real-world test data when it comes to efficiency.
There are also plenty of PSU manufacturers who fudge numbers very strongly and shouldn't even get a Bronze, or Gold, or Titanium, or whatever color rating they actually slap on their boxes :(
Gotta get it out of RSJ's hands before he causes a problem...
Kernel patch is available on our site (Downloads page). New OS download with ZFS support is also available on our site. We are now Debian based instead of Raspbian.
I'd so hope for a 4x SATA, 2x NVMe, 2x SFP+ (10GbE) ARM board.
@trickkiste enjoy a MacchiatoBin Single Shot in the meantime, though you'll be missing one SATA port and need a PCIe card with an integrated PCIe switch for your two NVMe SSDs. That board is at least somewhat affordable if you can really make use of the SFP+ ports.
Other than that, you can have this with ARM server boards and a SATA controller in a PCIe slot, but you will like neither the costs nor the consumption of such a beast :)
Now hoping for Jeff hiding our both comments or at least mine as off-topic :)
@ThomasKaiser - In one sense the comment is on-topic—I think some people see the hardware on these boards and think that they can fully utilize it, but as continuous benchmarking shows, that's just not possible. There are good use cases, for sure, but thinking you could replace a low-end x86 PC serving storage over a 2.5/5/10G network link and compete at all in terms of how much data you can transfer... with current-generation ARM chips that's not possible for under $200+.
Hi @geerlingguy,
I am actually not hoping to run a full-scale storage app (Ceph) on a Raspberry Pi CM4, and in that regard I am very grateful for the various benchmarks you ran on the various boards.
However, besides Raspberry Pi-type "maker" hardware, I am constantly on the lookout for more potent ARM-based HW suitable for storage solutions (primarily Ceph). The architecture per se (ARM) should not constitute a bottleneck. As we can see, Apple is very successfully making use of it in high-performance applications, and the various NAS vendors are making use of it as well.
Since playing around with such systems is your passion, I thought I'd ask if you know of anything suitable, 10GbE or at least 2.5GbE being a necessity. I am always a little disappointed that some vendors come up with halfway decent boards but then lack the "professional" side of things that would let you really make use of the HW in storage clusters.
My dream would be a low-cost system which drives just one disk (NVMe or SATA) and provides a high-speed 10GbE network connection. Minimal HW (no HDMI, GPIOs, ... needed), just enough to attach the disk to the network at reasonable speeds.
Or alternatively, a slightly more expensive system with more NVMe or SATA IO and at least two 10GbE ports. IPMI availability would be the cherry on top.
We store a lot of video data which just sits there most of the time (cold storage). RAIDs have become very inconvenient: the sets fill up quickly, cannot be expanded (or only at great risk), and if you switch out the disks and therefore have a lot of disk sets in storage, you never know if they will ever spin up again. With an ever-growing Ceph cluster you can run scrubbing jobs which verify data integrity and identify faulty disks, and you can switch faulty ones out on the fly, thereby maintaining cluster health. Just to give you an idea of the use case.
I am surprised to this day that nobody has yet jumped on that train. Affordable commodity cold-storage systems seem to be non-existent, and the few there are often aren't available in the EU.
Cheers, Markus
with current-generation ARM chips that's not possible for under $200+
Well, it could be possible with those Marvell Armada 7K/8K SoCs as on the MacchiatoBin, since while "quad-core A72" sounds similar to the RPi 4, those things are something entirely different: the ARM cores inside the SoC form just the 'application processor' (AP), while the SoCs also contain one or two 'communication processors' (CP) and various accelerators.
And even the AP inside these network/NAS SoCs performs better than the BCM2711 in the RPi 4 / CM4, since internal caches and memory access are much faster.
But aside from the MacchiatoBin I don't know of any affordable hardware. The SoCs are used in various NAS like the QNAP TS-1635AX, but I've no idea whether QNAP's 'firmware' really makes use of the accelerators and what real-world performance this thing could achieve as a 'cold storage box'. Being a ZFS fanboy, for such use cases I choose Intel gear with as many SATA ports as possible and QAT (Intel QuickAssist Technology) support, e.g. the X11DPi-N.
I am surprised to this day that nobody has yet jumped on that train.
The reason is rather simple: the majority of ARM SoCs are made for different use cases and as such lack proper I/O support. Marvell addressed this market in the past but stopped for whatever reason. Rockchip might fill the gap.
The already existing RK3568 could be an interesting candidate for such things since it has a nice set of I/O interfaces (2 PCIe Gen3 lanes, plus PCIe Gen2 lanes, QSGMII, SATA and USB3, though all of the latter are multiplexed).
But as with everything ARM, the software support situation is the culprit. New ARM SoCs take time for software to mature, so I wouldn't expect anything to be usable prior to 2023 (and of course you would need a hardware vendor designing/producing such Ceph thingies with QSGMII for 5 Gbit/sec Ethernet and SATA or PCIe Gen3 for storage).
We are now Debian based instead of Raspbian.
The standard Raspberry Pi OS can still be used if I don't need the managed switch capabilities though, right?
The standard Raspberry Pi OS can still be used if I don't need the managed switch capabilities though, right?
Correct, if you don't use any driver for the switch, it will just act as a normal dumb switch. Standard Raspbian OS will be fine, but you will still need a driver for the JMB585 PCIe-to-SATA IC. @geerlingguy has a guide on setting that up on Raspbian, I think.
@shanzez / @SuNNjek - FYI, the JMB585 is supported out of the box in Raspberry Pi OS now—though you might need to do a sudo apt upgrade before the driver is there.
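To verify the controller is picked up after upgrading, something like this should do (output details will vary by kernel version):

```
sudo apt update && sudo apt upgrade   # pull in a current kernel with the driver
lspci -nn | grep -i jmicron           # the JMB585 should enumerate on the PCIe bus
dmesg | grep -iE 'ahci|sata'          # AHCI binding and SATA link-up messages
lsblk                                 # attached drives show up as /dev/sd*
```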
@geerlingguy oh that's right. Didn't you mention that in a video a while back? Or perhaps it was a comment around here.
@geerlingguy, @ThomasKaiser, Et al. Our latest OS also includes the previously mentioned kernel patch for "potential" HDD staggered spin-up. As mentioned before, we've also added ZFS support. OS and patch are available here: https://www.axzez.com/software-downloads If you have any other suggestions do please let us know. Our hope is to provide the fully functional OS/patch at the time the boards are mailed out. Not everyone needs or wants this (looking at you two), but we do think it expands the range of people that could use the board if the OS and patch are provided with all drivers. Side note: in the final prototype for the board we shifted the JMB585 down and to the left (no longer under the CM4) and increased the size of the pad on the bottom of the board. We still haven't seen issues with the IC overheating, but thought the move was for the best.
Our latest OS also includes the previously mentioned kernel patch
Interesting. How is the kernel supposed to be updated then? You were talking about being 'Debian based' now. What does this mean exactly?
Background: for some time I maintained OS images with OpenMediaVault for SBCs, which were created automagically by utilizing a hook in Armbian's build system. For the RPi I had to change that a bit since Armbian didn't support the platform back then. I then combined an Armbian (Debian armhf) userland with kernel/firmware packages from the RPi folks to ensure security fixes were available ASAP. That required some amount of apt pinning, but I don't remember many details. Anyway, IMO it's important to pull in kernel/firmware security updates from upstream in a timely manner.
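For illustration, such pinning could look roughly like this (package names and origin reflect Raspberry Pi OS at the time and are assumptions, not a tested config):

```
# /etc/apt/preferences.d/99-rpi-kernel
Package: raspberrypi-kernel raspberrypi-bootloader
Pin: origin archive.raspberrypi.org
Pin-Priority: 1001
```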
kernel patch for "potential" HDD staggered spin-up.
that's great news because it should make it possible to use PSUs with a max of 150-200W, right? The list of tested PSUs in Axzez's FAQ seems to miss models in SFX/TFX and Nano/PicoPSU form factors yet. Could that be added?
with current-generation ARM chips that's not possible for under $200+
@geerlingguy – I totally forgot about the MochaBin-5G that might fit into the budget. But after the EspressoBin hassles no more Globalscale products for me :)
Interesting. How is the kernel supposed to be updated then? You were talking about being 'Debian based' now. What does this mean exactly?
The RPi and wireless firmware packages are provided via Debian packages as usual. We will continue to provide patches against the latest Raspberry Pi Kernel until support for the RTL8367RB is mainlined, and then until support is no longer needed.
Our provided OS is now based on the latest compatible kernel (for us that is currently 5.10.63) plus the Debian OS.
that's great news because it should make it possible to use PSUs with a max of 150-200W, right? The list of tested PSUs in Axzez's FAQ seems to miss models in SFX/TFX and Nano/PicoPSU form factors yet. Could that be added?
Yes, but actually it should enable PSUs with an even lower max.
I spoke with my co-founder and we will order a PicoPSU now and test that. I'll look into the others. I will update here and the FAQ when I have results.
@geerlingguy do you happen to have any SFX/TFX PSUs on hand there? If not, I'll pick up this model (unless you want to suggest another, @mi-hol): https://amzn.to/3rD86l0
Also, @mi-hol, I would bet that anything Seasonic would be just fine. I've not run into a Seasonic with an unreliable +3.3V rail yet. I see they have SFX and TFX models.
@shanzez your selection for SFX PSU looks good to me.
Finding a PicoPSU that is available in all parts of the world might be a bit difficult. In Europe the brand Inter-Tech seems widely available. @ThomasKaiser maybe you'd have a hint?
Another small PSU that looks good from its data sheet is https://www.silverstonetek.com/product.php?pid=969&tno=8&tb=116&area=en
maybe you'd have a hint?
Nope. Doesn't PicoPSU + ATX always result in unnecessary but wasteful DC-DC circuitry?
Finding a PicoPSU that is available in all parts of the world might be a bit difficult. In Europe the brand Inter-Tech seems widely available.
For PicoPSU I ordered this one: https://www.amazon.com/dp/B005TWE6B8?psc=1&smid=AVWGG3PF13OL8&ref_=chk_typ_imgToDp
Is that good enough?
Doesn't PicoPSU + ATX always result in unnecessary but wasteful DC-DC circuitry?
That could be. I'm here to explore ideas, and I don't mind dipping a bit into our funds for a couple of PSUs that customers may be curious about. If there is a more optimized PSU you think is worth testing, please let me know.
A PicoPSU is useful in cases where you want the PSU to be basically external (the 12V brick), and with those bricks you can get really good or really bad ones... some aren't bad at all, but you basically have to test each one since quality control is all over the board.
I will say I've tested the Turing Pi 2 board with a Pico PSU (this one from RGEEK, with this COOLM 12V adapter), with a Corsair SF600, and with a Redragon 700W PSU, and it worked great with all three.
I wish Amazon stocked actual name brands and not [take letters in alphabet + randomize them] brands... the tough thing is many other sites have very slow shipping times whereas I've been able to get 1-2 day shipping on Amazon for most of the year.
Axzez just announced their new Interceptor CM4 Carrier Board, which has: