geerlingguy opened 3 months ago
Any news ?
Nope, still no notification of shipment.
[Edit: Just got shipment notification on May 2]
Received it today, popped it on my Raspberry Pi Compute Module 4 IO Board, and powered it up.
There's a blue LED on the CM5 right next to the board label (CM5 V2.21), and the Ethernet LEDs are lit, with the activity LED blinking, so it's definitely doing something. No HDMI output at least on HDMI0.
The product page says "Compatible with multiple IO boards" and lists the Pi CM4 IO Board, but maybe there are some settings I have to change.
I'm downloading radxa-cm5-io_debian_bullseye-test_kde_b1.img.xz from the docs / download page, and flashing it to a 32GB microSD card with Etcher.
Is there any jumper on the IO board that could cause a boot issue? Perhaps it's like the Milk-V Mars CM, which requires a serial connection.
After flashing the microSD card, it is booting. Still no HDMI out of HDMI0 or HDMI1, but I can log in over SSH with `rock`/`rock`, and will begin some testing.
Geekbench 6 power usage graph:
It's interesting; after a few minutes running full blast, the SoC seems to settle in at 8.4 W when running `stress-ng` on all cores:
This is running with a large 120mm fan blasting across the board to keep the chip from hitting throttle limits. No heatsink for this test though:
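For anyone who wants to reproduce that all-core load, here's a minimal sketch (assuming `stress-ng` is installed, e.g. `sudo apt install stress-ng`; use a much longer timeout if you want the SoC to thermally settle as described above):

```shell
# Spin up one matrix-math worker per CPU core ('--matrix 0' autodetects
# the core count), run briefly, and print a summary. Guarded so the
# script degrades gracefully if stress-ng is missing.
if command -v stress-ng >/dev/null 2>&1; then
  stress-ng --matrix 0 --timeout 10s --metrics-brief
else
  echo "stress-ng not installed; skipping"
fi
```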
Regarding HDMI output, I've opened CM5 - HDMI output on Pi CM4 IO Board on the Radxa community forum.
Some of the testing I'm doing seems to fluctuate quite a bit without the `performance` governor. The default is `ondemand`, and it works okay, but especially for bursty tests, performance is a little wonky.

(Note: I like to do benchmarks with the default settings, because that's more reflective of the experience end users will get, but for some I'm willing to set `performance` just to get an idea of what the board can do, flat out.)
Edit: Even with `performance`, it seems like at least Phoronix tests are wavering outside the general 1-3% performance range, causing it to re-test a bunch until it has a happy average. SoC temp is in the 40-45°C range, so throttling isn't in play.
The Radxa CM5 gets a quick mention in today's video on the LattePanda Mu.
Have you tested the CM5 to see if it works in the Super6c? I'd be willing to upgrade from the Pi CM4 modules, but it seems iffy that they're a clean drop-in replacement.
Basic information
Linux/system information
Benchmark results
CPU
Power

- `stress-ng --matrix 0`: 8.4 W
- top500 HPL benchmark: 10 W

Disk

- Kioxia Exceria 32GB microSD card
- Built-in 64GB eMMC
Also consider running the PiBenchmarks.com script.
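A quick sequential-read sanity check can be done with `hdparm` before the full PiBenchmarks run (a rough sketch, not a substitute; `/dev/mmcblk0` is an assumption for the eMMC device node — adjust for your system):

```shell
# Rough sequential-read check on the eMMC. Needs root and hdparm
# installed; guarded so it skips cleanly when either is unavailable.
if command -v hdparm >/dev/null 2>&1 && [ -b /dev/mmcblk0 ]; then
  sudo hdparm -t /dev/mmcblk0
else
  echo "skipping disk check: hdparm or /dev/mmcblk0 not available"
fi
```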
Network
`iperf3` results:

- `iperf3 -c $SERVER_IP`: 940 Mbps
- `iperf3 --reverse -c $SERVER_IP`: 842 Mbps
- `iperf3 --bidir -c $SERVER_IP`: 935 Mbps up, 305 Mbps down

(Be sure to test all interfaces, noting any that are non-functional.)
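The three directions can be run in one pass with a small loop (a sketch: `SERVER_IP` is a placeholder for a LAN machine running `iperf3 -s`, and the loop is guarded so it skips when iperf3 or the server address is unavailable):

```shell
# Run forward, reverse, and bidirectional iperf3 tests against a server.
SERVER_IP="${SERVER_IP:-}"
if command -v iperf3 >/dev/null 2>&1 && [ -n "$SERVER_IP" ]; then
  for flags in "" "--reverse" "--bidir"; do
    iperf3 $flags -c "$SERVER_IP" -t 10
  done
else
  echo "skipping iperf3: binary or SERVER_IP not set"
fi
```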
GPU
`glmark2-es2` results:

Note: This benchmark requires an active display on the device. Not all devices may be able to run `glmark2-es2`, so in that case, make a note and move on!

TODO: See this issue for discussion about a full suite of standardized GPU benchmarks.
Memory
`tinymembench` results:

Click to expand memory benchmark result
```
tinymembench v0.4.10 (simple benchmark for memory throughput and latency)

==========================================================================
== Memory bandwidth tests                                               ==
==                                                                      ==
== Note 1: 1MB = 1000000 bytes                                          ==
== Note 2: Results for 'copy' tests show how many bytes can be          ==
==         copied per second (adding together read and written          ==
==         bytes would have provided twice higher numbers)              ==
== Note 3: 2-pass copy means that we are using a small temporary buffer ==
==         to first fetch data into it, and only then write it to the   ==
==         destination (source -> L1 cache, L1 cache -> destination)    ==
== Note 4: If sample standard deviation exceeds 0.1%, it is shown in    ==
==         brackets                                                     ==
==========================================================================

 C copy backwards                             :  11994.7 MB/s (1.1%)
 C copy backwards (32 byte blocks)            :  11957.6 MB/s
 C copy backwards (64 byte blocks)            :  11956.1 MB/s
 C copy                                       :  12284.8 MB/s
 C copy prefetched (32 bytes step)            :  12394.5 MB/s (0.1%)
 C copy prefetched (64 bytes step)            :  12484.7 MB/s (0.2%)
 C 2-pass copy                                :   5405.9 MB/s (0.2%)
 C 2-pass copy prefetched (32 bytes step)     :   7835.1 MB/s
 C 2-pass copy prefetched (64 bytes step)     :   8288.2 MB/s
 C fill                                       :  29949.5 MB/s (0.2%)
 C fill (shuffle within 16 byte blocks)       :  29945.1 MB/s
 C fill (shuffle within 32 byte blocks)       :  29920.9 MB/s
 C fill (shuffle within 64 byte blocks)       :  29938.2 MB/s
 NEON 64x2 COPY                               :  12649.6 MB/s
 NEON 64x2x4 COPY                             :  12606.4 MB/s
 NEON 64x1x4_x2 COPY                          :   5260.5 MB/s (0.5%)
 NEON 64x2 COPY prefetch x2                   :  11628.3 MB/s
 NEON 64x2x4 COPY prefetch x1                 :  12005.2 MB/s
 NEON 64x2 COPY prefetch x1                   :  11700.2 MB/s
 NEON 64x2x4 COPY prefetch x1                 :  12005.8 MB/s
 ---
 standard memcpy                              :  12605.4 MB/s
 standard memset                              :  29860.6 MB/s (0.2%)
 ---
 NEON LDP/STP copy                            :  12631.2 MB/s
 NEON LDP/STP copy pldl2strm (32 bytes step)  :  12495.8 MB/s
 NEON LDP/STP copy pldl2strm (64 bytes step)  :  12529.5 MB/s
 NEON LDP/STP copy pldl1keep (32 bytes step)  :  12671.0 MB/s
 NEON LDP/STP copy pldl1keep (64 bytes step)  :  12666.6 MB/s
 NEON LD1/ST1 copy                            :  12590.8 MB/s
 NEON STP fill                                :  29816.8 MB/s (0.2%)
 NEON STNP fill                               :  29845.4 MB/s (0.2%)
 ARM LDP/STP copy                             :  12601.7 MB/s
 ARM STP fill                                 :  29792.4 MB/s (0.2%)
 ARM STNP fill                                :  29791.0 MB/s (0.1%)

==========================================================================
== Memory latency test                                                  ==
==                                                                      ==
== Average time is measured for random memory accesses in the buffers   ==
== of different sizes. The larger is the buffer, the more significant   ==
== are relative contributions of TLB, L1/L2 cache misses and SDRAM      ==
== accesses. For extremely large buffer sizes we are expecting to see   ==
== page table walk with several requests to SDRAM for almost every      ==
== memory access (though 64MiB is not nearly large enough to experience ==
== this effect to its fullest).                                         ==
==                                                                      ==
== Note 1: All the numbers are representing extra time, which needs to  ==
==         be added to L1 cache latency. The cycle timings for L1 cache ==
==         latency can be usually found in the processor documentation. ==
== Note 2: Dual random read means that we are simultaneously performing ==
==         two independent memory accesses at a time. In the case if    ==
==         the memory subsystem can't handle multiple outstanding       ==
==         requests, dual random read has the same timings as two       ==
==         single reads performed one after another.                    ==
==========================================================================

block size : single random read / dual random read
      1024 :    0.0 ns          /     0.0 ns
      2048 :    0.0 ns          /     0.0 ns
      4096 :    0.0 ns          /     0.0 ns
      8192 :    0.0 ns          /     0.0 ns
     16384 :    0.0 ns          /     0.0 ns
     32768 :    0.0 ns          /     0.0 ns
     65536 :    0.0 ns          /     0.0 ns
    131072 :    1.1 ns          /     1.5 ns
    262144 :    1.9 ns          /     2.8 ns
    524288 :    3.5 ns          /     4.0 ns
   1048576 :    9.5 ns          /    12.4 ns
   2097152 :   13.4 ns          /    15.3 ns
   4194304 :   60.2 ns          /    94.8 ns
   8388608 :  143.5 ns          /   199.0 ns
  16777216 :  190.0 ns          /   145.9 ns
  33554432 :  215.0 ns          /   249.5 ns
  67108864 :  228.2 ns          /   255.4 ns
```

`sbc-bench` results

Run sbc-bench and paste a link to the results here:
https://sprunge.us/2eal0B
Phoronix Test Suite
Results from pi-general-benchmark.sh: