geerlingguy opened 3 months ago
Any news ?
Nope, still no notification of shipment.
[Edit: Just got shipment notification on May 2]
Received it today, popped it on my Raspberry Pi Compute Module 4 IO Board, and powered it up.
There's a blue LED on the CM5 right next to the board label (CM5 V2.21), and the Ethernet LEDs are lit, with the activity LED blinking, so it's definitely doing something. No HDMI output at least on HDMI0.
The product page says "Compatible with multiple IO boards" and lists the Pi CM4 IO Board, but maybe there are some settings I have to change.
I'm downloading radxa-cm5-io_debian_bullseye-test_kde_b1.img.xz from the docs / download page, and flashing it to a 32GB microSD card with Etcher.
Is there any jumper on the IO board that could cause a boot issue? Perhaps it's like the Milk-V Mars CM, which requires a serial connection.
After flashing the microSD card, it is booting. Still no HDMI out of HDMI0 or HDMI1, but I can log in over SSH with `rock`/`rock`, and will begin some testing.
Geekbench 6 power usage graph:
It's interesting; after a few minutes running full blast, the SoC seems to settle in at 8.4 W when running `stress-ng` on all cores:
This is running with a large 120mm fan blasting across the board to keep the chip from hitting throttle limits. No heatsink for this test though:
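For anyone who wants to reproduce that all-core load, here's a minimal sketch (assuming `stress-ng` is installed, e.g. `sudo apt install stress-ng`; use a much longer timeout if you want the SoC to thermally settle as described above):

```shell
# Spin up one matrix-math worker per CPU core ('--matrix 0' autodetects
# the core count), run briefly, and print a summary. Guarded so the
# script degrades gracefully if stress-ng is missing.
if command -v stress-ng >/dev/null 2>&1; then
  stress-ng --matrix 0 --timeout 10s --metrics-brief
else
  echo "stress-ng not installed; skipping"
fi
```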
Regarding HDMI output, I've opened CM5 - HDMI output on Pi CM4 IO Board on the Radxa community forum.
Some of the testing I'm doing seems to fluctuate quite a bit without the `performance` governor. The default is `ondemand`, and it works okay, but especially for bursty tests, performance is a little wonky.

(Note: I like to do benchmarks with the default settings, because that's more reflective of the experience end users will get, but for some I'm willing to set `performance` just to get an idea of what the board can do, flat out.)
Edit: Even with `performance`, it seems like at least Phoronix tests are wavering outside the general 1-3% performance range, causing it to re-test a bunch until it has a happy average. SoC temp is in the 40-45°C range, so throttling isn't in play.
The Radxa CM5 gets a quick mention in today's video on the LattePanda Mu.
Have you tested the CM5 to see if it works in the Super6c? I'd be willing to upgrade from the Pi CM4 modules, but it seems iffy that they're a clean drop-in replacement.
Basic information
Linux/system information
Benchmark results
CPU
Power

- `stress-ng --matrix 0`: 8.4 W
- top500 HPL benchmark: 10 W

Disk

- Kioxia Exceria 32GB microSD card
- Built-in 64GB eMMC
Also consider running the PiBenchmarks.com script.
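A quick sequential-read sanity check can be done with `hdparm` before the full PiBenchmarks run (a rough sketch, not a substitute; `/dev/mmcblk0` is an assumption for the eMMC device node — adjust for your system):

```shell
# Rough sequential-read check on the eMMC. Needs root and hdparm
# installed; guarded so it skips cleanly when either is unavailable.
if command -v hdparm >/dev/null 2>&1 && [ -b /dev/mmcblk0 ]; then
  sudo hdparm -t /dev/mmcblk0
else
  echo "skipping disk check: hdparm or /dev/mmcblk0 not available"
fi
```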
Network
`iperf3` results:

- `iperf3 -c $SERVER_IP`: 940 Mbps
- `iperf3 --reverse -c $SERVER_IP`: 842 Mbps
- `iperf3 --bidir -c $SERVER_IP`: 935 Mbps up, 305 Mbps down

(Be sure to test all interfaces, noting any that are non-functional.)
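The three directions can be run in one pass with a small loop (a sketch: `SERVER_IP` is a placeholder for a LAN machine running `iperf3 -s`, and the loop is guarded so it skips when iperf3 or the server address is unavailable):

```shell
# Run forward, reverse, and bidirectional iperf3 tests against a server.
SERVER_IP="${SERVER_IP:-}"
if command -v iperf3 >/dev/null 2>&1 && [ -n "$SERVER_IP" ]; then
  for flags in "" "--reverse" "--bidir"; do
    iperf3 $flags -c "$SERVER_IP" -t 10
  done
else
  echo "skipping iperf3: binary or SERVER_IP not set"
fi
```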
GPU
`glmark2-es2` results:

Note: This benchmark requires an active display on the device. Not all devices may be able to run `glmark2-es2`, so in that case, make a note and move on!

TODO: See this issue for discussion about a full suite of standardized GPU benchmarks.
Memory
`tinymembench` results:

Click to expand memory benchmark result
```
tinymembench v0.4.10 (simple benchmark for memory throughput and latency)

==========================================================================
== Memory bandwidth tests                                               ==
==                                                                      ==
== Note 1: 1MB = 1000000 bytes                                          ==
== Note 2: Results for 'copy' tests show how many bytes can be          ==
==         copied per second (adding together read and written          ==
==         bytes would have provided twice higher numbers)              ==
== Note 3: 2-pass copy means that we are using a small temporary buffer ==
==         to first fetch data into it, and only then write it to the   ==
==         destination (source -> L1 cache, L1 cache -> destination)    ==
== Note 4: If sample standard deviation exceeds 0.1%, it is shown in    ==
==         brackets                                                     ==
==========================================================================

 C copy backwards                             :  11994.7 MB/s (1.1%)
 C copy backwards (32 byte blocks)            :  11957.6 MB/s
 C copy backwards (64 byte blocks)            :  11956.1 MB/s
 C copy                                       :  12284.8 MB/s
 C copy prefetched (32 bytes step)            :  12394.5 MB/s (0.1%)
 C copy prefetched (64 bytes step)            :  12484.7 MB/s (0.2%)
 C 2-pass copy                                :   5405.9 MB/s (0.2%)
 C 2-pass copy prefetched (32 bytes step)     :   7835.1 MB/s
 C 2-pass copy prefetched (64 bytes step)     :   8288.2 MB/s
 C fill                                       :  29949.5 MB/s (0.2%)
 C fill (shuffle within 16 byte blocks)       :  29945.1 MB/s
 C fill (shuffle within 32 byte blocks)       :  29920.9 MB/s
 C fill (shuffle within 64 byte blocks)       :  29938.2 MB/s
 NEON 64x2 COPY                               :  12649.6 MB/s
 NEON 64x2x4 COPY                             :  12606.4 MB/s
 NEON 64x1x4_x2 COPY                          :   5260.5 MB/s (0.5%)
 NEON 64x2 COPY prefetch x2                   :  11628.3 MB/s
 NEON 64x2x4 COPY prefetch x1                 :  12005.2 MB/s
 NEON 64x2 COPY prefetch x1                   :  11700.2 MB/s
 NEON 64x2x4 COPY prefetch x1                 :  12005.8 MB/s
 ---
 standard memcpy                              :  12605.4 MB/s
 standard memset                              :  29860.6 MB/s (0.2%)
 ---
 NEON LDP/STP copy                            :  12631.2 MB/s
 NEON LDP/STP copy pldl2strm (32 bytes step)  :  12495.8 MB/s
 NEON LDP/STP copy pldl2strm (64 bytes step)  :  12529.5 MB/s
 NEON LDP/STP copy pldl1keep (32 bytes step)  :  12671.0 MB/s
 NEON LDP/STP copy pldl1keep (64 bytes step)  :  12666.6 MB/s
 NEON LD1/ST1 copy                            :  12590.8 MB/s
 NEON STP fill                                :  29816.8 MB/s (0.2%)
 NEON STNP fill                               :  29845.4 MB/s (0.2%)
 ARM LDP/STP copy                             :  12601.7 MB/s
 ARM STP fill                                 :  29792.4 MB/s (0.2%)
 ARM STNP fill                                :  29791.0 MB/s (0.1%)

==========================================================================
== Memory latency test                                                  ==
==                                                                      ==
== Average time is measured for random memory accesses in the buffers   ==
== of different sizes. The larger is the buffer, the more significant   ==
== are relative contributions of TLB, L1/L2 cache misses and SDRAM      ==
== accesses. For extremely large buffer sizes we are expecting to see   ==
== page table walk with several requests to SDRAM for almost every      ==
== memory access (though 64MiB is not nearly large enough to experience ==
== this effect to its fullest).                                         ==
==                                                                      ==
== Note 1: All the numbers are representing extra time, which needs to  ==
==         be added to L1 cache latency. The cycle timings for L1 cache ==
==         latency can be usually found in the processor documentation. ==
== Note 2: Dual random read means that we are simultaneously performing ==
==         two independent memory accesses at a time. In the case if    ==
==         the memory subsystem can't handle multiple outstanding       ==
==         requests, dual random read has the same timings as two       ==
==         single reads performed one after another.                    ==
==========================================================================

block size : single random read / dual random read
      1024 :    0.0 ns          /     0.0 ns
      2048 :    0.0 ns          /     0.0 ns
      4096 :    0.0 ns          /     0.0 ns
      8192 :    0.0 ns          /     0.0 ns
     16384 :    0.0 ns          /     0.0 ns
     32768 :    0.0 ns          /     0.0 ns
     65536 :    0.0 ns          /     0.0 ns
    131072 :    1.1 ns          /     1.5 ns
    262144 :    1.9 ns          /     2.8 ns
    524288 :    3.5 ns          /     4.0 ns
   1048576 :    9.5 ns          /    12.4 ns
   2097152 :   13.4 ns          /    15.3 ns
   4194304 :   60.2 ns          /    94.8 ns
   8388608 :  143.5 ns          /   199.0 ns
  16777216 :  190.0 ns          /   145.9 ns
  33554432 :  215.0 ns          /   249.5 ns
  67108864 :  228.2 ns          /   255.4 ns
```

`sbc-bench` results

Run sbc-bench and paste a link to the results here:
https://sprunge.us/2eal0B
Phoronix Test Suite
Results from pi-general-benchmark.sh: