geerlingguy / sbc-reviews

Jeff Geerling's SBC review data - Raspberry Pi, Radxa, Orange Pi, etc.
MIT License

Pine64 SOQuartz #7

Open geerlingguy opened 1 year ago

geerlingguy commented 1 year ago

[Image: Pine64 SOQuartz, focus-stacked]

Basic information

Linux/system information

# output of `neofetch`
       _,met$$$$$gg.          root@DietPi 
    ,g$$$$$$$$$$$$$$$P.       ----------- 
  ,g$$P"     """Y$$.".        OS: Debian GNU/Linux 12 (bookworm) aarch64 
 ,$$P'              `$$$.     Host: Pine64 RK3566 SoQuartz with CM4-IO Carrier Board 
',$$P       ,ggs.     `$$b:   Kernel: 6.5.8 
`d$$'     ,$P"'   .    $$$    Uptime: 5 mins 
 $$P      d$'     ,    $$P    Packages: 208 (dpkg) 
 $$:      $$.   -    ,d$$'    Shell: bash 5.2.15 
 $$;      Y$b._   _,d$P'      Resolution: 1920x1080 
 Y$$.    `.`"Y$$$$P"'         Terminal: /dev/pts/0 
 `$$b      "-.__              CPU: (4) @ 1.800GHz 
  `Y$$                        Memory: 95MiB / 1913MiB 
   `Y$$.
     `$$b.                                            
       `Y$$b.                                         
          `"Y$b._
              `"""

# output of `uname -a`
Linux DietPi 6.5.8 #1 SMP PREEMPT Sat Oct 21 20:01:59 UTC 2023 aarch64 GNU/Linux

Benchmark results

CPU

Power

Disk

SanDisk Extreme 16GB A2 microSD

| Benchmark | Result |
| --- | --- |
| fio 1M sequential read | 23.9 MB/s |
| iozone 1M random read | 22.66 MB/s |
| iozone 1M random write | 21.66 MB/s |
| iozone 4K random read | 8.75 MB/s |
| iozone 4K random write | 2.05 MB/s |

curl https://raw.githubusercontent.com/geerlingguy/pi-cluster/master/benchmarks/disk-benchmark.sh | sudo bash

Run the benchmark on any attached storage device (e.g. eMMC, microSD, NVMe, SATA) and add results under an additional heading. Download the script with `curl -o disk-benchmark.sh [URL_HERE]` and run `sudo DEVICE_UNDER_TEST=/dev/sda DEVICE_MOUNT_PATH=/mnt/sda1 ./disk-benchmark.sh` (assuming the device is `sda`).
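As a rough sketch of the steps above, the helper below only prints the commands for one device so they can be reviewed before anything runs with `sudo`. The function name `disk_benchmark_commands` is made up for illustration, the URL is the one from the one-liner above, and the `chmod +x` step is an assumption on my part (a file saved by `curl -o` is not executable by default).

```shell
#!/usr/bin/env bash
# Dry-run sketch: print the commands needed to benchmark one storage device.
set -euo pipefail

# URL taken from the curl one-liner in this issue.
SCRIPT_URL="https://raw.githubusercontent.com/geerlingguy/pi-cluster/master/benchmarks/disk-benchmark.sh"

# $1 = block device (e.g. /dev/sda), $2 = path where its filesystem is mounted.
# Hypothetical helper; it prints commands and does not execute them.
disk_benchmark_commands() {
  local device="$1" mount_path="$2"
  printf 'curl -o disk-benchmark.sh %s\n' "$SCRIPT_URL"
  # Assumption: the downloaded script needs the executable bit set.
  printf 'chmod +x disk-benchmark.sh\n'
  printf 'sudo DEVICE_UNDER_TEST=%s DEVICE_MOUNT_PATH=%s ./disk-benchmark.sh\n' \
    "$device" "$mount_path"
}

disk_benchmark_commands /dev/sda /mnt/sda1
```

For another device, pass its node and mount point instead, for example `disk_benchmark_commands /dev/nvme0n1 /mnt/nvme` (both values are placeholders).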

Also consider running the PiBenchmarks.com script.

Network

iperf3 results:

(Be sure to test all interfaces, noting any that are non-functional.)
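A minimal sketch of what testing each interface might look like, assuming another machine on the LAN is already running `iperf3 -s`. The addresses are documentation placeholders and `iperf3_commands` is a hypothetical helper that only prints the client commands; binding to a local address with `-B` is one way to pin a run to a particular interface.

```shell
#!/usr/bin/env bash
# Sketch: print iperf3 client commands for both directions against one server.
set -euo pipefail

# $1 = iperf3 server address (placeholder),
# $2 = optional local address to bind to, which selects the interface under test.
iperf3_commands() {
  local server="$1" bind="${2:-}"
  local base="iperf3 -c $server"
  if [ -n "$bind" ]; then
    base="$base -B $bind"
  fi
  echo "$base"            # board sends, server receives
  echo "$base --reverse"  # server sends, board receives
}

# 192.0.2.x addresses are reserved for documentation; substitute real ones.
iperf3_commands 192.0.2.10 192.0.2.20
```

Running the printed commands once per interface address gives an upload and download figure for each, and an interface that never connects can be noted as non-functional.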

Memory

tinymembench results:

```
tinymembench v0.4.10 (simple benchmark for memory throughput and latency)

==========================================================================
== Memory bandwidth tests                                               ==
==                                                                      ==
== Note 1: 1MB = 1000000 bytes                                          ==
== Note 2: Results for 'copy' tests show how many bytes can be          ==
==         copied per second (adding together read and writen           ==
==         bytes would have provided twice higher numbers)              ==
== Note 3: 2-pass copy means that we are using a small temporary buffer ==
==         to first fetch data into it, and only then write it to the   ==
==         destination (source -> L1 cache, L1 cache -> destination)    ==
== Note 4: If sample standard deviation exceeds 0.1%, it is shown in    ==
==         brackets                                                     ==
==========================================================================

C copy backwards : 2065.0 MB/s
C copy backwards (32 byte blocks) : 1961.8 MB/s
C copy backwards (64 byte blocks) : 1780.9 MB/s (0.2%)
C copy : 2995.2 MB/s
C copy prefetched (32 bytes step) : 2013.3 MB/s
C copy prefetched (64 bytes step) : 3040.7 MB/s
C 2-pass copy : 2166.8 MB/s
C 2-pass copy prefetched (32 bytes step) : 1378.6 MB/s (0.2%)
C 2-pass copy prefetched (64 bytes step) : 1452.3 MB/s
C fill : 5762.5 MB/s
C fill (shuffle within 16 byte blocks) : 5763.6 MB/s
C fill (shuffle within 32 byte blocks) : 5763.6 MB/s
C fill (shuffle within 64 byte blocks) : 5760.9 MB/s
NEON 64x2 COPY : 2992.4 MB/s
NEON 64x2x4 COPY : 2992.4 MB/s
NEON 64x1x4_x2 COPY : 2979.5 MB/s
NEON 64x2 COPY prefetch x2 : 2573.9 MB/s
NEON 64x2x4 COPY prefetch x1 : 2388.4 MB/s
NEON 64x2 COPY prefetch x1 : 2523.1 MB/s
NEON 64x2x4 COPY prefetch x1 : 2388.4 MB/s
---
standard memcpy : 2974.3 MB/s
standard memset : 5768.3 MB/s
---
NEON LDP/STP copy : 2993.3 MB/s
NEON LDP/STP copy pldl2strm (32 bytes step) : 2309.6 MB/s
NEON LDP/STP copy pldl2strm (64 bytes step) : 2788.7 MB/s
NEON LDP/STP copy pldl1keep (32 bytes step) : 2312.9 MB/s
NEON LDP/STP copy pldl1keep (64 bytes step) : 3035.3 MB/s
NEON LD1/ST1 copy : 2991.7 MB/s
NEON STP fill : 5769.0 MB/s
NEON STNP fill : 2047.2 MB/s (0.4%)
ARM LDP/STP copy : 2993.5 MB/s
ARM STP fill : 5767.6 MB/s
ARM STNP fill : 2050.8 MB/s (0.6%)

==========================================================================
== Framebuffer read tests.                                              ==
==                                                                      ==
== Many ARM devices use a part of the system memory as the framebuffer, ==
== typically mapped as uncached but with write-combining enabled.       ==
== Writes to such framebuffers are quite fast, but reads are much       ==
== slower and very sensitive to the alignment and the selection of      ==
== CPU instructions which are used for accessing memory.                ==
==                                                                      ==
== Many x86 systems allocate the framebuffer in the GPU memory,         ==
== accessible for the CPU via a relatively slow PCI-E bus. Moreover,    ==
== PCI-E is asymmetric and handles reads a lot worse than writes.       ==
==                                                                      ==
== If uncached framebuffer reads are reasonably fast (at least 100 MB/s ==
== or preferably >300 MB/s), then using the shadow framebuffer layer    ==
== is not necessary in Xorg DDX drivers, resulting in a nice overall    ==
== performance improvement. For example, the xf86-video-fbturbo DDX     ==
== uses this trick.                                                     ==
==========================================================================

NEON LDP/STP copy (from framebuffer) : 2953.5 MB/s
NEON LDP/STP 2-pass copy (from framebuffer) : 2089.0 MB/s
NEON LD1/ST1 copy (from framebuffer) : 2952.2 MB/s
NEON LD1/ST1 2-pass copy (from framebuffer) : 2062.2 MB/s
ARM LDP/STP copy (from framebuffer) : 2952.8 MB/s
ARM LDP/STP 2-pass copy (from framebuffer) : 2089.0 MB/s

==========================================================================
== Memory latency test                                                  ==
==                                                                      ==
== Average time is measured for random memory accesses in the buffers   ==
== of different sizes. The larger is the buffer, the more significant   ==
== are relative contributions of TLB, L1/L2 cache misses and SDRAM      ==
== accesses. For extremely large buffer sizes we are expecting to see   ==
== page table walk with several requests to SDRAM for almost every      ==
== memory access (though 64MiB is not nearly large enough to experience ==
== this effect to its fullest).                                         ==
==                                                                      ==
== Note 1: All the numbers are representing extra time, which needs to  ==
==         be added to L1 cache latency. The cycle timings for L1 cache ==
==         latency can be usually found in the processor documentation. ==
== Note 2: Dual random read means that we are simultaneously performing ==
==         two independent memory accesses at a time. In the case if    ==
==         the memory subsystem can't handle multiple outstanding       ==
==         requests, dual random read has the same timings as two       ==
==         single reads performed one after another.                    ==
==========================================================================

block size : single random read / dual random read, [MADV_NOHUGEPAGE]
      1024 :    0.0 ns  /    0.0 ns
      2048 :    0.0 ns  /    0.0 ns
      4096 :    0.0 ns  /    0.0 ns
      8192 :    0.0 ns  /    0.0 ns
     16384 :    0.8 ns  /    1.4 ns
     32768 :    4.6 ns  /    7.3 ns
     65536 :   10.1 ns  /   14.1 ns
    131072 :   12.8 ns  /   16.9 ns
    262144 :   15.3 ns  /   18.5 ns
    524288 :   18.3 ns  /   21.3 ns
   1048576 :   84.4 ns  /  125.8 ns
   2097152 :  119.9 ns  /  160.2 ns
   4194304 :  138.5 ns  /  172.8 ns
   8388608 :  155.0 ns  /  188.3 ns
  16777216 :  164.8 ns  /  198.2 ns
  33554432 :  170.3 ns  /  205.3 ns
  67108864 :  173.8 ns  /  209.8 ns

block size : single random read / dual random read, [MADV_HUGEPAGE]
      1024 :    0.0 ns  /    0.0 ns
      2048 :    0.0 ns  /    0.0 ns
      4096 :    0.0 ns  /    0.0 ns
      8192 :    0.0 ns  /    0.0 ns
     16384 :    0.5 ns  /    1.0 ns
     32768 :    4.5 ns  /    7.0 ns
     65536 :   10.0 ns  /   14.0 ns
    131072 :   12.9 ns  /   17.0 ns
    262144 :   15.3 ns  /   18.5 ns
    524288 :   18.6 ns  /   21.3 ns
   1048576 :   84.5 ns  /  125.9 ns
   2097152 :  120.1 ns  /  160.4 ns
   4194304 :  137.7 ns  /  171.9 ns
   8388608 :  146.8 ns  /  176.1 ns
  16777216 :  151.4 ns  /  177.8 ns
  33554432 :  153.6 ns  /  178.5 ns
  67108864 :  154.7 ns  /  178.8 ns
```
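For comparing boards it can help to pull individual figures out of a saved run. The sketch below assumes the tinymembench output was redirected to a file with one result per line, as the tool prints it; `tmb_bandwidth` and the file name `tinymembench.txt` are hypothetical names, not part of tinymembench.

```shell
#!/usr/bin/env bash
# Sketch: extract the MB/s figure for one named test from saved tinymembench output.
set -euo pipefail

# $1 = exact test label (e.g. "standard memcpy"), $2 = file holding the output.
tmb_bandwidth() {
  local label="$1" file="$2"
  awk -F':' -v label="$label" '
    {
      name = $1
      gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim padding around the label
    }
    name == label {
      split($2, fields, " ")              # "2974.3 MB/s ..." -> fields[1] is the number
      print fields[1]
      exit
    }
  ' "$file"
}

# Only runs if a saved result file (hypothetical name) is present.
if [ -r tinymembench.txt ]; then
  tmb_bandwidth "standard memcpy" tinymembench.txt
  tmb_bandwidth "standard memset" tinymembench.txt
fi
```

The label has to match the left-hand column exactly, so `"C copy"` and `"C copy backwards"` are treated as different tests.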

sbc-bench results

Run sbc-bench and paste a link to the results here:

wget https://raw.githubusercontent.com/ThomasKaiser/sbc-bench/master/sbc-bench.sh
sudo /bin/bash ./sbc-bench.sh -r

Phoronix Test Suite

Results from pi-general-benchmark.sh:

geerlingguy commented 7 months ago

Attempting to do some more benchmarking using the DietPi Bookworm release linked from the SOQuartz Wiki.

DietPi seemed to boot perfectly on the first try, and HDMI output to my 1080p monitor was fine. It printed out the LAN IP, and I could SSH in as root/dietpi. Upon login, it ran apt updates. Unexpected, but okay I guess :P

The install wizard was a bit jarring, and ran me through a number of (IMO) fairly complex installation questions. I guess it's nice for some use cases, but I just wanted Debian and to move on with things :)

geerlingguy commented 7 months ago

Running Geekbench 6... I got an OOM error, and geekbench was killed. This module has 2 GB of RAM, and by default there's only 128 MB of swap.
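One common workaround for this kind of OOM kill is to add more swap before re-running the benchmark. The sketch below is a generic Linux approach, not something the thread confirms was done here and not DietPi-specific tooling. `mem_summary` and `add_swapfile` are hypothetical helper names, and the swap file path and 2048 MiB size are placeholder assumptions.

```shell
#!/usr/bin/env bash
# Sketch: report RAM and swap, and define a helper that adds a swap file.
set -euo pipefail

# Print total RAM and swap in MiB from a meminfo-style file.
# $1 = file to read (defaults to /proc/meminfo).
mem_summary() {
  local meminfo="${1:-/proc/meminfo}"
  awk '/^MemTotal:/  { ram = int($2 / 1024) }
       /^SwapTotal:/ { swap = int($2 / 1024) }
       END { printf "RAM: %d MiB, swap: %d MiB\n", ram, swap }' "$meminfo"
}

# Create and enable an extra swap file. Needs root, so it is defined here
# but never called automatically.
# $1 = swap file path (placeholder default), $2 = size in MiB (placeholder default).
add_swapfile() {
  local path="${1:-/var/swap-extra}" size_mib="${2:-2048}"
  dd if=/dev/zero of="$path" bs=1M count="$size_mib"
  chmod 600 "$path"   # swap files must not be readable by other users
  mkswap "$path"
  swapon "$path"
}

# Show the current totals when running on Linux.
if [ -r /proc/meminfo ]; then
  mem_summary
fi
```

After sourcing this as root, `add_swapfile /var/swap-extra 2048` would enable the extra swap until the next reboot; making it permanent would need an `/etc/fstab` entry. Swap on a microSD card is slow, so benchmark scores obtained this way should be read with that in mind.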