geerlingguy / sbc-reviews

Jeff Geerling's SBC review data - Raspberry Pi, Radxa, Orange Pi, etc.
MIT License

Raspberry Pi 4 model B #4

Open geerlingguy opened 1 year ago

geerlingguy commented 1 year ago

Basic information

Linux/system information

# output of `neofetch`
       _,met$$$$$gg.          pi@hqcam 
    ,g$$$$$$$$$$$$$$$P.       -------- 
  ,g$$P"     """Y$$.".        OS: Debian GNU/Linux 11 (bullseye) aarch64 
 ,$$P'              `$$$.     Host: Raspberry Pi 4 Model B Rev 1.4 
',$$P       ,ggs.     `$$b:   Kernel: 5.15.84-v8+ 
`d$$'     ,$P"'   .    $$$    Uptime: 16 secs 
 $$P      d$'     ,    $$P    Packages: 597 (dpkg) 
 $$:      $$.   -    ,d$$'    Shell: bash 5.1.4 
 $$;      Y$b._   _,d$P'      Terminal: /dev/pts/0 
 Y$$.    `.`"Y$$$$P"'         CPU: BCM2835 (4) @ 1.800GHz 
 `$$b      "-.__              Memory: 94MiB / 7812MiB 
  `Y$$
   `Y$$.                                              
     `$$b.                                            
       `Y$$b.
          `"Y$b._
              `"""

# output of `uname -a`
Linux hqcam 5.15.84-v8+ #1613 SMP PREEMPT Thu Jan 5 12:03:08 GMT 2023 aarch64 GNU/Linux

Benchmark results

CPU

Power

Disk

SanDisk Extreme 32GB A1

| Benchmark | Result |
| --- | --- |
| fio 1M sequential read | 46.0 MB/s |
| iozone 1M random read | 42.56 MB/s |
| iozone 1M random write | 36.16 MB/s |
| iozone 4K random read | 10.24 MB/s |
| iozone 4K random write | 5.01 MB/s |

curl https://raw.githubusercontent.com/geerlingguy/pi-cluster/master/benchmarks/disk-benchmark.sh | sudo bash

Run the benchmark on any attached storage device (e.g. eMMC, microSD, NVMe, SATA) and add the results under an additional heading. Download the script with curl -o disk-benchmark.sh [URL_HERE] and run sudo DEVICE_UNDER_TEST=/dev/sda DEVICE_MOUNT_PATH=/mnt/sda1 ./disk-benchmark.sh (assuming the device is sda).

Also consider running the PiBenchmarks.com script.

Network

iperf3 results:

Ethernet

WiFi

GPU

glmark2-es2 result:

=======================================================
    glmark2 2023.01
=======================================================
    OpenGL Information
    GL_VENDOR:      Broadcom
    GL_RENDERER:    V3D 4.2
    GL_VERSION:     OpenGL ES 3.1 Mesa 23.2.1-1~bpo12+rpt3
    Surface Config: buf=32 r=8 g=8 b=8 a=8 depth=24 stencil=0 samples=0
    Surface Size:   800x600 windowed
=======================================================
[build] use-vbo=false: FPS: 845 FrameTime: 1.184 ms
[build] use-vbo=true: FPS: 1335 FrameTime: 0.749 ms
[texture] texture-filter=nearest: FPS: 1130 FrameTime: 0.886 ms
[texture] texture-filter=linear: FPS: 1108 FrameTime: 0.903 ms
[texture] texture-filter=mipmap: FPS: 1099 FrameTime: 0.911 ms
[shading] shading=gouraud: FPS: 1081 FrameTime: 0.925 ms
[shading] shading=blinn-phong-inf: FPS: 885 FrameTime: 1.131 ms
[shading] shading=phong: FPS: 699 FrameTime: 1.432 ms
[shading] shading=cel: FPS: 664 FrameTime: 1.507 ms
[bump] bump-render=high-poly: FPS: 575 FrameTime: 1.740 ms
[bump] bump-render=normals: FPS: 1140 FrameTime: 0.878 ms
[bump] bump-render=height: FPS: 1061 FrameTime: 0.943 ms
[effect2d] kernel=0,1,0;1,-4,1;0,1,0;: FPS: 422 FrameTime: 2.374 ms
[effect2d] kernel=1,1,1,1,1;1,1,1,1,1;1,1,1,1,1;: FPS: 225 FrameTime: 4.459 ms
[pulsar] light=false:quads=5:texture=false: FPS: 1177 FrameTime: 0.850 ms
[desktop] blur-radius=5:effect=blur:passes=1:separable=true:windows=4: FPS: 125 FrameTime: 8.011 ms
[desktop] effect=shadow:windows=4: FPS: 435 FrameTime: 2.302 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 156 FrameTime: 6.419 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=subdata: FPS: 162 FrameTime: 6.207 ms
[buffer] columns=200:interleave=true:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 199 FrameTime: 5.028 ms
[ideas] speed=duration: FPS: 744 FrameTime: 1.345 ms
[jellyfish] <default>: FPS: 419 FrameTime: 2.390 ms
[terrain] <default>: FPS: 28 FrameTime: 36.164 ms
[shadow] <default>: FPS: 112 FrameTime: 8.937 ms
[refract] <default>: FPS: 43 FrameTime: 23.685 ms
[conditionals] fragment-steps=0:vertex-steps=0: FPS: 1254 FrameTime: 0.798 ms
[conditionals] fragment-steps=5:vertex-steps=0: FPS: 692 FrameTime: 1.445 ms
[conditionals] fragment-steps=0:vertex-steps=5: FPS: 1205 FrameTime: 0.830 ms
[function] fragment-complexity=low:fragment-steps=5: FPS: 965 FrameTime: 1.037 ms
[function] fragment-complexity=medium:fragment-steps=5: FPS: 611 FrameTime: 1.638 ms
[loop] fragment-loop=false:fragment-steps=5:vertex-steps=5: FPS: 922 FrameTime: 1.085 ms
[loop] fragment-steps=5:fragment-uniform=false:vertex-steps=5: FPS: 927 FrameTime: 1.079 ms
[loop] fragment-steps=5:fragment-uniform=true:vertex-steps=5: FPS: 589 FrameTime: 1.700 ms
=======================================================
                                  glmark2 Score: 697 
=======================================================
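
For anyone who wants to compare this run against another board, here is a minimal Python sketch for pulling the numbers out of glmark2's text output. The regular expression and the helper names (`parse_glmark2`, `frame_time_error`) are my own assumptions based on the line format shown above, not part of glmark2. It also checks two things that appear to hold for this run: the frame time is roughly 1000 / FPS, and the mean of the printed FPS values lands within one point of the reported score of 697.

```python
import re

# Four lines copied verbatim from the glmark2 output above.
SAMPLE = """\
[build] use-vbo=false: FPS: 845 FrameTime: 1.184 ms
[effect2d] kernel=0,1,0;1,-4,1;0,1,0;: FPS: 422 FrameTime: 2.374 ms
[terrain] <default>: FPS: 28 FrameTime: 36.164 ms
[refract] <default>: FPS: 43 FrameTime: 23.685 ms
"""

# FPS column of all 33 scenes, in the order they appear above.
ALL_FPS = [
    845, 1335, 1130, 1108, 1099, 1081, 885, 699, 664, 575, 1140,
    1061, 422, 225, 1177, 125, 435, 156, 162, 199, 744, 419,
    28, 112, 43, 1254, 692, 1205, 965, 611, 922, 927, 589,
]

# Assumed line shape: "[scene] options: FPS: N FrameTime: X ms"
LINE_RE = re.compile(r"^\[(\w+)\] (.+): FPS: (\d+) FrameTime: ([\d.]+) ms$")


def parse_glmark2(text):
    """Return (scene, options, fps, frame_time_ms) for each result line."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            scene, options, fps, frame_ms = match.groups()
            rows.append((scene, options, int(fps), float(frame_ms)))
    return rows


def frame_time_error(fps, frame_ms):
    """Relative gap between the reported frame time and 1000 / FPS."""
    return abs(1000.0 / fps - frame_ms) / frame_ms


rows = parse_glmark2(SAMPLE)
mean_fps = sum(ALL_FPS) / len(ALL_FPS)
print(f"parsed {len(rows)} rows, mean FPS over {len(ALL_FPS)} scenes: {mean_fps:.1f}")
```

The mean comes out at 698.0 rather than 697, which is what you would expect if the score is averaged from unrounded per-scene values; treat that explanation as a guess, since the output alone does not confirm how the score is computed.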

Memory

tinymembench results:

```
tinymembench v0.4.10 (simple benchmark for memory throughput and latency)

==========================================================================
== Memory bandwidth tests                                               ==
==                                                                      ==
== Note 1: 1MB = 1000000 bytes                                          ==
== Note 2: Results for 'copy' tests show how many bytes can be          ==
==         copied per second (adding together read and writen           ==
==         bytes would have provided twice higher numbers)              ==
== Note 3: 2-pass copy means that we are using a small temporary buffer ==
==         to first fetch data into it, and only then write it to the   ==
==         destination (source -> L1 cache, L1 cache -> destination)    ==
== Note 4: If sample standard deviation exceeds 0.1%, it is shown in    ==
==         brackets                                                     ==
==========================================================================

 C copy backwards                            : 2747.5 MB/s (1.7%)
 C copy backwards (32 byte blocks)           : 2757.0 MB/s (0.1%)
 C copy backwards (64 byte blocks)           : 2749.6 MB/s
 C copy                                      : 2731.0 MB/s
 C copy prefetched (32 bytes step)           : 2726.8 MB/s
 C copy prefetched (64 bytes step)           : 2727.8 MB/s
 C 2-pass copy                               : 2189.6 MB/s (0.4%)
 C 2-pass copy prefetched (32 bytes step)    : 2307.0 MB/s
 C 2-pass copy prefetched (64 bytes step)    : 2292.3 MB/s (0.3%)
 C fill                                      : 3126.3 MB/s (1.3%)
 C fill (shuffle within 16 byte blocks)      : 3122.2 MB/s (0.9%)
 C fill (shuffle within 32 byte blocks)      : 3105.8 MB/s (0.9%)
 C fill (shuffle within 64 byte blocks)      : 3110.4 MB/s (0.9%)
 NEON 64x2 COPY                              : 2735.7 MB/s
 NEON 64x2x4 COPY                            : 2734.0 MB/s
 NEON 64x1x4_x2 COPY                         : 1099.1 MB/s (0.2%)
 NEON 64x2 COPY prefetch x2                  : 2728.2 MB/s
 NEON 64x2x4 COPY prefetch x1                : 2725.5 MB/s
 NEON 64x2 COPY prefetch x1                  : 2726.2 MB/s
 NEON 64x2x4 COPY prefetch x1                : 2728.5 MB/s
 ---
 standard memcpy                             : 2737.5 MB/s
 standard memset                             : 3102.7 MB/s (0.9%)
 ---
 NEON LDP/STP copy                           : 2731.7 MB/s
 NEON LDP/STP copy pldl2strm (32 bytes step) : 2717.2 MB/s
 NEON LDP/STP copy pldl2strm (64 bytes step) : 2718.5 MB/s
 NEON LDP/STP copy pldl1keep (32 bytes step) : 2728.9 MB/s
 NEON LDP/STP copy pldl1keep (64 bytes step) : 2731.1 MB/s
 NEON LD1/ST1 copy                           : 2733.4 MB/s
 NEON STP fill                               : 3111.4 MB/s (1.1%)
 NEON STNP fill                              : 2701.2 MB/s (0.9%)
 ARM LDP/STP copy                            : 2735.1 MB/s
 ARM STP fill                                : 3084.1 MB/s (0.9%)
 ARM STNP fill                               : 2640.1 MB/s (1.3%)

==========================================================================
== Memory latency test                                                  ==
==                                                                      ==
== Average time is measured for random memory accesses in the buffers   ==
== of different sizes. The larger is the buffer, the more significant   ==
== are relative contributions of TLB, L1/L2 cache misses and SDRAM      ==
== accesses. For extremely large buffer sizes we are expecting to see   ==
== page table walk with several requests to SDRAM for almost every      ==
== memory access (though 64MiB is not nearly large enough to experience ==
== this effect to its fullest).                                         ==
==                                                                      ==
== Note 1: All the numbers are representing extra time, which needs to  ==
==         be added to L1 cache latency. The cycle timings for L1 cache ==
==         latency can be usually found in the processor documentation. ==
== Note 2: Dual random read means that we are simultaneously performing ==
==         two independent memory accesses at a time. In the case if    ==
==         the memory subsystem can't handle multiple outstanding       ==
==         requests, dual random read has the same timings as two       ==
==         single reads performed one after another.                    ==
==========================================================================

block size : single random read / dual random read
      1024 :    0.0 ns          /     0.0 ns
      2048 :    0.0 ns          /     0.0 ns
      4096 :    0.0 ns          /     0.0 ns
      8192 :    0.0 ns          /     0.0 ns
     16384 :    0.0 ns          /     0.0 ns
     32768 :    0.0 ns          /     0.0 ns
     65536 :    4.7 ns          /     7.4 ns
    131072 :    7.2 ns          /     9.9 ns
    262144 :   10.3 ns          /    13.2 ns
    524288 :   11.9 ns          /    15.1 ns
   1048576 :   22.7 ns          /    34.8 ns
   2097152 :   80.9 ns          /   117.8 ns
   4194304 :  108.9 ns          /   140.9 ns
   8388608 :  129.4 ns          /   161.1 ns
  16777216 :  139.8 ns          /   170.3 ns
  33554432 :  145.1 ns          /   175.4 ns
  67108864 :  156.5 ns          /   191.4 ns
```
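
The latency notes above say the extra time grows as the buffer outgrows each cache level. As a rough illustration, the Python sketch below takes the single-random-read column from this run and looks for where that happens. The helper names are hypothetical, and reading cache sizes off these numbers is an inference, not something tinymembench reports.

```python
# (buffer size in bytes, extra single-random-read latency in ns),
# copied from the tinymembench latency table above.
LATENCY = [
    (1024, 0.0), (2048, 0.0), (4096, 0.0), (8192, 0.0),
    (16384, 0.0), (32768, 0.0), (65536, 4.7), (131072, 7.2),
    (262144, 10.3), (524288, 11.9), (1048576, 22.7), (2097152, 80.9),
    (4194304, 108.9), (8388608, 129.4), (16777216, 139.8),
    (33554432, 145.1), (67108864, 156.5),
]


def largest_zero_latency_buffer(samples):
    """Largest buffer that still shows no extra latency over L1."""
    return max(size for size, ns in samples if ns == 0.0)


def biggest_step(samples):
    """Return (increase_ns, smaller_size, larger_size) for the largest jump
    in latency between two adjacent buffer sizes."""
    steps = [
        (b_ns - a_ns, a_size, b_size)
        for (a_size, a_ns), (b_size, b_ns) in zip(samples, samples[1:])
    ]
    return max(steps)


l1_fit = largest_zero_latency_buffer(LATENCY)
step_ns, below, above = biggest_step(LATENCY)
print(f"no extra latency up to {l1_fit // 1024} KiB")
print(f"largest jump (+{step_ns:.1f} ns) between {below // 1024} KiB and {above // 1024} KiB")
```

On this data the extra latency stays at zero up to 32 KiB and the largest jump falls between 1 MiB and 2 MiB, which is consistent with the 32 KiB L1 data cache and 1 MiB shared L2 usually quoted for the Pi 4's BCM2711.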

Phoronix Test Suite

Results of the pi-general-benchmark.sh:

Other Data

Crypto performance as measured by OpenSSL (see sbc-bench ARMv8 Crypto Extensions):

pi@raspberrypi:~ $ openssl speed -elapsed -evp aes-256-cbc
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 5145475 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 1378033 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 351656 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 88374 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 11062 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 16384 size blocks: 5531 aes-256-cbc's in 3.00s
OpenSSL 1.1.1n  15 Mar 2022
built on: Wed Feb  8 14:21:54 2023 UTC
options:bn(64,64) rc4(char) des(int) aes(partial) blowfish(ptr) 
compiler: gcc -fPIC -pthread -Wa,--noexecstack -Wall -Wa,--noexecstack -g -O2 -ffile-prefix-map=/build/openssl-ysjt2m/openssl-1.1.1n=. -fstack-protector-strong -Wformat -Werror=format-security -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DVPAES_ASM -DBSAES_ASM -DECP_NISTZ256_ASM -DPOLY1305_ASM -DNDEBUG -Wdate-time -D_FORTIFY_SOURCE=2
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-cbc      27442.53k    29398.04k    30007.98k    30164.99k    30206.63k    30206.63k
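
The summary row can be reproduced from the "Doing aes-256-cbc" lines: block size times operations, divided by elapsed seconds, then divided by 1000 to match the "1000s of bytes per second" unit. A small Python sketch of that arithmetic, with the function name `throughput_k` being my own:

```python
# (block size in bytes, operations completed, elapsed seconds),
# copied from the "Doing aes-256-cbc ..." lines above.
RUNS = [
    (16, 5145475, 3.00),
    (64, 1378033, 3.00),
    (256, 351656, 3.00),
    (1024, 88374, 3.00),
    (8192, 11062, 3.00),
    (16384, 5531, 3.00),
]

# Values from the summary row, in 1000s of bytes per second.
REPORTED_K = [27442.53, 29398.04, 30007.98, 30164.99, 30206.63, 30206.63]


def throughput_k(block_size, operations, seconds):
    """Throughput in 1000s of bytes per second, the unit openssl speed prints."""
    return block_size * operations / seconds / 1000


computed = [throughput_k(*run) for run in RUNS]
for (block_size, _, _), value, reported in zip(RUNS, computed, REPORTED_K):
    print(f"{block_size:>6} bytes: computed {value:.2f}k, reported {reported:.2f}k")
```

Every computed value agrees with the reported one to within 0.01k, so the result here is roughly 27 to 30 MB/s for AES-256-CBC regardless of block size.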

geerlingguy commented 1 year ago

Benchmark of SD card on PiBenchmarks.com: https://pibenchmarks.com/benchmark/67793/

arekm commented 9 months ago

I wonder about one thing. You seem never to have experienced the RPi 4 WiFi instability (a WiFi-only lockup that requires a hard reset to get WiFi working again). It's a daily (or rather hourly) experience here on every RPi 4 board (though I only have 4GB, rev 1.5 boards, with the official PSU). Leaving iperf -t0 running is enough to trigger it.

k-koeberl commented 8 months ago

Maybe this is the wrong place for this question, but some of you may have seen the same effect I have. I have been using SanDisk Ultra Fit drives from 32GB up to 512GB for a long time on my RPi 3 and RPi 4 boards, and they work fine. I recently ordered "new" 256GB Ultra Fits, and they are unusably slow.

These are the parameters of the "old" ones that work well: SanDisk Ultra Fit 256GB (made in China)

Bus 002 Device 008: ID 0781:5583 SanDisk Corp. Ultra Fit
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               3.00
  bDeviceClass            0
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0         9
  idVendor           0x0781 SanDisk Corp.
  idProduct          0x5583 Ultra Fit
  bcdDevice            1.00
  iManufacturer           1 SanDisk
  iProduct                2 Ultra Fit
  iSerial                 3 0401a026bcbb31d7dc3a43a4fd6a281e69edd0bfafede1b68eed300273ea9b981c960000000000000000000090719afaff806e18835581077c271bb5
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x002c
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0
    bmAttributes         0x80
      (Bus Powered)
    MaxPower              896mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass         8 Mass Storage
      bInterfaceSubClass      6 SCSI
      bInterfaceProtocol     80 Bulk-Only
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0400  1x 1024 bytes
        bInterval               0
        bMaxBurst               1
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0400  1x 1024 bytes
        bInterval               0
        bMaxBurst              15
Binary Object Store Descriptor:
  bLength                 5
  bDescriptorType        15
  wTotalLength       0x0016
  bNumDeviceCaps          2
  USB 2.0 Extension Device Capability:
    bLength                 7
    bDescriptorType        16
    bDevCapabilityType      2
    bmAttributes   0x00000002
      HIRD Link Power Management (LPM) Supported
  SuperSpeed USB Device Capability:
    bLength                10
    bDescriptorType        16
    bDevCapabilityType      3
    bmAttributes         0x00
    wSpeedsSupported   0x000e
      Device can operate at Full Speed (12Mbps)
      Device can operate at High Speed (480Mbps)
      Device can operate at SuperSpeed (5Gbps)
    bFunctionalitySupport   1
      Lowest fully-functional device speed is Full Speed (12Mbps)
    bU1DevExitLat          10 micro seconds
    bU2DevExitLat         256 micro seconds
Device Status:     0x0000
  (Bus Powered)

These are the "new" ones that are unusable under Linux: SanDisk Ultra Fit 256GB (made in Taiwan)

Bus 002 Device 007: ID 0781:55b1 SanDisk Corp. SanDisk 3.2 Gen1
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               3.20
  bDeviceClass            0
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0         9
  idVendor           0x0781 SanDisk Corp.
  idProduct          0x55b1
  bcdDevice            1.10
  iManufacturer           1 SanDisk
  iProduct                2 SanDisk 3.2 Gen1
  iSerial                 3 A2003921444C4129
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x002c
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0
    bmAttributes         0x80
      (Bus Powered)
    MaxPower              896mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass         8 Mass Storage
      bInterfaceSubClass      6 SCSI
      bInterfaceProtocol     80 Bulk-Only
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0400  1x 1024 bytes
        bInterval               0
        bMaxBurst               3
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0400  1x 1024 bytes
        bInterval               0
        bMaxBurst               3
Binary Object Store Descriptor:
  bLength                 5
  bDescriptorType        15
  wTotalLength       0x0016
  bNumDeviceCaps          2
  USB 2.0 Extension Device Capability:
    bLength                 7
    bDescriptorType        16
    bDevCapabilityType      2
    bmAttributes   0x00000006
      BESL Link Power Management (LPM) Supported
  SuperSpeed USB Device Capability:
    bLength                10
    bDescriptorType        16
    bDevCapabilityType      3
    bmAttributes         0x00
    wSpeedsSupported   0x000e
      Device can operate at Full Speed (12Mbps)
      Device can operate at High Speed (480Mbps)
      Device can operate at SuperSpeed (5Gbps)
    bFunctionalitySupport   2
      Lowest fully-functional device speed is High Speed (480Mbps)
    bU1DevExitLat          10 micro seconds
    bU2DevExitLat        2047 micro seconds
Device Status:     0x000c
  (Bus Powered)
  U1 Enabled
  U2 Enabled

Under Windows, both work at the same speed and without any problems. I also tried to get further information from the vendor, but their only comment was that the USB flash drives are certified only for Windows. Has anyone seen similar effects? Are there any parameters that could make the new SanDisk Ultra Fits work under Linux?
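
To make the two descriptor dumps easier to compare, here is a short Python sketch that lists only the fields that differ. The values are copied by hand from the two lsusb outputs above; the dictionary keys and the `diff_fields` helper are hypothetical labels of mine, and the comparison narrows down what changed between hardware revisions without showing which difference, if any, causes the slowdown.

```python
# Selected fields from the "old" drive (0781:5583), copied from the dump above.
OLD_DRIVE = {
    "idProduct": "0x5583",
    "bcdUSB": "3.00",
    "bcdDevice": "1.00",
    "iProduct": "Ultra Fit",
    "MaxPower": "896mA",
    "EP 1 IN bMaxBurst": 1,
    "EP 2 OUT bMaxBurst": 15,
    "USB 2.0 Extension bmAttributes": "0x00000002",  # HIRD LPM
    "bFunctionalitySupport": 1,
    "bU1DevExitLat (us)": 10,
    "bU2DevExitLat (us)": 256,
    "Device Status": "0x0000",
}

# The same fields from the "new" drive (0781:55b1).
NEW_DRIVE = {
    "idProduct": "0x55b1",
    "bcdUSB": "3.20",
    "bcdDevice": "1.10",
    "iProduct": "SanDisk 3.2 Gen1",
    "MaxPower": "896mA",
    "EP 1 IN bMaxBurst": 3,
    "EP 2 OUT bMaxBurst": 3,
    "USB 2.0 Extension bmAttributes": "0x00000006",  # BESL LPM
    "bFunctionalitySupport": 2,
    "bU1DevExitLat (us)": 10,
    "bU2DevExitLat (us)": 2047,
    "Device Status": "0x000c",  # U1 Enabled, U2 Enabled
}


def diff_fields(old, new):
    """Return {field: (old_value, new_value)} for fields whose values differ."""
    return {key: (old[key], new[key]) for key in old if old[key] != new[key]}


changed = diff_fields(OLD_DRIVE, NEW_DRIVE)
for field, (old_value, new_value) in changed.items():
    print(f"{field}: {old_value} -> {new_value}")
```

Apart from the product ID and version strings, the differences are the endpoint burst sizes, the link power management capability (HIRD versus BESL), the U2 exit latency, and the U1/U2 bits in the device status. A dump for a single drive can be regenerated with `lsusb -v -d 0781:55b1`.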