Open foxjony opened 2 weeks ago
The RK3328 result might be hardware accelerated, while the Zero 3W result might be pure CPU performance.
I ran a speed test on a Radxa Zero 3W with the radxa-zero3_debian_bullseye_xfce_b6.img firmware image, using sha256-benchmark. The test measures how long it takes to generate 1,000,000 SHA-256 hashes (shorter is better).
Result: 15500 ms (RK3566)
For comparison, NanoPi NEO3: 715 ms (RK3328)
Conclusion: very slow (not sure why).
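For readers who want to reproduce the shape of this measurement, here is a minimal sketch in Python. It is not the code of sha256-benchmark, whose implementation and input size are not shown in this thread. The function name `time_hashes` and the payload are made up for illustration. Python's `hashlib` is usually backed by OpenSSL, so this exercises a different code path than a Go program would, and interpreter overhead will inflate the time for small inputs.

```python
import hashlib
import time


def time_hashes(n=1_000_000, payload=b"benchmark"):
    """Return (elapsed milliseconds, final digest) for n chained SHA-256 hashes.

    Hypothetical helper: each digest is fed into the next hash so that
    every iteration does real work on a 32-byte input.
    """
    start = time.perf_counter()
    digest = payload
    for _ in range(n):
        digest = hashlib.sha256(digest).digest()
    elapsed_ms = (time.perf_counter() - start) * 1000
    return elapsed_ms, digest


if __name__ == "__main__":
    elapsed_ms, _ = time_hashes()
    print(f"1,000,000 SHA-256 hashes: {elapsed_ms:.0f} ms")
```

Running the same script on both boards gives a like-for-like number that does not depend on the Go toolchain.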
Hi @foxjony,
Can you show the output of lscpu on the NanoPi NEO3?
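The part of that output that matters here is the feature list: on ARM64 Linux, lscpu prints it on a "Flags:" line and /proc/cpuinfo on a "Features" line, and a core with the SHA-2 instructions lists `sha2` there. A small sketch for checking this, assuming that layout (the helper names `cpu_features` and `has_sha2` are made up for illustration):

```python
def cpu_features(text):
    """Collect flags from 'Flags:' (lscpu) or 'Features:' (/proc/cpuinfo) lines."""
    features = set()
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() in ("flags", "features"):
            features.update(value.split())
    return features


def has_sha2(text):
    """True if the CPU advertises the ARMv8 SHA-2 (SHA-256) instructions."""
    return "sha2" in cpu_features(text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("sha2 advertised:", has_sha2(f.read()))
    except OSError as exc:
        print("could not read /proc/cpuinfo:", exc)
```

A `sha2` flag only shows that the hardware offers the instructions. Whether a given library actually uses them is a separate question.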
@foxjony Please check the speed of the NanoPi NEO3 when running the following two commands. I suspect that the Go module you're using isn't successfully using the SHA-2 instructions.
rock@radxa-zero3:~$ openssl speed sha256
Doing sha256 for 3s on 16 size blocks: 10653225 sha256's in 3.00s
Doing sha256 for 3s on 64 size blocks: 8267065 sha256's in 3.00s
Doing sha256 for 3s on 256 size blocks: 4811747 sha256's in 3.00s
Doing sha256 for 3s on 1024 size blocks: 1796164 sha256's in 3.00s
Doing sha256 for 3s on 8192 size blocks: 263128 sha256's in 3.00s
Doing sha256 for 3s on 16384 size blocks: 133079 sha256's in 2.99s
OpenSSL 1.1.1w 11 Sep 2023
built on: Wed Sep 13 19:21:33 2023 UTC
options:bn(64,64) rc4(char) des(int) aes(partial) blowfish(ptr)
compiler: gcc -fPIC -pthread -Wa,--noexecstack -Wall -Wa,--noexecstack -g -O2 -ffile-prefix-map=/build/reproducible-path/openssl-1.1.1w=. -fstack-protector-strong -Wformat -Werror=format-security -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DVPAES_ASM -DECP_NISTZ256_ASM -DPOLY1305_ASM -DNDEBUG -Wdate-time -D_FORTIFY_SOURCE=2
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
sha256          56817.20k   176364.05k   410602.41k   613090.65k   718514.86k   729219.51k
rock@radxa-zero3:~$ OPENSSL_armcap=0 openssl speed sha256
Doing sha256 for 3s on 16 size blocks: 3882521 sha256's in 3.00s
Doing sha256 for 3s on 64 size blocks: 2213713 sha256's in 3.00s
Doing sha256 for 3s on 256 size blocks: 978564 sha256's in 3.00s
Doing sha256 for 3s on 1024 size blocks: 302378 sha256's in 3.00s
Doing sha256 for 3s on 8192 size blocks: 40604 sha256's in 3.00s
Doing sha256 for 3s on 16384 size blocks: 20385 sha256's in 3.00s
OpenSSL 1.1.1w 11 Sep 2023
built on: Wed Sep 13 19:21:33 2023 UTC
options:bn(64,64) rc4(char) des(int) aes(partial) blowfish(ptr)
compiler: gcc -fPIC -pthread -Wa,--noexecstack -Wall -Wa,--noexecstack -g -O2 -ffile-prefix-map=/build/reproducible-path/openssl-1.1.1w=. -fstack-protector-strong -Wformat -Werror=format-security -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DVPAES_ASM -DECP_NISTZ256_ASM -DPOLY1305_ASM -DNDEBUG -Wdate-time -D_FORTIFY_SOURCE=2
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
sha256          20706.78k    47225.88k    83504.13k   103211.69k   110875.99k   111329.28k
rock@radxa-zero3:~$
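To put the two runs side by side, the sketch below divides the default throughput by the OPENSSL_armcap=0 throughput for each block size, and converts the 16-byte hash counts into milliseconds per 1,000,000 hashes. The numbers are copied from the output above; the helper names are made up for illustration. The gap works out to roughly 2.7x at 16 bytes and roughly 6.5x at 16384 bytes, and to roughly 282 ms versus 773 ms per million 16-byte hashes. The input size used by sha256-benchmark is not stated in this thread, so comparing these figures with the reported 15500 ms is only a rough guide.

```python
SIZES = [16, 64, 256, 1024, 8192, 16384]
# Throughput in 1000s of bytes per second, copied from the two runs above.
ACCEL = [56817.20, 176364.05, 410602.41, 613090.65, 718514.86, 729219.51]
PLAIN = [20706.78, 47225.88, 83504.13, 103211.69, 110875.99, 111329.28]


def speedups(accel, plain):
    """Ratio of default throughput to OPENSSL_armcap=0 throughput per block size."""
    return [a / p for a, p in zip(accel, plain)]


def ms_per_million(count, seconds):
    """Milliseconds needed for 1,000,000 hashes at the measured rate."""
    return 1_000_000 / (count / seconds) * 1000


if __name__ == "__main__":
    for size, ratio in zip(SIZES, speedups(ACCEL, PLAIN)):
        print(f"{size:>6} bytes: {ratio:.2f}x")
    # Hash counts for 16-byte blocks, taken from the "Doing sha256" lines.
    print(f"default:          {ms_per_million(10653225, 3.00):.0f} ms per 1M hashes")
    print(f"OPENSSL_armcap=0: {ms_per_million(3882521, 3.00):.0f} ms per 1M hashes")
```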