Closed potuz closed 1 year ago
Fantastic! Passes tests on rpi4b+, once I disable the crypto-extension ones:
Test hash_armv8_neon_one_block... [ OK ]
Test hash_armv8_neon_multiple_blocks... [ OK ]
Test hash_armv8_neon_x4... [ OK ]
SUCCESS: All unit tests have passed.
Benchmark, same raspi:
over_voltage=4
arm_freq=1900
[==========] Running 4 benchmarks.
[ RUN ] armv8.neon_x1_one_at_time
[ OK ] armv8.neon_x1_one_at_time (mean 63.132ms, confidence interval +- 0.150820%)
[ RUN ] armv8.neon_x1
[ OK ] armv8.neon_x1 (mean 62.639ms, confidence interval +- 0.055984%)
[ RUN ] armv8.neon_x4
[ OK ] armv8.neon_x4 (mean 46.197ms, confidence interval +- 0.040853%)
[ RUN ] openssl.openssl_one_at_time
[ OK ] openssl.openssl_one_at_time (mean 161.603ms, confidence interval +- 0.166258%)
[==========] 4 benchmarks ran.
[ PASSED ] 4 benchmarks.
The tests do not check at runtime for SHA instructions, so they will die with SIGILL on machines that lack the SHA2 extensions (e.g. RPi 4), unlike the amd64 counterpart, which notifies the user.
This small patch avoids the problematic parts when testing on a Raspberry Pi, FWIW:
diff --git a/src/bench.c b/src/bench.c
index 5548125..6058884 100644
--- a/src/bench.c
+++ b/src/bench.c
@@ -74,6 +74,8 @@ UBENCH_EX(armv8, neon_x4) {
free(digest);
}
+#ifdef __ARM_FEATURE_CRYPTO
+
UBENCH_EX(armv8, crypto) {
int * buffer = (int *) malloc(buffer_size);
unsigned char * digest = (unsigned char *) malloc(buffer_size/2);
@@ -87,6 +89,8 @@ UBENCH_EX(armv8, crypto) {
free(digest);
}
+#endif
+
#endif
#ifdef __x86_64__
UBENCH_EX(sse, sse_x1_one_at_time) {
diff --git a/src/test.c b/src/test.c
index 66f803e..69439e4 100644
--- a/src/test.c
+++ b/src/test.c
@@ -622,7 +622,9 @@ TEST_LIST = {
{"hash_armv8_neon_one_block", test_hash_armv8_neon_x1_one_block},
{"hash_armv8_neon_multiple_blocks", test_hash_armv8_neon_x1_multiple_blocks},
{"hash_armv8_neon_x4", test_armv8_neon_x4},
+#ifdef __ARM_FEATURE_CRYPTO
{"hash_armv8_crypto_multiple_blocks", test_hash_armv8_crypto_multiple_blocks},
+#endif
#endif
{NULL, NULL}
};
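The patch above only guards the crypto tests at compile time via `__ARM_FEATURE_CRYPTO`. For the runtime check mentioned earlier, a minimal sketch on Linux/AArch64 could query the kernel's hardware capability bits with `getauxval(AT_HWCAP)` and test `HWCAP_SHA2`. The helper names `cpu_has_armv8_sha2` and `should_run_crypto_test` are hypothetical and not part of hashtree. On any other platform this sketch simply reports "unsupported", so a different detection mechanism would be needed there.

```c
#include <stdio.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>   /* getauxval, AT_HWCAP */
#include <asm/hwcap.h>  /* HWCAP_SHA2 */
#endif

/* Hypothetical helper (not part of hashtree): returns 1 if the kernel
 * reports the ARMv8 SHA-256 instructions as available, 0 otherwise. */
int cpu_has_armv8_sha2(void) {
#if defined(__linux__) && defined(__aarch64__) && defined(HWCAP_SHA2)
    return (getauxval(AT_HWCAP) & HWCAP_SHA2) != 0;
#else
    return 0; /* assumption: treat every other platform as unsupported */
#endif
}

/* Hypothetical helper: returns 1 if a crypto-extension test may run;
 * otherwise prints a notice to stderr and returns 0, so the caller can
 * skip the test instead of hitting SIGILL. */
int should_run_crypto_test(const char *name) {
    if (cpu_has_armv8_sha2()) {
        return 1;
    }
    fprintf(stderr, "Skipping %s: CPU lacks SHA2 extensions\n", name);
    return 0;
}
```

A test such as `test_hash_armv8_crypto_multiple_blocks` could call `should_run_crypto_test` first and return early when it yields 0, which would mirror the notify-the-user behaviour described for amd64.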
Adding benchmarks here, courtesy of Joe Clapis:
rock@proteus:~/hashtree$ HAVE_OPENSSL=1 make bench
make -C src bench
make[1]: Entering directory '/home/rock/hashtree/src'
cc -g -fpic -c -o sha256_armv8_neon_x4.o sha256_armv8_neon_x4.S
cc -g -fpic -c -o sha256_armv8_neon_x1.o sha256_armv8_neon_x1.S
cc -g -fpic -c -o sha256_armv8_crypto.o sha256_armv8_crypto.S
ar rcs libhashtree.a sha256_armv8_neon_x4.o sha256_armv8_neon_x1.o sha256_armv8_crypto.o
cc -g -Wall -Werror -DHAVE_OPENSSL -L . -o bench bench.c -lhashtree -lm -lcrypto
make[1]: Leaving directory '/home/rock/hashtree/src'
rock@proteus:~/hashtree$ src/bench
[==========] Running 5 benchmarks.
[ RUN ] armv8.neon_x1_one_at_time
[ OK ] armv8.neon_x1_one_at_time (mean 38.769ms, confidence interval +- 0.008991%)
[ RUN ] armv8.neon_x1
[ OK ] armv8.neon_x1 (mean 38.197ms, confidence interval +- 0.016511%)
[ RUN ] armv8.neon_x4
[ OK ] armv8.neon_x4 (mean 24.244ms, confidence interval +- 0.009266%)
[ RUN ] armv8.crypto
[ OK ] armv8.crypto (mean 6.049ms, confidence interval +- 0.011850%)
[ RUN ] openssl.openssl_one_at_time
[ OK ] openssl.openssl_one_at_time (mean 10.020ms, confidence interval +- 0.521858%)
[==========] 5 benchmarks ran.
[ PASSED ] 5 benchmarks.
This means we're considerably beating OpenSSL, with both using the crypto extensions: 6.049ms against 10.020ms, roughly 1.66x faster.
(For reference this was on a Rock 5B which has SHA2 support)
This PR implements a hashing routine for AArch64 with crypto extensions (e.g. Apple Silicon M1).
Fixes: #6