prysmaticlabs / hashtree

SHA256 library highly optimized for Merkle tree computations
MIT License

Implement AARCH64 with SHA2 #8

Closed potuz closed 1 year ago

potuz commented 1 year ago

This PR implements a hashing routine for AARCH64 with crypto extensions (e.g. Apple Silicon M1).

Fixes: #6

arnetheduck commented 1 year ago

Fantastic! Passes tests on rpi4b+, once I disable the crypto-extension ones:

Test hash_armv8_neon_one_block...               [ OK ]
Test hash_armv8_neon_multiple_blocks...         [ OK ]
Test hash_armv8_neon_x4...                      [ OK ]
SUCCESS: All unit tests have passed.

Benchmark, same raspi:

over_voltage=4
arm_freq=1900
[==========] Running 4 benchmarks.
[ RUN      ] armv8.neon_x1_one_at_time
[       OK ] armv8.neon_x1_one_at_time (mean 63.132ms, confidence interval +- 0.150820%)
[ RUN      ] armv8.neon_x1
[       OK ] armv8.neon_x1 (mean 62.639ms, confidence interval +- 0.055984%)
[ RUN      ] armv8.neon_x4
[       OK ] armv8.neon_x4 (mean 46.197ms, confidence interval +- 0.040853%)
[ RUN      ] openssl.openssl_one_at_time
[       OK ] openssl.openssl_one_at_time (mean 161.603ms, confidence interval +- 0.166258%)
[==========] 4 benchmarks ran.
[  PASSED  ] 4 benchmarks.
arnetheduck commented 1 year ago

It does not check at runtime for SHA instructions in tests, so tests will SIGILL on machines that do not support the SHA2 extensions (e.g. RPi4), unlike the amd64 counterpart, which would notify the user.

this small patch avoids the problematic parts when testing on raspi, fwiw:

diff --git a/src/bench.c b/src/bench.c
index 5548125..6058884 100644
--- a/src/bench.c
+++ b/src/bench.c
@@ -74,6 +74,8 @@ UBENCH_EX(armv8, neon_x4) {
     free(digest);
 }

+#ifdef __ARM_FEATURE_CRYPTO
+
 UBENCH_EX(armv8, crypto) {
     int * buffer  = (int *) malloc(buffer_size);
     unsigned char * digest = (unsigned char *) malloc(buffer_size/2);
@@ -87,6 +89,8 @@ UBENCH_EX(armv8, crypto) {
     free(digest);
 }

+#endif
+
 #endif
 #ifdef __x86_64__
 UBENCH_EX(sse, sse_x1_one_at_time) {
diff --git a/src/test.c b/src/test.c
index 66f803e..69439e4 100644
--- a/src/test.c
+++ b/src/test.c
@@ -622,7 +622,9 @@ TEST_LIST = {
     {"hash_armv8_neon_one_block", test_hash_armv8_neon_x1_one_block},
     {"hash_armv8_neon_multiple_blocks", test_hash_armv8_neon_x1_multiple_blocks},
     {"hash_armv8_neon_x4", test_armv8_neon_x4},
+#ifdef __ARM_FEATURE_CRYPTO
     {"hash_armv8_crypto_multiple_blocks", test_hash_armv8_crypto_multiple_blocks},
+#endif
 #endif
     {NULL, NULL}
 };
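
The patch above gates the crypto test and benchmark at compile time: `__ARM_FEATURE_CRYPTO` is set by the compiler's target flags, not by the CPU the binary ends up running on. A runtime check is a possible alternative. The following is only a minimal sketch, assuming Linux on AArch64; `cpu_has_sha2` and `pick_sha256_impl` are hypothetical helper names that are not part of hashtree, and other platforms (e.g. macOS) are not handled and simply report no support.

```c
#include <stddef.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>   /* getauxval, AT_HWCAP */
#include <asm/hwcap.h>  /* HWCAP_SHA2 */
#endif

/* Hypothetical helper (not hashtree API): returns 1 if the kernel reports
 * the SHA-256 instructions for this CPU, 0 otherwise. */
static int cpu_has_sha2(void) {
#if defined(__linux__) && defined(__aarch64__)
    return (getauxval(AT_HWCAP) & HWCAP_SHA2) != 0;
#else
    /* Assumption: on any other platform, conservatively report no support. */
    return 0;
#endif
}

/* Hypothetical helper: names the benchmark variant that is safe to run.
 * The strings mirror the benchmark names shown in this thread. */
static const char *pick_sha256_impl(void) {
    return cpu_has_sha2() ? "armv8.crypto" : "armv8.neon_x4";
}
```

A test runner could call `cpu_has_sha2()` before the crypto test and print a skip message instead of hitting SIGILL on boards like the RPi4.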
potuz commented 1 year ago

Adding benchmarks here, courtesy of Joe Clapis:

rock@proteus:~/hashtree$ HAVE_OPENSSL=1 make  bench
make -C src bench
make[1]: Entering directory '/home/rock/hashtree/src'
cc -g -fpic   -c -o sha256_armv8_neon_x4.o sha256_armv8_neon_x4.S
cc -g -fpic   -c -o sha256_armv8_neon_x1.o sha256_armv8_neon_x1.S
cc -g -fpic   -c -o sha256_armv8_crypto.o sha256_armv8_crypto.S
ar rcs libhashtree.a sha256_armv8_neon_x4.o sha256_armv8_neon_x1.o sha256_armv8_crypto.o 
cc -g -Wall -Werror -DHAVE_OPENSSL -L . -o bench bench.c -lhashtree -lm -lcrypto
make[1]: Leaving directory '/home/rock/hashtree/src'
rock@proteus:~/hashtree$ src/bench
[==========] Running 5 benchmarks.
[ RUN      ] armv8.neon_x1_one_at_time
[       OK ] armv8.neon_x1_one_at_time (mean 38.769ms, confidence interval +- 0.008991%)
[ RUN      ] armv8.neon_x1
[       OK ] armv8.neon_x1 (mean 38.197ms, confidence interval +- 0.016511%)
[ RUN      ] armv8.neon_x4
[       OK ] armv8.neon_x4 (mean 24.244ms, confidence interval +- 0.009266%)
[ RUN      ] armv8.crypto
[       OK ] armv8.crypto (mean 6.049ms, confidence interval +- 0.011850%)
[ RUN      ] openssl.openssl_one_at_time
[       OK ] openssl.openssl_one_at_time (mean 10.020ms, confidence interval +- 0.521858%)
[==========] 5 benchmarks ran.
[  PASSED  ] 5 benchmarks.

This means we're considerably beating OpenSSL (6.049ms vs 10.020ms, roughly 1.7x faster), with both using the crypto extensions.

jclapis commented 1 year ago

(For reference this was on a Rock 5B which has SHA2 support)