Open max-te opened 4 months ago
I also ended up porting the SHA-256 algorithm from https://github.com/aws-samples/sha2-with-c-intrinsic/blob/master/src/sha256_compress_x86_64_avx.c and updated this PR. Here are updated benchmarks with simd:
test sha256_10 ... bench: 22 ns/iter (+/- 0) = 454 MB/s
test sha256_100 ... bench: 215 ns/iter (+/- 2) = 465 MB/s
test sha256_1000 ... bench: 1,959 ns/iter (+/- 8) = 510 MB/s
test sha256_10000 ... bench: 19,401 ns/iter (+/- 22) = 515 MB/s
test sha512_10 ... bench: 17 ns/iter (+/- 0) = 588 MB/s
test sha512_100 ... bench: 164 ns/iter (+/- 0) = 609 MB/s
test sha512_1000 ... bench: 1,476 ns/iter (+/- 2) = 677 MB/s
test sha512_10000 ... bench: 14,513 ns/iter (+/- 18) = 689 MB/s
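The MB/s column can be cross-checked against the ns/iter column. The sketch below assumes the numeric suffix in each benchmark name is the input size in bytes per iteration and that the figure is truncated rather than rounded; both assumptions are inferred from the numbers above, not stated in the PR. The function name `throughput_mb_s` is hypothetical.

```rust
/// Throughput in MB/s (1 MB = 10^6 bytes), truncated to an integer.
///
/// One byte per nanosecond equals 10^9 bytes/s, i.e. 1000 MB/s,
/// so MB/s = bytes * 1000 / ns.
///
/// Hypothetical helper for illustration; assumes `ns_per_iter` is non-zero.
pub fn throughput_mb_s(bytes_per_iter: u64, ns_per_iter: u64) -> u64 {
    bytes_per_iter * 1000 / ns_per_iter
}
```

Under those assumptions, `throughput_mb_s(10_000, 19_401)` reproduces the 515 MB/s reported for `sha256_10000`, and `throughput_mb_s(10_000, 14_513)` reproduces the 689 MB/s reported for `sha512_10000`.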
This PR ports the AVX implementation of SHA-512 to simd128 and adds wasm32 testing in CI using wasmtime. Since wasm does not have feature detection, this backend is only used if the `-C target-feature=+simd128` flag is set.

Benchmarks on AMD Ryzen 9 7950X3D, running with wasmtime, with simd:
without simd:
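Because the backend is chosen by a compiler flag rather than at run time, the selection has to happen through conditional compilation. The following is a minimal sketch of that pattern, not the code in this PR: the module name `backend`, the constant `NAME`, and the function `backend_name` are hypothetical, and the SIMD module is left as a stub where a real implementation would use the `core::arch::wasm32` intrinsics.

```rust
// Compiled only when targeting wasm32 with `-C target-feature=+simd128`.
#[cfg(all(target_arch = "wasm32", target_feature = "simd128"))]
mod backend {
    // Hypothetical marker; a real backend would hold the simd128
    // compression function here.
    pub const NAME: &str = "wasm32-simd128";
}

// Fallback for every other target, including wasm32 without simd128.
#[cfg(not(all(target_arch = "wasm32", target_feature = "simd128")))]
mod backend {
    // Hypothetical marker for the portable (scalar) implementation.
    pub const NAME: &str = "portable";
}

/// Reports which backend was selected at compile time.
pub fn backend_name() -> &'static str {
    backend::NAME
}
```

With this layout exactly one `backend` module exists in any given build, so a binary built without the flag contains no simd128 instructions at all, and a build such as `RUSTFLAGS="-C target-feature=+simd128" cargo build --target wasm32-wasip1` (the target name here is an assumption) would pick the SIMD module.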