max-te closed 3 weeks ago
I also ended up porting the SHA-256 algorithm from https://github.com/aws-samples/sha2-with-c-intrinsic/blob/master/src/sha256_compress_x86_64_avx.c and updated this PR. Here are updated benchmarks with simd:
test sha256_10 ... bench: 22 ns/iter (+/- 0) = 454 MB/s
test sha256_100 ... bench: 215 ns/iter (+/- 2) = 465 MB/s
test sha256_1000 ... bench: 1,959 ns/iter (+/- 8) = 510 MB/s
test sha256_10000 ... bench: 19,401 ns/iter (+/- 22) = 515 MB/s
test sha512_10 ... bench: 17 ns/iter (+/- 0) = 588 MB/s
test sha512_100 ... bench: 164 ns/iter (+/- 0) = 609 MB/s
test sha512_1000 ... bench: 1,476 ns/iter (+/- 2) = 677 MB/s
test sha512_10000 ... bench: 14,513 ns/iter (+/- 18) = 689 MB/s
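For readers checking the throughput column: the MB/s figures above are consistent with dividing the input size by the time per iteration and truncating, with 1 MB taken as 10^6 bytes. A minimal sketch of that arithmetic follows; the helper name `throughput_mb_per_s` is hypothetical, and the input sizes are assumed from the benchmark names.

```rust
/// Throughput in MB/s (1 MB = 10^6 bytes) for `bytes` processed in `nanos` ns.
/// Hypothetical helper for illustration only.
fn throughput_mb_per_s(bytes: u64, nanos: u64) -> u64 {
    // bytes/ns * 10^9 ns/s / 10^6 bytes/MB = bytes * 1000 / ns (integer division truncates)
    bytes * 1000 / nanos
}

fn main() {
    // (benchmark name, input size in bytes, ns/iter) copied from the results above.
    let results = [
        ("sha256_10", 10, 22),
        ("sha256_10000", 10_000, 19_401),
        ("sha512_100", 100, 164),
        ("sha512_10000", 10_000, 14_513),
    ];
    for (name, bytes, nanos) in results {
        println!("{name}: {} MB/s", throughput_mb_per_s(bytes, nanos));
    }
}
```

Because the division truncates, sha512_100 comes out as 609 MB/s rather than 610.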
What's the status of this?
This is awaiting review.
@newpavlov Do you mind taking a look at this?
What are the advantages of the explicit SIMD backend in the SHA-256 case? It has the same performance as the soft backend.
Did you look at the second comment in this PR? The initial version of this PR did not use a SIMD algorithm for SHA-256, but that changed the next day, as the second comment indicates.
Unless, of course, you did some more benchmarking and it's indeed no longer faster.
Ah, I indeed missed the second comment. I think it's worth updating the OP, since its text will be included in the merge commit message.
@max-te
I think we are good to merge. But could you measure the performance of the software backend with the simd128 target feature enabled on the same hardware? You can do it with this command:
RUSTFLAGS='--cfg sha2_backend="soft" -C target-feature=+simd128' cargo +nightly bench --target wasm32-wasi
I would like to add these results to the merge commit message.
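For context on why the flags in that command matter: `-C target-feature=+simd128` is applied at compile time, so the choice of code path is fixed when the crate is built. The following is a minimal sketch of that kind of compile-time selection; `backend_name` and the returned labels are hypothetical and are not the sha2 crate's actual internals.

```rust
/// Reports which code path this build would take.
/// Hypothetical helper: the name and labels are illustrative only.
fn backend_name() -> &'static str {
    // cfg! is evaluated at compile time, so there is no runtime CPU check here.
    if cfg!(all(target_arch = "wasm32", target_feature = "simd128")) {
        "wasm32-simd128"
    } else {
        "soft"
    }
}

fn main() {
    println!("selected backend: {}", backend_name());
}
```

Built for a non-wasm target, or for wasm32 without the simd128 feature, this sketch takes the "soft" branch.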
Sure, I added that benchmark to the PR description and updated the other ones.
Thank you!
This PR ports the AVX implementation of SHA-512 to simd128. It also implements the related version of SHA-256 from https://github.com/aws-samples/sha2-with-c-intrinsic/blob/master/src/sha256_compress_x86_64_avx.c in simd128, and adds wasm32 testing in CI using wasmtime. Since wasm does not have feature detection, this backend is only used if the -C target-feature=+simd128 flag is set.
Benchmarks on AMD Ryzen 9 7950X3D, running with wasmtime 26.0.0 (c92317bcc 2024-10-22) on rustc 1.84.0-nightly (b3f75cc87 2024-11-02):