Evaluate: Benchmarks to compare with/without this proposal

WebAssembly / wide-arithmetic

WebAssembly proposal for wide arithmetic

https://webassembly.github.io/wide-arithmetic/


Evaluate: Benchmarks to compare with/without this proposal #3

Open alexcrichton opened 3 months ago

alexcrichton commented 3 months ago

Original development of this proposal benchmarked the blind-sig benchmark in Sightglass as well as the fibonacci benchmarks from the Rust num-bigint repository.

This issue is intended to serve as a place for others to drop interesting benchmark programs so they can be collected to help evaluate this proposal over time. If you've got a benchmark you'd like to see added, it would ideally be written in C or Rust at this time, and ideally be a program that has a means of self-reporting its execution time. High-level ideas are OK too, but would require some more work to turn into a reproducible benchmark.
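For anyone unsure what "self-reporting its execution time" could look like, here is a minimal Rust sketch. The workload, the helper names (workload, time_it) and the iteration count are all hypothetical placeholders, not anything taken from Sightglass or num-bigint. It also assumes a target where std::time::Instant works, such as native or a WASI target.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical placeholder workload: a 128-bit multiply-accumulate loop.
/// Swap this for the code you actually want to measure.
fn workload(iters: u64) -> u128 {
    let mut acc: u128 = 1;
    for i in 0..iters {
        acc = acc
            .wrapping_mul(0x9E37_79B9_7F4A_7C15)
            .wrapping_add(i as u128);
    }
    acc
}

/// Runs `f` once and returns its result together with the elapsed wall time.
fn time_it<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}

fn main() {
    let iters: u64 = 10_000_000;
    // black_box keeps the compiler from constant-folding the whole loop away.
    let (result, elapsed) = time_it(|| workload(black_box(iters)));
    // Printing the result keeps the computation observable.
    println!("result = {result:#x}");
    println!("elapsed = {elapsed:?} ({iters} iterations)");
}
```

Running the same source natively and as wasm then gives two directly comparable numbers.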

alexcrichton commented 3 months ago

One area where benchmarks would be particularly interesting is programs that require good performance from overflowing/saturating/checked arithmetic unrelated to 128-bit integers. This would help stress-test the need for either 128-bit operations or overflow-flag-returning instructions.
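As a rough illustration of the kind of code meant here, the following Rust sketch uses the standard checked, saturating and overflowing integer methods on 64-bit values. The function names are hypothetical, and this only shows the shape of the operations; it is not a proposed benchmark.

```rust
/// Sums a slice, returning None as soon as an addition overflows.
fn checked_sum(xs: &[i64]) -> Option<i64> {
    xs.iter().try_fold(0i64, |acc, &x| acc.checked_add(x))
}

/// Sums a slice, clamping each step to the i64 range instead of wrapping.
fn saturating_sum(xs: &[i64]) -> i64 {
    xs.iter().fold(0i64, |acc, &x| acc.saturating_add(x))
}

/// Sums a slice with wrapping semantics and counts how many additions
/// overflowed. `overflowing_add` returns (value, flag), which is the
/// "overflow flag" style of operation.
fn wrapping_sum_with_carries(xs: &[u64]) -> (u64, u32) {
    let mut sum = 0u64;
    let mut carries = 0u32;
    for &x in xs {
        let (s, carried) = sum.overflowing_add(x);
        sum = s;
        carries += carried as u32;
    }
    (sum, carries)
}

fn main() {
    println!("{:?}", checked_sum(&[1, 2, 3]));
    println!("{:?}", checked_sum(&[i64::MAX, 1]));
    println!("{}", saturating_sum(&[i64::MAX, 1]));
    println!("{:?}", wrapping_sum_with_carries(&[u64::MAX, 2]));
}
```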

alexcrichton commented 3 months ago

One suggestion here is that -ftrapv can inject checked arithmetic into C code, and UBSan might rely on this heavily. However, a naive benchmark run without this proposal didn't show much of a performance difference relative to native.

marcusdarmstrong commented 5 days ago

Just found out about this proposal today, but I've got a strong benchmark candidate here if you're still looking: XXH3 is critically reliant on a widening u64 multiply for small input sizes, and so our existing hand-written WASM implementation performs worse than the older, non-vectorized XXH64 algorithm.

CryZe commented 5 days ago

Yeah, it's not just XXH3: rustc-hash, foldhash, aHash, wyhash, rapidhash, MUM Hash, umash and many more all rely on 128-bit widening multiplication. That's basically all of the fastest non-failing hashing algorithms in the SMHasher benchmarks that don't use AES.
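For readers unfamiliar with the pattern, here is a minimal Rust sketch of a widening multiply followed by a "fold" of the two halves, which is the general shape of the mixing step several of these hashes use. The function names are hypothetical and the code is not copied from any of the libraries listed; each of them has its own constants and its own way of combining the halves.

```rust
/// 64x64 -> 128-bit unsigned widening multiply, returned as (low, high).
fn mul_wide_u(a: u64, b: u64) -> (u64, u64) {
    let p = (a as u128) * (b as u128);
    (p as u64, (p >> 64) as u64)
}

/// Folds the 128-bit product back to 64 bits by XOR-ing its two halves.
/// XOR is one common choice; some hashes add the halves instead.
fn folded_multiply(a: u64, b: u64) -> u64 {
    let (lo, hi) = mul_wide_u(a, b);
    lo ^ hi
}

fn main() {
    let (lo, hi) = mul_wide_u(u64::MAX, u64::MAX);
    println!("lo = {lo:#x}, hi = {hi:#x}");
    println!("folded = {:#x}", folded_multiply(u64::MAX, u64::MAX));
}
```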

alexcrichton commented 5 days ago

Thanks @marcusdarmstrong and @CryZe! It'll be a bit easier to test and confirm in a few months once rustc and LLVM both support wide-arithmetic, but i64.mul_wide_u should be perfect for 128-bit widening multiplication. Previous benchmarks have all shown that the wasm instructions are suitable for matching native performance in these situations.
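To give a sense of what a single widening-multiply instruction stands in for, here is a Rust sketch of the same 64x64 -> 128-bit product built only from 64-bit operations, using the schoolbook method on 32-bit halves. The function names are hypothetical, and this is not a claim about the exact code LLVM or any engine emits today; it is just one portable way to get the result, checked against Rust's u128.

```rust
/// Reference result using Rust's native u128.
fn mul_wide_u_ref(a: u64, b: u64) -> (u64, u64) {
    let p = (a as u128) * (b as u128);
    (p as u64, (p >> 64) as u64)
}

/// Same product from four 32x32 -> 64-bit multiplies plus carry handling.
fn mul_wide_u_limbs(a: u64, b: u64) -> (u64, u64) {
    let mask = 0xFFFF_FFFFu64;
    let (a_lo, a_hi) = (a & mask, a >> 32);
    let (b_lo, b_hi) = (b & mask, b >> 32);

    // Partial products; each fits in 64 bits.
    let ll = a_lo * b_lo;
    let lh = a_lo * b_hi;
    let hl = a_hi * b_lo;
    let hh = a_hi * b_hi;

    // Middle column (bits 32..63 plus carry). The sum of three values
    // below 2^32 cannot overflow a u64.
    let mid = (ll >> 32) + (lh & mask) + (hl & mask);

    let lo = (ll & mask) | (mid << 32);
    let hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
    (lo, hi)
}

fn main() {
    let samples = [0u64, 1, 0xFFFF_FFFF, 1 << 32, 0x9E37_79B9_7F4A_7C15, u64::MAX];
    for &a in &samples {
        for &b in &samples {
            assert_eq!(mul_wide_u_limbs(a, b), mul_wide_u_ref(a, b));
        }
    }
    println!("limb-based multiply matches the u128 reference");
}
```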