RustCrypto / crypto-bigint

Cryptography-oriented big integer library with constant-time, stack-allocated (no_std-friendly) implementations of modern formulas
Apache License 2.0
167 stars 45 forks source link

Faster vartime division for Uint #608

Open andrewwhitehead opened 2 weeks ago

andrewwhitehead commented 2 weeks ago

Implements faster vartime division (vartime with the divisor only) for Uint based on Knuth's TAOCP volume 2, as outlined at https://janmr.com/blog/2014/04/basic-multiple-precision-long-division/

This does not address vartime division for BoxedUint or the other TODOs in https://github.com/RustCrypto/crypto-bigint/pull/511

Relevant benchmarks:

wrapping ops/div/rem_vartime, U256/U128, full size
                        time:   [84.091 ns 84.290 ns 84.525 ns]
                        change: [-85.312% -85.155% -84.896%] (p = 0.00 < 0.05)
                        Performance has improved.
wrapping ops/rem_vartime, U256/U128, full size
                        time:   [84.129 ns 84.474 ns 84.890 ns]
                        change: [-84.208% -84.057% -83.898%] (p = 0.00 < 0.05)
                        Performance has improved.
wrapping ops/rem_wide_vartime, U256
                        time:   [166.95 ns 167.27 ns 167.69 ns]
                        change: [-94.750% -94.705% -94.636%] (p = 0.00 < 0.05)
                        Performance has improved.
Dynamic Montgomery arithmetic/MontyParams::new_vartime
                        time:   [459.28 ns 460.48 ns 461.88 ns]
                        change: [-80.512% -80.452% -80.390%] (p = 0.00 < 0.05)
                        Performance has improved.
andrewwhitehead commented 2 weeks ago

It looks like Uint::mul_mod would also be 90% faster using this update and split_mut/rem_wide_vartime, and could accept an even modulus. The current implementation uses MontyParams::new_vartime so it seems like it is vartime for the modulus, but maybe that should be renamed mul_mod_vartime anyway? Accepting a &NonZero<Uint> for the modulus would also simplify things.