data61 / cuda-fixnum

Extended-precision modular arithmetic library that targets CUDA.

Understand why CLNW sliding-window is faster than k-ary in the tests #43

Open unzvfu opened 6 years ago

unzvfu commented 6 years ago

In the test suite, modexp (CLNW sliding window) appears to be faster than multi_modexp (k-ary), at least in the 128- and 256-byte range. This is unexpected: CLNW branches on the bit pattern of the exponent, whereas k-ary does not.

Work out what's going on. Replace k-ary if necessary.
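
For reference, the difference between the two strategies can be sketched on plain 64-bit integers. This is a minimal host-side C++ illustration, not the library's fixnum code: the names `mulmod`, `modexp_kary` and `modexp_sliding` are hypothetical, `unsigned __int128` is a GCC/Clang extension, and the sliding-window variant is a generic left-to-right one with an odd-power table, which is assumed here to be close in spirit to CLNW.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Modular multiply through a 128-bit intermediate (GCC/Clang extension).
static std::uint64_t mulmod(std::uint64_t a, std::uint64_t b, std::uint64_t m) {
    return static_cast<std::uint64_t>((unsigned __int128)a * b % m);
}

// Fixed-window (k-ary): the exponent is consumed in aligned w-bit digits.
// Every digit costs w squarings plus one table multiplication, so the
// control flow does not depend on the exponent's bits.
std::uint64_t modexp_kary(std::uint64_t base, std::uint64_t exp,
                          std::uint64_t m, int w) {
    std::vector<std::uint64_t> table(std::size_t(1) << w);  // table[i] = base^i mod m
    table[0] = 1 % m;
    for (std::size_t i = 1; i < table.size(); ++i)
        table[i] = mulmod(table[i - 1], base, m);

    const int ndigits = (64 + w - 1) / w;
    std::uint64_t acc = 1 % m;
    for (int d = ndigits - 1; d >= 0; --d) {
        for (int s = 0; s < w; ++s)
            acc = mulmod(acc, acc, m);
        std::uint64_t digit = (exp >> (d * w)) & ((1ull << w) - 1);
        acc = mulmod(acc, table[digit], m);  // multiply even when digit == 0
    }
    return acc;
}

// Sliding window: zero bits cost a squaring only; a window always starts
// and ends on a 1 bit, so only odd powers are tabulated. The branches
// below depend on the exponent's bit pattern.
std::uint64_t modexp_sliding(std::uint64_t base, std::uint64_t exp,
                             std::uint64_t m, int w) {
    std::vector<std::uint64_t> odd(std::size_t(1) << (w - 1));  // odd[i] = base^(2i+1) mod m
    const std::uint64_t base_sq = mulmod(base, base, m);
    odd[0] = base % m;
    for (std::size_t i = 1; i < odd.size(); ++i)
        odd[i] = mulmod(odd[i - 1], base_sq, m);

    std::uint64_t acc = 1 % m;
    int i = 63;
    while (i >= 0) {
        if (((exp >> i) & 1) == 0) {          // zero bit: square and move on
            acc = mulmod(acc, acc, m);
            --i;
            continue;
        }
        int lo = (i - w + 1 < 0) ? 0 : i - w + 1;
        while (((exp >> lo) & 1) == 0) ++lo;  // shrink window to end on a 1 bit
        const int len = i - lo + 1;
        const std::uint64_t win = (exp >> lo) & ((1ull << len) - 1);
        for (int s = 0; s < len; ++s)
            acc = mulmod(acc, acc, m);
        acc = mulmod(acc, odd[win >> 1], m);  // win is odd, so win >> 1 indexes odd[]
        i = lo - 1;
    }
    return acc;
}
```

Both functions compute the same result for `1 <= w <= 16` or so. The sliding-window version typically does fewer multiplications and needs half the table, which may account for some of the observed difference, but whether that outweighs the cost of divergent branches on the GPU is exactly what this issue needs to establish.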

unzvfu commented 4 years ago

Follow up at https://github.com/unzvfu/cuda-fixnum/issues/25.