Closed PoslavskySV closed 7 years ago
Great question. Is the concern overflow in the intermediate calculation a*b
, or is it achieving better performance? (Or both?). Also, is it a typo that the return type is unsigned but parameters are unsigned, or is that part of it?
Yes, both better performance and overflow in the intermediate calculation are important. I've fixed the typo in the question -- the return type is also unsigned. (btw, as far as I understand, if the performance is not important one can use libdivide_128_div_64_to_64
method). Thanks!
You may be interested in my related SO question. Do we know anything about denom
here - can we place an upper or lower bound, is it prime, etc?
We know that denom
is prime and 2 <= count_leading_zeros64(denom) <= 62
, can this help?
With the modulus prime and used repeatedly, you might be able to apply Montgomery reduction. That's the best I can think of.
I also came up with this; just hoped that it is possible to reuse libdivide code. Thanks!
Is there a simple way to implement modular multiplication of two 64-bit integers modulus other 64-bit integer with preconditioning? I mean the method like:
which returns
(a * b) % denom
.