Open arcfide opened 9 years ago
This will require algorithms for the gcd, modular reduction, and modular inverses, all implemented in uniform time. There are several possible algorithms for each, but many of them have branching which is not acceptable. Below are the algorithms that seem most likely to be written in uniform time:
(1) Extended binary gcd of positive integers (2) Montgomery reduction of multiprecision integers (3) Barret method for modular reduction (4) Plus-minus inversion method (basically an adaptation of (1)) (5) Montgomery inverse (closely connected with (2) and the Montgomery Ladder)
Implementing the Chinese Remainder Theorem at this point will also improve efficiency, since reduction mod p-1 and mod q-1 will be faster than reduction mod (p-1)(q-1).
As mentioned in the issue for Steps 1 and 2, base conversion is slow. We will attempt to avoid base conversions entirely, by working with bit vectors. Both Montgomery and Barret reduction for binary arrays seem reasonable to implement in APL. In particular for Barret reduction most of the complicated looking divide and take the floor operations are simply drops. But since Barret reduction requires a separate calculation of 1/N using Newton's method, it seems likely that Montgomery reduction will be significantly easier.