More performant computation of the multiplicative inverse mod 2^n

The Inverse method of the Modulus class can be optimized to become more performant. The current implementation uses the Euler's theorem, requires O(log^3(M)) time and O(log(M)) space, where the modulus M = 2^B: https://github.com/kroma-network/tachyon/blob/bbbabd04f7ccd1c32eb7b80884777be806d61068/tachyon/math/finite_fields/modulus.h#L69

There are at least 2 methods, which can be used instead:

Inversion by means of the Extended Euclidean Algorithm. It is a general purpose method for an arbitrary modulus. An example for small numbers with explanation and proof can be found here: https://github.com/taikoxyz/research/blob/main/algebra/Proof%20and%20Implementation%20of%20Euclidean%20Inversion.rs
The Hurchalla's method for computing the multiplicative inverse modulo a power of two. The original Hurchalla's paper contains the explanation, proof and implementation in C for small numbers: https://arxiv.org/pdf/2204.04342 For the ones interested in theory I can give more straightforward non-inductive proof.

For large numbers (arbitrary-precision arithmetic) both methods would require O(log^2(M)) time and O(log(M)) space (for the first method it is a well-known result, for the Hurchalla's one I can give my proof sketch). However, the Hurchalla's method is more convenient to implement in this case, and I hypothesize that it can be a bit more time efficient.

PS The Hurchalla's method is extremely efficient for small numbers (no arbitrary-precision arithmetic): requires O(log log M) time, while Euler's/Carmichael theorem-based and Euclidean-based methods require O(log M) time. The space complexity for all these methods in the case of small numbers is O(1).

kroma-network / tachyon

More performant computation of the multiplicative inverse mod 2^n #455