The REDC() implementation can be optimized by combining three overflow checks.
First of all the add(tHi, sub(0, n)) is the same as sub(tHi, n).
The two overflows marked by the single variable tHiOverflowed cannot happen simultaneously. This is a property of multi-precision arithmetic: if the first overflow happens the tHi is at most 2^256-2 and therefore the addition of 1 never overflows. These two checks can be combined by or.
The check tHi >= n (i.e. iszero(lt(tHi, n))) can also be combined with the previous two by or. This is because "t is less than 2N, and because it's an integer, this puts t in the range [0, 2N − 1]. Therefore, reducing t into the desired range requires a single subtraction, so the algorithm's output lies in the correct range." https://en.wikipedia.org/wiki/Montgomery_modular_multiplication#The_REDC_algorithm
Recommendation:
Apply the following code modification to REDC(). In the p256verify_valid_signature_one test this reduced gas usage by ~12% from 1610051 to 1424273.
diff --git a/precompiles/P256VERIFY.yul b/precompiles/P256VERIFY.yul
index 4efaf8d..fb714b3 100644
--- a/precompiles/P256VERIFY.yul
+++ b/precompiles/P256VERIFY.yul
@@ -253,22 +253,15 @@ object "P256VERIFY" {
/// @return S The result of the Montgomery reduction.
function REDC(TLo, THi, n, nPrime) -> S {
let m := mul(TLo, nPrime)
- let tHi, tHiOverflowed := overflowingAdd(THi, getHighestHalfOfMultiplication(m, n))
+ let tHi, tHiOverflowed1 := overflowingAdd(THi, getHighestHalfOfMultiplication(m, n))
let aLo, aLoOverflowed := overflowingAdd(TLo, mul(m, n))
- if tHiOverflowed {
- // TODO: Check if this addition could overflow.
- tHi := add(tHi, sub(0, n))
- }
+ let tHiOverflowed2 := 0
if aLoOverflowed {
- tHi, tHiOverflowed := overflowingAdd(tHi, 1)
- }
- if tHiOverflowed {
- // TODO: Check if this addition could overflow.
- tHi := add(tHi, sub(0, n))
+ tHi, tHiOverflowed2 := overflowingAdd(tHi, 1)
}
- S := tHi
- if iszero(lt(tHi, n)) {
+ S := tHi
+ if or(or(tHiOverflowed1, tHiOverflowed2), iszero(lt(tHi, n))) {
S := sub(tHi, n)
}
}
Context: P256VERIFY.yul#L235
Description:
The
REDC()
implementation can be optimized by combining three overflow checks.add(tHi, sub(0, n))
is the same assub(tHi, n)
.tHiOverflowed
cannot happen simultaneously. This is a property of multi-precision arithmetic: if the first overflow happens thetHi
is at most2^256-2
and therefore the addition of 1 never overflows. These two checks can be combined by or.tHi >= n (i.e. iszero(lt(tHi, n)))
can also be combined with the previous two byor
. This is because "t is less than 2N, and because it's an integer, this puts t in the range [0, 2N − 1]. Therefore, reducing t into the desired range requires a single subtraction, so the algorithm's output lies in the correct range." https://en.wikipedia.org/wiki/Montgomery_modular_multiplication#The_REDC_algorithmRecommendation:
Apply the following code modification to
REDC()
. In thep256verify_valid_signature_one
test this reduced gas usage by ~12% from 1610051 to 1424273.