mit-plv / fiat-crypto

Cryptographic Primitive Code Generation by Fiat
http://adam.chlipala.net/papers/FiatCryptoSP19/FiatCryptoSP19.pdf
Other
720 stars 146 forks source link

Optimize & fix Edwards XYZT operations #1693

Closed bMacSwigg closed 1 year ago

bMacSwigg commented 1 year ago

This PR combines a few improvements to the EdwardsXYZT point operations, in both the Gallina & Bedrock versions.

Eliminates an addition by halving the precomputed coordinates

This PR switches to the following precomputed coordinate representation:

$\alpha=(y-x)/2, \beta=(y+x)/2, \gamma=dxy$

With this representation, we no longer need to compute D = 2*Z1 in the mixed addition function (https://hyperelliptic.org/EFD/g1p/data/twisted/extended-1/addition/madd-2008-hwcd-3)

This optimization is sourced from Section 3.2 of https://www.shiftleft.org/papers/fff/fff.pdf

Sometimes eliminates a multiplication by storing the components of T individually

Quoting from Section 3.2 of https://www.shiftleft.org/papers/fff/fff.pdf:

The last step of a doubling or addition with extended coordinates [19] amounts to computing T = T1 * T2. We skip that step and leave T1 and T2 in memory, saving a multiply. The T-coordinate is required for additions, so we compute it at the cost of a multiply when beginning an addition.

In this code, I call the components Ta and Tb instead, since we already use T1 and T2 (or X1 and X2, etc) to refer to the T coordinates of the two inputs to addition.

Fixes a bug in the calculation of T in m1add_precomputed_coordinates

Previously, we calculated `T = (A-B)(A+B) by:

X3 := A-B
Y3 := A+B
...
T3 := X3*Y3

However, we used the X3 and Y3 variables for both these intermediate values and the output values. So before the line T3 := X3*Y3, we were reassigning them:

X3 := X3*T3
Y3 := Y3*Z3

Which led to us getting the wrong result for T. This wasn't caught by the proof, because m1add_precomputed_coordinates doesn't use the point sigma type, and its spec didn't specify anything about T in the output.

This PR fixes this by using new different intermediate & output variables, rather than reusing X3 and Y3.

Note that this bug is not present in BoringSSL, because it has a separate output struct r rather than reusing X3 and Y3: https://boringssl.googlesource.com/boringssl/+/HEAD/third_party/fiat/curve25519_64_adx.h#599