Nothing is wrong with your benchmark code; P256.pointAdd is simply slow.
Contributions to improve it are welcome, including benchmark code showing the issue (though it can be simplified with throwCryptoError).
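As a rough illustration of what a simplified benchmark might look like, here is a minimal sketch using criterion. It assumes the cryptonite functions P256.scalarFromInteger, P256.toPoint, P256.pointAdd and P256.pointMul and Crypto.Error.throwCryptoError; the scalar values and the name p256Benchmarks are hypothetical, chosen only for this example, and this is not the code from the pull request.

```haskell
import Criterion.Main (Benchmark, bench, bgroup, whnf)
import Crypto.Error (throwCryptoError)
import qualified Crypto.PubKey.ECC.P256 as P256

-- Hypothetical benchmark group comparing point addition and scalar
-- multiplication on P-256. The scalars below are arbitrary test values.
p256Benchmarks :: [Benchmark]
p256Benchmarks =
  [ bgroup "P256"
      [ bench "pointAdd" $ whnf (P256.pointAdd p) q
      , bench "pointMul" $ whnf (P256.pointMul s) p
      ]
  ]
  where
    -- throwCryptoError unwraps CryptoFailable, so no manual pattern
    -- matching on the success and failure cases is needed here.
    s = throwCryptoError (P256.scalarFromInteger 0x1234567890abcdef)
    t = throwCryptoError (P256.scalarFromInteger 0xfedcba0987654321)
    p = P256.toPoint s
    q = P256.toPoint t
```

The list can be passed to criterion's defaultMain from a main function. Using throwCryptoError is acceptable in a benchmark because the inputs are fixed constants known to be valid scalars.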
P256 benchmark code added in Pull Request 288.
Regarding execution time, the current P256.pointAdd is especially slow because it uses scalar multiplication calls to convert both inputs to projective coordinates. Instead, the Z coordinate could just be set to kOne.
And the call to point_add_or_double_vartime could be changed to a function specialized for Z=1 like http://www.hyperelliptic.org/EFD/g1p/auto-shortw-jacobian-3.html#addition-mmadd-2007-bl. This will remove some more field operations.
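To make the Z=1 idea concrete, here is a minimal sketch of the mmadd-2007-bl formulas from the linked page, written with plain Integer arithmetic modulo the P-256 prime. It is only an illustration of the formulas, not the C code used by the library: the names mmadd, toAffine and affineAdd are made up for this example, nothing here is constant time, and the inputs are assumed to be two affine points with different x coordinates (no point at infinity, no doubling).

```haskell
-- The P-256 field prime, p = 2^256 - 2^224 + 2^192 + 2^96 - 1.
prime :: Integer
prime = 2 ^ (256 :: Int) - 2 ^ (224 :: Int) + 2 ^ (192 :: Int) + 2 ^ (96 :: Int) - 1

type Affine   = (Integer, Integer)
type Jacobian = (Integer, Integer, Integer)

-- Modular exponentiation by repeated squaring.
powMod :: Integer -> Integer -> Integer -> Integer
powMod _ 0 m = 1 `mod` m
powMod b e m
  | even e    = half * half `mod` m
  | otherwise = b * powMod b (e - 1) m `mod` m
  where half = powMod b (e `div` 2) m

-- Field inversion via Fermat's little theorem (illustration only).
inv :: Integer -> Integer
inv a = powMod (a `mod` prime) (prime - 2) prime

-- mmadd-2007-bl: add two points that both have Z = 1.
-- Cost is 4 multiplications and 2 squarings; the result is Jacobian.
mmadd :: Affine -> Affine -> Jacobian
mmadd (x1, y1) (x2, y2) = (x3, y3, z3)
  where
    h  = (x2 - x1) `mod` prime
    hh = h * h `mod` prime
    i  = 4 * hh `mod` prime
    j  = h * i `mod` prime
    r  = 2 * (y2 - y1) `mod` prime
    v  = x1 * i `mod` prime
    x3 = (r * r - j - 2 * v) `mod` prime
    y3 = (r * (v - x3) - 2 * y1 * j) `mod` prime
    z3 = 2 * h `mod` prime

-- Convert Jacobian (X, Y, Z) back to affine (X / Z^2, Y / Z^3).
toAffine :: Jacobian -> Affine
toAffine (x, y, z) = (x * zi2 `mod` prime, y * zi2 * zi `mod` prime)
  where
    zi  = inv z
    zi2 = zi * zi `mod` prime

-- Reference: textbook affine chord addition, for comparison.
affineAdd :: Affine -> Affine -> Affine
affineAdd (x1, y1) (x2, y2) = (x3, y3)
  where
    lam = (y2 - y1) * inv (x2 - x1) `mod` prime
    x3  = (lam * lam - x1 - x2) `mod` prime
    y3  = (lam * (x1 - x3) - y1) `mod` prime
```

For two inputs p1 and p2 with different x coordinates, toAffine (mmadd p1 p2) should agree with affineAdd p1 p2. The saving comes from never multiplying by Z1 or Z2, and from needing no scalar multiplication to build the inputs.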
In the end I don't think this can beat ECC performance because P256 is only 32 bits. But performance will be much better than today.
@ocheron Thanks so much for your link. It's great. And I've removed the warnings following your advice in Pull Request 288.
Fixed in #291: P256.pointAdd does not use scalar multiplication anymore and is faster than P256.pointMul.
P256/pointAddTwoMuls-P256 mean 898.9 μs ( +- 865.8 ns )
P256/pointAdd-P256 mean 27.98 μs ( +- 45.26 ns )
P256/pointMul-P256 mean 435.6 μs ( +- 744.6 ns )
P256.pointAdd is still slower than ECC.pointAdd, mostly due to doing field inversion in constant time (whereas ECC.pointAdd uses variable-time Euclid).
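The difference between the two inversion strategies can be sketched as follows. This is an illustration only, with hypothetical names invFermat and invEuclid; it is not the code used by either module, and Haskell Integer arithmetic is not itself constant time. The point is structural: the Fermat version performs a sequence of squarings and multiplications fixed by the exponent p - 2, while the number of Euclid steps depends on the input value.

```haskell
-- The P-256 field prime, p = 2^256 - 2^224 + 2^192 + 2^96 - 1.
prime :: Integer
prime = 2 ^ (256 :: Int) - 2 ^ (224 :: Int) + 2 ^ (192 :: Int) + 2 ^ (96 :: Int) - 1

-- Modular exponentiation by repeated squaring. The sequence of
-- operations depends only on the exponent, not on the base.
powMod :: Integer -> Integer -> Integer -> Integer
powMod _ 0 m = 1 `mod` m
powMod b e m
  | even e    = half * half `mod` m
  | otherwise = b * powMod b (e - 1) m `mod` m
  where half = powMod b (e `div` 2) m

-- Inversion via Fermat's little theorem: a^(p-2) mod p.
-- Fixed operation pattern, but many field multiplications.
invFermat :: Integer -> Integer
invFermat a = powMod (a `mod` prime) (prime - 2) prime

-- Inversion via the extended Euclidean algorithm. Returns the inverse
-- together with the number of loop iterations, which varies with the input.
invEuclid :: Integer -> (Integer, Int)
invEuclid a = go (a `mod` prime) prime 1 0 0
  where
    -- Invariant: s0 * a == r0 and s1 * a == r1 (mod prime).
    go :: Integer -> Integer -> Integer -> Integer -> Int -> (Integer, Int)
    go r0 r1 s0 s1 n
      | r1 == 0   = (s0 `mod` prime, n)
      | otherwise =
          let q = r0 `div` r1
          in go r1 (r0 - q * r1) s1 (s0 - q * s1) (n + 1)
```

Both functions return the same inverse for a nonzero input, but comparing snd (invEuclid a) for different values of a shows that the Euclid step count is data dependent, which is what makes it variable time.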
I'm benchmarking the pointAdd function in P256.hs and Ecc.hs. The results show that pointAdd in P256 (1.131 ms) is much less efficient than in Ecc (3.961 μs), while pointMul in P256 is more efficient than in Ecc. Is there anything wrong with my bench code? The bench result is: