Closed scrambler-crypto closed 1 year ago
Hey @scrambler-crypto, The formula execution in pyecsca was not really optimized for speed and is currently kind of a bottleneck. This is due to many things I suspect the following are the biggest culprits:
Z1 = 1
or similar ones that compute a value of a parameter that is then used in the formula. These assumptions are validated using sympy and symbolic arithmetic which might be quite slow. I will look into a way to skip this validation, if care is taken to avoid breaking assumptions when calling the formulas.Mod
implementation is slower than the GMP one for many things. Installing the GMP library and the gmpy2
package should auto-enable the faster GMPMod
.NullContext
should be the fastest.Could you provide some more details on the code you are running and the performance you are getting? Just to make sure that you are not hitting some weird issue that seriously degrades performance and could be fixed. Something like "I'm running addition formula calls over this curve and they take x milliseconds each".
TODOs for me:
Mod
cache?)FormulaAction
creation to be all done against a context with individual OpActions
so that it is also skipped if a NullContext
is used. Currently even with a NullContext
set a FormulaAction
is created, entered and constructed with all the intermediate values and then immediately thrown away.Hi, @J08nY thanks for the reply. I installed gmpy2 since I saw it in the code and appreciated the improved speed. I also understand that this project has not the speed as its first aim, but I really like it because of its "malleability", I can represent a very large number of combinations of formulas.
I do not have precise numbers to answer but, roughly to compute with known and all took about 10 minutes on secp128r1 with projective coordinates (I forgot to say that I did that scalar mult for several (few hundreds) of Points P).
I sped it up a little coding a "partial multiplication" function that reuse the result and continue the scalar multiplication.
I am working on performance evaluation and also some quite nice performance improvements in the Mod implementation. I will look at profiling the whole scalar multiplication and what takes the most time there. Expect improvement soon 🙂.
Hi, I am very interested in the performance improvements, and I would like to contribute, if you have some tasks that are simple enough for me.
Now there is a framework for benchmarking pyecsca (run make perf
but make sure you installed the new dev
dependencies as defined in setup.py) it is possible to analyze the runtime. I have only added benchmarks for Modular arithmetic, formula calls and a basic scalar multiplier, without getting into the Context and tracing stuff which I expect to be a significant overhead.
Currently it seems the largest overhead is various validations, type checks and casts around the Mod class. Its a tricky class that is able to gracefully handle things like 5 * Mod(3, 7) -> Mod(1, 7)
by casting the 5
, but these kinds of runtime casts and checks do create some overhead, also the check for the same modulus (such that Mod(5, 7) * Mod(5, 11)
raises an exception) also creates a bit of an overhead. Some of these checks can be removed, but some must stay for the time being. I am thinking of a few speedups where the formulas could be pre-processed such that no raw integers remain in them and then these casts will be unnecessary. When looking at the formula calls and scalar multiplication, the overhead from tracing the Actions is obvious where out of 30 seconds of operations, like 5 is spent in Action handling code, even with the NullContext
set. This is could be solved as suggested in my last point on my TODO list above.
I have also implemented a few speedups in the modular arithmetic of both the GMP and raw Python Mod implementation (square roots and is_residue
computations should be much faster now).
Long-term I am thinking of moving performance critical parts of pyecsca (modular arithmetic, formula execution, context and tracing) to use Cython, which could provide a speedup.
Trying to use this framework to generate a set of guess and try to reverse a scalar mult algorithm I am studying. The call of a formula (call) is very time consuming. Is there a way to optimize or cache the formula and speed up the computation?
As context I'm calling: some
mult.multiply(guess)
and somemult._add(Point_a, Point_b)
. Am I correct? or is there a better way to get a fast multiplication and sum of two points?Thanks in advance.