Closed bkomuves closed 2 days ago
Thanks for the repro.
It seems like it's once again the endomorphism acceleration that is problematic.
Using multiScalarMul_reference_vartime
from https://github.com/mratsim/constantine/blob/c7979b003372b329dd450ff152bb5945cafb0db5/constantine/math/elliptic/ec_multi_scalar_mul.nim#L95-L97 works
And removing withEndo
from https://github.com/mratsim/constantine/blob/c7979b003372b329dd450ff152bb5945cafb0db5/constantine/math/elliptic/ec_multi_scalar_mul.nim#L406-L437 works as well.
One thing really strange is that I tried to slice the input from 0 to 22000 or 1000 to 23000 and having too few points make the bug disappear.
So how this bug was discovered was really strange. These are the B2 points from a Groth16 proof. The circuit is essentially doing Poseidon2 hashes. The proof always succeeded below a given circuit size and always failed above a given size, no matter what the inputs are, what the exact structure of these hashes are (linear sponge, Merkle tree, independent permutation calls), this threshold even survived changing the number of the rounds in the Poseidon permutation. However with completely different circuits, much larger circuits worked fine too...
multiScalarMul_reference_vartime
seems to be a good workaround, but it's more than 3x slower, becoming the primary bottleneck (this single MSM becoming more than half of the total runtime of a Groth16 proof).
Further investigation shows that when 22528 points work while 22529 points doesn't.
This is the exact value for which c, the bucket size, changes from 12 to 13.
I found the bug:
It is the w -= c
outlined there, which should be a discard following the comment just above
This bug is hard to trigger but can be triggered reliably once understood, here is the sequence of events.
huh, nice detective work!
This has been merged.
Note that I did several refactorings of the internals in preparation for the first release of Constantine and support for named fields and not just named curves. Here are the important changes:
C
parameter from C: static Curve
has been renamed A: static Algebra
ECP_ShortW_***
renamed to EC_ShortW_***
isInf
/setInf
renamed to isNeutral
/setNeutral
curves_primitives.nim
file has been split to lowlevel_***.nim
Full details in #399, #400, #402
Wrong result for a particular G2 MSM calculation (this time
multiScalarMul_vartime
is affected).Unfortunately I don't have a small example (so far the smallest one is ~24,000 elements) and I don't have the energy or patience to try find a minimal one, so I just created a repo with all the data and code to reproduce.
Two independent implementations gives the correct result, and also constantine gives the correct result for smaller prefix vectors (I haven't done exhaustive testing here, but for example the first 10,000 elements is fine)
c7979b0
(but the bug should be present for at least a few months)