inversion doubts - Githubissues

bbbrumley commented 3 years ago

Howdy Folks,

So I have this ECCKiila branch here utilizing the divstep functionality for GF inversion. Everything works after tweaking some constants, but I don't know why.

In the test example:

https://github.com/mit-plv/fiat-crypto/blob/master/inversion/c/inversion_test_template.c

I don't understand what's going on. It calls FROM_MONTGOMERY, then does the inverse, then calls FROM_MONTGOMERY again. I don't see a "to montgomery" anywhere.

What is the test doing?
What are the assumptions?
What form does it expect the input in?
What form does it leave the output in?

To make your template work, I had to put a "double" Montgomery factor in the divstep precomp constant. You can see it here

https://gitlab.com/nisec/ecckiila/-/commit/1b3c6e878fb76c02b4c5a48f8535f301dc3a1d10#f1bdda93d9a278e358509d498e17d97764c1fb29_316_322

So it is a different divstep precomp constant than what fiat-crypto emits -- I have to do "to Montgomery" on that constant, even if it is already in Montgomery form. (So it has two Montgomery factors.)

Why?

The ECCKiila assumptions are

The "element to be inverted" argument is already in Montgomery form.
The "inverse" output should be left in Montgomery form -- there's more computation to do.

I feel like these are standard assumptions for applied ECC.

Any insight?

Tagging @jedisct1

jedisct1 commented 3 years ago

The zig version may bit a little bit clearer.

The input to the inversion function is expected to already be in Montgomery form.

bbbrumley commented 3 years ago

I see -- so your implementation starts by converting from the Montgomery form to canonical form.

Why can't you do everything in Montgomery form? That is what the ECCKiila code is suggesting to be more efficient.

bbbrumley commented 3 years ago

I think I get it now.

When you stay in the Montgomery domain and invert, you also invert the Montgomery factor. So to correct that and leave the output in Montgomery form, you need two Montgomery factors when leaving the function -- the first one cancels.

I'd argue this is a bug in fiat-crypto. You can roll all of this into the divstep precomp constant. Instead of all the conversion to / from Montgomery form.

But am I the only one that sees it that way?

bbbrumley commented 3 years ago

@dfaranha you made contributions related to this, right? Any thoughts?

dfaranha commented 3 years ago

Hey, I was not aware of this issue. Thanks for reporting!

Yes, we can bundle these conversions to the divstep precomp constant, although the gains in performance will be small. @bshvass is preparing a patch containing the faster jumpdivstep code, so it's something we can improve there.

bbbrumley commented 3 years ago

Perfect -- thanks, Diego :) Especially for confirming I haven't gone totally insane. (Only the normal insane that comes with the academic territory.)

I'll go ahead and merge that ECCKiila branch to master there, then. If it makes you feel any better, there's a ton of CI testing going on there.

We'll also package the changes downstream for Mozilla, so they get the P-384 updates. (Unfortunately P-521 uses the unsaturated Solinas representation, which doesn't seem to be currently compatible with this fiat-crypto inversion code.)

Thank you for your contribution!

bshvass commented 3 years ago

@bbbrumley just to explain the original question: The inverse function does not expect its input to be in montgomery form, but the MUL function does expect this. This is why the test stores the montgomery form of g: to be able to test multiplication by the inverse later (in the lines you refer to, g1 refers to g in montgomery form). However, inverse does return the inverse in montgomery form. I agree that this is weird behaviour, and I will change this (or at least document the behaviour). Thanks for reporting!

emmansun commented 3 months ago

The performance of the current inversion implementation seems not to be very good compared to the exponentiation by n - 2 implementation optimized using addchain. I tested SM2 prime number m = 0xfffffffeffffffffffffffffffffffffffffffff00000000ffffffffffffffff (from "2^256 - 2^224 - 2^96 + 2^64 - 1"), the performance is just ~30%.

dfaranha commented 3 months ago

The initial version that was included in Fiat-Crypto is definitely not optimal. We have a much faster version based on jumpdivsteps sitting on @bshvass GitHub and waiting to be merged. Both of them have a better chance at competing when performing arithmetic modulo dense primes, where the FLT method performs poorly due to lack of short addition chains and slower squaring/multiply.

For reference, see our CSF'23 paper "High-Assurance Field Inversion for Curve-Based Cryptography" for the concrete numbers.

mit-plv / fiat-crypto

inversion doubts #1020