dalek-cryptography / curve25519-dalek

A pure-Rust implementation of group operations on Ristretto and Curve25519

Elligator2 Forward and Reverse Mappings #612

Open jmwample opened 6 months ago

jmwample commented 6 months ago

Hello,

This is an implementation of the Elligator2 forward and reverse mappings: points to representative values, as well as representative values to points. The specific goal is to make Elligator mapping for x25519 handshakes possible in pure Rust.

This implementation is gated behind a feature flag "elligator2" in the curve25519_dalek crate for point transforms. I have tested this against the test vectors from the Kleshni/Elligator-2 C implementation as well as (a fork of) the agl/ed25519/extra25519 golang implementation. These can be seen in the test cases for this PR.

I also added a feature flag "elligator2" to the x25519_dalek crate that exposes a PublicRepresentative type that mirrors the PublicKey type, so that only Elligator representatives are exchanged when performing a Diffie-Hellman handshake.

This implementation of the Elligator2 transforms (to the best of my ability):

I have seen a couple of issues and PRs dealing with elligator2 mappings and this library, so I apologize if this is muddying the water. This is a feature that I really needed, so I have implemented the functionality based on several existing implementations in other languages.

I am also not an expert in crypto implementations and would greatly appreciate any feedback. I hope this is helpful and potentially solid enough to include in the library. :+1:

Related to:

rozbb commented 6 months ago

Thank you! It will take a bit for me to take a look at this (trying to make some paper deadlines).

Re the IETF standard, hasn't that officially been standardized now? RFC 9380. Would be good to have test vectors from there.

jmwample commented 6 months ago

I will get those added. I haven't spent much time looking at that RFC yet, so I can try to work in the interface there as well (and get the rest of the tests passing).

jmwample commented 5 months ago

I have added the RFC9380 test vectors for the elligator2 implementation, and an interface that should work if someone wanted to use this to implement the h2c interface for curve25519 and/or edwards25519. Not sure if the full h2c implementation belongs in this crate or a different crate with a more general / uniform interface.

The CI checks should also be passing now.

randombit commented 4 months ago

Any chance of this landing soon?

jmwample commented 3 months ago

I am working on adding some final tests to ensure that the bits of the elligator2 representatives appear uniformly random.

TLDR: The sqrt_ratio_i function seems to be canonical, so this library shouldn't suffer from the described computational distinguisher. However, deriving an elligator2 representative only gives 254 bits of randomness to begin with, so the high order bits need to be handled in some way to prevent a trivial distinguisher.

The specific issue that this is testing for can be described as:

An instantiation of Elligator is parameterized by what might be called
a “canonical” square root function, one with the property that
`√a^2 = √(−a)^2` for all field elements `a`. That is, we designate just
over half the field elements as “non-negative,” and the image of the
square root function consists of exactly those elements. A convenient
definition of “non-negative” for Curve25519, suggested by its authors,
is the lower half of the field, the elements `{0, 1, …, (q − 1)/2}`.
When there are two options for a square root, take the smaller of the two.

Any Elligator implementation that does not do this canonicalization of the final square root, and instead maps a given input systematically to either its negative or non-negative root is vulnerable to the following computational distinguisher.

[An adversary could] observe a representative, interpret it as a field
element, square it, then take the square root using the same
non-canonical square root algorithm. With representatives produced by
an affected version of [the elligator2 implementation], the output of
the square-then-root operation would always match the input. With
random strings, the output would match only half the time.
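
As a rough illustration of the two quoted passages, here is a minimal sketch over a toy prime field. The modulus 103 and all function names are assumptions made for this example; none of this is code from the PR or from curve25519-dalek. The "non-canonical" root always returns `a^((P+1)/4)`, which is one systematic choice of root when `P ≡ 3 (mod 4)`.

```rust
/// Toy prime modulus, chosen for illustration only (103 ≡ 3 mod 4).
/// Curve25519 works modulo q = 2^255 - 19 instead.
const P: u64 = 103;

/// Square-and-multiply modular exponentiation in the toy field.
fn pow_mod(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1;
    base %= P;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % P;
        }
        base = base * base % P;
        exp >>= 1;
    }
    acc
}

/// Non-canonical root: always returns a^((P+1)/4), whichever half it lands in.
/// Only meaningful when `a` is a quadratic residue.
fn sqrt_noncanonical(a: u64) -> u64 {
    pow_mod(a, (P + 1) / 4)
}

/// Canonical root: the smaller of the two roots, so the result is always
/// in the "non-negative" half {0, 1, ..., (P - 1) / 2}.
fn sqrt_canonical(a: u64) -> u64 {
    let r = sqrt_noncanonical(a);
    r.min((P - r) % P)
}

/// The adversary's check from the quote: square, take the root, compare.
fn survives_square_then_root(x: u64, sqrt: fn(u64) -> u64) -> bool {
    sqrt(x * x % P) == x
}

/// How many nonzero field elements pass the adversary's check.
fn count_survivors(sqrt: fn(u64) -> u64) -> usize {
    (1..P).filter(|&x| survives_square_then_root(x, sqrt)).count()
}
```

In this toy field every output of `sqrt_noncanonical` passes the adversary's check, while only half of all nonzero field elements do, which is the gap the distinguisher exploits. `sqrt_canonical` satisfies `√a^2 = √(−a)^2` and always lands in the lower half.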

The solution from agl/ed25519 is to randomize the high two bits when getting the representative, and to clear those same high order bits when performing a map_to_curve operation. One challenge is determining how this aligns with the test vectors from other implementations (kleshni and signal specifically; the rfc9380 tests seem to be handled properly).
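
A minimal sketch of that set-and-clear step, assuming a 32-byte little-endian representative whose top two bits start out as zero. The function and constant names are hypothetical and are not the API of this PR or of agl/ed25519.

```rust
/// The two high-order bits of the last byte of a 32-byte little-endian encoding.
const HIGH_BITS_MASK: u8 = 0b1100_0000;

/// Fill the two unused high bits of a representative with bits from `tweak`.
/// `tweak` is assumed to come from a secure random source; only its top two
/// bits are used.
fn apply_tweak(mut repr: [u8; 32], tweak: u8) -> [u8; 32] {
    repr[31] |= tweak & HIGH_BITS_MASK;
    repr
}

/// Clear the two high bits again before mapping the representative to a point.
fn clear_tweak(mut repr: [u8; 32]) -> [u8; 32] {
    repr[31] &= !HIGH_BITS_MASK;
    repr
}
```

Under these assumptions, `clear_tweak(apply_tweak(r, t)) == r` for any tweak `t`, as long as `r` has its two high bits unset to begin with.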

For a more in-depth explanation see:

This should not impact the general interface of the PR, and I am hoping to have the changes finished within the week.

jmwample commented 3 months ago

The latest commit fixes several issues.

  1. The Edwards RFC9380 test cases were not actually testing the things they were meant to be testing. This forced some changes in the structure of the map_to_point functions, as mapping to Montgomery and then to Edwards was missing a sign bit.

    • map_to_point for Edwards RFC9380 test cases now testing properly and passing
  2. The high order two bits of the representative are always 0 by default because correctly computed elligator2 representatives always finish with a sqrt() that takes the least-square-root value. That is, a value of at most 2^254 - 10 = (q - 1)/2, which fits in 254 bits.

    • In order for the representative to be (optionally) indistinguishable from random, we use a tweak byte to provide the extra randomness, added in when the representative is created and cleared when converting back to a point.
    • Both Kleshni & signal contain test cases that include non-least-square-root values, which is not technically in line with the spec. In order to handle this (if interop is absolutely necessary), a map_to_point_unbounded() function is added that does not clear the high order bits before mapping to the curve.
    • A statistical test showing the effect that the tweak has on the apparent distribution of the bits over many representatives can be used to look at entropy based distinguishers (this does not necessarily help with computation based distinguishers).
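
The kind of bit-frequency check described in the last bullet could look roughly like the sketch below. It is not the test from the PR. It assumes stand-in "representatives" that are just random 32-byte strings with the top two bits cleared (real ones are bounded by 2^254 - 10, not 2^254), and it uses a small SplitMix64 generator so the example needs no external crates. All names are hypothetical.

```rust
/// SplitMix64: a small deterministic PRNG used only to keep this sketch
/// dependency-free. It is NOT cryptographically secure.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Stand-in for an untweaked representative: random bytes, top two bits zero.
fn fake_representative(rng: &mut SplitMix64) -> [u8; 32] {
    let mut out = [0u8; 32];
    for chunk in out.chunks_mut(8) {
        chunk.copy_from_slice(&rng.next_u64().to_le_bytes());
    }
    out[31] &= 0b0011_1111;
    out
}

/// Generate `n` samples, optionally filling the top two bits from a tweak byte.
fn sample(n: usize, tweaked: bool, seed: u64) -> Vec<[u8; 32]> {
    let mut rng = SplitMix64(seed);
    (0..n)
        .map(|_| {
            let mut r = fake_representative(&mut rng);
            if tweaked {
                let tweak = rng.next_u64() as u8;
                r[31] |= tweak & 0b1100_0000;
            }
            r
        })
        .collect()
}

/// Fraction of samples with bit `bit` set (0 = lowest bit of byte 0,
/// 255 = highest bit of byte 31).
fn bit_frequency(samples: &[[u8; 32]], bit: usize) -> f64 {
    let ones = samples
        .iter()
        .filter(|s| (s[bit / 8] >> (bit % 8)) & 1 == 1)
        .count();
    ones as f64 / samples.len() as f64
}
```

Without the tweak, bits 254 and 255 are never set, which is the trivial distinguisher. With the tweak, their frequency should sit near one half like every other bit. As noted above, a frequency check of this kind only addresses entropy-based distinguishers.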

I have no other changes planned for this PR without review / input.

jmwample commented 2 weeks ago

I have added another refactor to the elligator2 implementation motivated by feedback based on issues encountered with encoded key distinguishability in obfs4. The changes required to fix the distinguisher resulted in two versions of the elligator2 algorithm which are not interchangeable.

More information on the issue can be found here; the solution added to the Randomized variant is described in Method 1: add a random low order point. The RFC9380 variant is the default and follows the RFC, including its test vectors.

A test variant exists for legacy implementations that do not use the least-square-root value of the representative (i.e. kleshni & signal), but it is not exposed by default.


For now I have published my fork as its own crate (see curve25519-elligator2), but my intention is to hopefully get this merged here and eventually yank the forked crate.

rozbb commented 4 days ago

Hi, thank you for this! I played around with this and had some notes:

  1. It seems like I can't get the Elligator2 tests to fail, even when they definitely should. For example, in src/field.rs, I replaced the return value of FieldElement::ct_gt with the constant Choice::from(0u8). Running cargo test --features "alloc,elligator2" did not produce any errors. Is a KAT missing?

  2. The reason I was playing with the above is because I was wondering about the correctness of the gt function in the fiat backend in the diff. My understanding is that this does a libgmp-style bigint subtraction without the subtraction. But the difference between a bigint and a fiat_25519_tight_field_element is that the latter has nontrivial equivalences. I'm worried gt might consider an unreduced fiat_25519_tight_field_element of value, say 2²⁵⁵ - 1 (equivalent to 18) as greater than 30, for example. I was trying to test this but ran into point (1) above.
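
To make the concern in point (2) concrete, here is a small sketch of how a plain limb-wise comparison goes wrong on an unreduced value, together with the kind of known-answer check that would catch it. It uses four little-endian 64-bit limbs, which is an assumption made for readability and not the fiat backend's limb layout. The function names are hypothetical, the code is not constant-time, and it is not a claim about what the gt in the diff actually does.

```rust
/// p = 2^255 - 19 as four little-endian 64-bit limbs.
const P: [u64; 4] = [
    0xFFFF_FFFF_FFFF_FFED,
    u64::MAX,
    u64::MAX,
    0x7FFF_FFFF_FFFF_FFFF,
];

/// Plain big-integer comparison, most significant limb first.
/// It knows nothing about equivalence modulo p.
fn gt_raw(a: &[u64; 4], b: &[u64; 4]) -> bool {
    for i in (0..4).rev() {
        if a[i] != b[i] {
            return a[i] > b[i];
        }
    }
    false
}

/// Subtract p once if x >= p. This is enough for inputs below 2^255,
/// because 2^255 < 2p.
fn reduce_once(x: [u64; 4]) -> [u64; 4] {
    if gt_raw(&P, &x) {
        return x; // already canonical: x < p
    }
    let mut out = [0u64; 4];
    let mut borrow = false;
    for i in 0..4 {
        let (d, b1) = x[i].overflowing_sub(P[i]);
        let (d, b2) = d.overflowing_sub(borrow as u64);
        out[i] = d;
        borrow = b1 || b2;
    }
    out
}

/// Comparison of the canonical representatives of both inputs.
fn gt_canonical(a: &[u64; 4], b: &[u64; 4]) -> bool {
    gt_raw(&reduce_once(*a), &reduce_once(*b))
}

/// The unreduced value 2^255 - 1, which is congruent to 18 modulo p.
const UNREDUCED_18: [u64; 4] = [u64::MAX, u64::MAX, u64::MAX, 0x7FFF_FFFF_FFFF_FFFF];

/// The canonical value 30.
const THIRTY: [u64; 4] = [30, 0, 0, 0];
```

Here `gt_raw(&UNREDUCED_18, &THIRTY)` is true while `gt_canonical(&UNREDUCED_18, &THIRTY)` is false. A known-answer test over inputs like these would fail if the comparison were replaced with a constant.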

I'll be reviewing the rest of the PR, but I think these should be addressed before merging. Thanks again!