wolfSSL / wolfssl

The wolfSSL library is a small, fast, portable implementation of TLS/SSL for embedded devices to the cloud. wolfSSL supports up to TLS 1.3 and DTLS 1.3!
https://www.wolfssl.com
GNU General Public License v2.0
2.29k stars 818 forks source link

Trying to understand the implementation of the function - ge_double_scalarmult_vartime #7554

Closed iontra-shubham closed 4 months ago

iontra-shubham commented 4 months ago

https://github.com/wolfSSL/wolfssl/blob/7782f8eed210d8869ae70f2efd7545b59cb25adf/wolfcrypt/src/ed25519.c#L783

I tried printing some values inside (first ed25519_smult-> first iteration first ed25519_double) https://github.com/wolfSSL/wolfssl/blob/7782f8eed210d8869ae70f2efd7545b59cb25adf/wolfcrypt/src/ge_low_mem.c#L525 https://github.com/wolfSSL/wolfssl/blob/7782f8eed210d8869ae70f2efd7545b59cb25adf/wolfcrypt/src/ge_low_mem.c#L426 https://github.com/wolfSSL/wolfssl/blob/7782f8eed210d8869ae70f2efd7545b59cb25adf/wolfcrypt/src/ge_low_mem.c#L347

/ A = X1^2 / a = 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 / B = Y1^2 / b = 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 / C = 2 Z1^2 / c = 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 / X1+Y1 / X1+Y1 = 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 / (X1+Y1)^2 / (X1+Y1)^2 = 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 / (X1+Y1)^2 - A / e1 = ee ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 7f b = 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 / E = (X1+Y1)^2 - A - B / e = ed ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 7f / G = D + B / g = ee ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 7f / F = G - C / f = ec ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 7f / H = D - B / h = ec ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 7f

r.X = ed ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 7f r.Y = ec ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 7f r.T = ed ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 7f r.Z = ec ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 7f

Questions: 1) When we calculate / (X1+Y1)^2 - A / then we get / (X1+Y1)^2 + p - A /, but in lm_sub function it is mentioned in comments that we actually calculate a + 2p - b, so there is contradiction here. 2) When we calculate r.X = e * f, then the output = p, Why is that? Are we saturating the multiplication to p? How is this fe_mul__distinct behaving here?

SparkiDev commented 4 months ago

Hi @iontra-shubham

The purpose of ge_double_scalarmult_vartime is to calculate two scalar multiplications and add the results.

In this case the two scalar multiplications are:

  1. sig base ponit: S G
  2. h inA: Hash(R, A, M) (- public key)

The questions are around the implementation of difference and multiplication of two scalars modulo the prime in the low memory case. fe_low_mem.c:385: lm_sub - calculating r = (a - b) mod p. fe_low_mem.c:435:fe_mul__distinct - calculating r = (a * b) mod p.

For lm_sub, in order to avoid underflow, a has 2 p (the prime of the curve) added to it during the subtraction calculation. Then, at the end of lm_sub, the result of a + 2p - b is reduced by the prime. The reduction is a partial reduction and may result in the value being greater than p but only by a small amount. 2 p is added due to the fact that partial reductions are done in most operations and therefore b may be greater than p but less than 2 * p.

The fe_mul__distinct is also not doing a full reduction and so the value can be greater than the prime but not by much.

Later a full reduction is done when converting the result to bytes: ge_tobytes(). ge_low_mem.c:456-457 - calling fe_normalize on x and y. fe_low_mem.c:313:fe_normalize().

Does this answer your question?

Sean

iontra-shubham commented 4 months ago

Thanks @SparkiDev for the detailed clarification. I understood what is happening in the code. I could not figure out the partial reduction part.

SparkiDev commented 4 months ago

Hi @iontra-shubham,

Partial reduction makes the code faster but does confuse things! If you have more questions please raise a new ticket.

Sean