open-quantum-safe / openssl

UNSUPPORTED Fork of OpenSSL 1.1.1 that includes prototype quantum-resistant algorithms and ciphersuites based on liboqs PLEASE SWITCH TO OQS-Provider for OpenSSL 3
https://openquantumsafe.org/
Other
289 stars 125 forks source link

Some BIKE ciphersuites not reliable in TLS 1.3 #42

Closed dstebila closed 5 years ago

dstebila commented 5 years ago

Some BIKE ciphersuites (e.g., bike2l1, bike2l3, bike2l5, bike3l1, bike3l3, bike3l5) are not reliable for me in TLS 1.3 on my Mac. About 50% of the time, the connection goes through, and 50% of the time I get the following error:

CONNECTED(00000006)
4452943296:error:14200044:SSL routines:add_key_share:internal error:ssl/statem/extensions_clnt.c:626:
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 0 bytes and written 7 bytes
Verification: OK
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
Early data was not sent
Verify return code: 0 (ok)
---

I'm not sure what the problem is.

Any ideas, @christianpaquin?

christianpaquin commented 5 years ago

I'll take a look

christianpaquin commented 5 years ago

I only get errors with the bike2 algs, this is the failing code from src/kem/bike/x86_64/kem.c : // Check if error weight equals T1: `if (getHammingWeight(e, 2 R_BITS) != T1)`

I'm investigating to see if it's a problem with the OQS implementation or the call from OpenSSL.

christianpaquin commented 5 years ago

I can only repro the bug with the bike v2 schemes; they fail roughly 50% of the time. I can't repro in OQS alone, so there must be something in the way OpenSSL invokes it (or perhaps a circular dependency from OpenSSL to OQS to OpenSSL?) I reviewed the calls from OpenSSL into OQS, and I can't find an error; per the above comment, the hamming weight seems to be different some times. I don't understand the crypto, so a BIKE dev should look at the input/output to see what is going on.

We can pull the bike2* schemes from the release if we can't fix this this week, and investigate later.

Shay-Gueron commented 5 years ago

Here are some thoughts for now: Since the problem appeared only with BIKE2 and only on a MAC OS (i.e., what's running is the C reference code and not the assembler), it seems that the issue is with how the BIKE2 keys are handled.

As an algorithm, BIKE2 has a shorter (half size) key compared to BIKE1/BIKE3. For consistency (with the KAT's) the reference code extends (artificially) the keys in the BIKE2 case, and ignores the redundant half. It might be that when linking to OpenSSL, the Hamming weight is counted over the full length.

Need to check this out and fix. We will get back to you on this soon (before the release).

christianpaquin commented 5 years ago

FYI, I encountered the same behavior on Ubuntu 16.04.5 LTS using gcc. Repro steps:

Follow the README.md instructions in the OQS-OpenSSL_1_1_1 branch to:

  1. build liboqs (master).
  2. generate a test cert: apps/openssl req -x509 -new -newkey rsa -keyout rsa.key -out rsa.crt -nodes -subj "/CN=oqstest" -days 365 -config apps/openssl.cnf
  3. launch a openssl test server (section TLS demo): apps/openssl s_server -cert rsa.crt -key rsa.key -www -tls1_3
  4. launch a openssl test client using bike2l1: apps/openssl s_client -curves bike2l1 -connect localhost:4433. Repeat this step until failure (occurs ~50% of the time).
dstebila commented 5 years ago

Thanks for looking into it Shay!

Shay-Gueron commented 5 years ago

We believe that we found the problem. The short story: it is due to a changed behavior of OpenSSL, between version 1.0.2 to version 1.1.0. Specifically, the function BN_GF2m_mod_inv(a, p) changed its behavior (changes from May 2018).

The detailed explanation: (if you wish to skip the details scroll down to the solution)

  1. Recall that the problem occurs in BIKE2. BIKE2 differs from BIKE1 and BIKE 3 in the following: BIKE2 achieves a shorter public key at the expense of polynomial inversion.
  2. Inversion is taken modulo x^r+1 for a prime r such that (x^r+1)/(x+1) is irreducible (arithmetic over GF[2]).
  3. In this polynomial ring, a polynomial is invertible if and only if its weight (= the number of nonzero coefficients) is odd. For example (x^2+x+1) * (x^4+x^2+x) = 1 mod x^5+1, so (x^2+x+1) is invertible (with weight 3) and (x+1) is not invertible.
  4. BN_GF2m_mod_inv(a, p) presumably computes a^(-1) mod p for polynomials a, b and returns an error when no such inverse exists (or found). It used to do the computations with GCD, but changed, in version 1.1.0, to run some trick in order to be (semi) constant time. Now it does the following: 1) choose a random b; 2) compute c = a*b; 3) invert c); 4) multiply by b.
  5. Unfortunately, this does not work in the general case, where p is not necessarily irreducible. In fact, for the case of BIKE2, multiplying an “a” (odd weight) by a random b, might flip the parity of the weight of the product ab (it happens when the weight of b is even, and this is ~50% of the times). Thus, a definitely invertible input (a) can be turned into a non-invertible polynomial (ab) and the “new” function BN_GF2m_mod_inv(a, p) returns an error.
  6. This does not seem to be a legitimate behavior of BN_GF2m_mod_inv(a, p), because, e.g., feeding the polynomial “1” and p = x^5+1 should return “1” but instead, it will now return an error 50% of the time. Also, it is not clear what “error” means: is it “no inverse exists” (which is false) or “the new functionality failed” (and then why)? It could be argued that BN_GF2m_mod_inv(a, p) is meant to receive only irreducible p’s. Still, we expected the “BN” features of OpenSSL to be “dependable”, which turned out not to be the case for our case. That is, inverting modulo x^r+1 should work when there exists an inverse (as it did in OpenSSL ver. 1.0.1).
  7. Perhaps it would have been more prudent to leave BN_GF2m_mod_inv(a, p) unchanged, and add a new function BN_GF2m_mod_inv_irreducible(a, p)with the new (constant time) behavior. Or, at least clearly document the functionality and assumption of BN_GF2m_mod_inv. We will inform OpenSSL developers and let them decide what to do with this.
  8. This analysis explains why the problem appeared “suddenly” (Linux default might still use OpenSSL 1.0.2 and OQS was tested on this version; the 1.1.0 OepnSSL version changed this…), why BIKE2, and why we see the “50%” failures.
  9. As a demonstration this snippet will invert x^2+x+1 modulo x^5+1 on 1.0.2, but fail with significant probability on 1.1.0

include "stdio.h"

include "openssl/bn.h"

void main() { BIGNUM a, m, r; a = BN_new(); r = BN_new(); m = BN_new(); BN_CTX bn_ctx = BN_CTX_new(); BN_hex2bn(&m, "21");

BN_hex2bn(&a, "1");

printf("hhh %d\n",BN_GF2m_mod_inv(r, a, m, bn_ctx)); printf("m: %s\n", BN_bn2hex(m)); printf("a: %s\n", BN_bn2hex(a)); return; }

What to do now? We added a quick fix, and will commit it soon. Basically,

  1. Call BN_GF2m_mod_inv(a, p)
  2. Check the error status (with an “if” statement).
  3. If "error" is returned, call BN_GF2m_mod_inv(a, p) again.
  4. Continue until the inverse is (successfully) computed.

The expected number of times until it succeeds is 2, i.e., we do not expect noticeable timing impact here. Also, we do not care about timing (due to the "if" statements) nor about exposing the number of attempts, because this is a property of the randomized b, thus does not reveal anything confidential.

In the long run, it might be neater to factor x^r+1 = (x+1) * ( x^{r-1} + x^{r-2} + … + x + 1) where the latter polynomial is irreducible. Then use the new BN_GF2m_mod_inv(a, p) function with that polynomial, and subsequently use the CRT. Alternatively, we could simply write the inversion directly and BN_GF2m_mod_inv(a, p) altogether.

For now, the quick fix is fine.

Nir and Shay

dstebila commented 5 years ago

Resolved by open-quantum-safe/liboqs#430 and open-quantum-safe/liboqs#431.