Potential oracle access to the peer's secure randomness

frochet commented 3 years ago

Dear QUIC designers,

I've been recently following the discussion on PATH_CHALLENGE/PATH_RESPONSE attack scenarios, and for what it matters, I am also working on QUIC-like capabilities above TCP. In my design, upon thinking on connection migration attack vectors, I've been worried about another type of exploit. It is more critical in my opinion than bandwidth amplification, and it also affects QUIC.

So, regarding QUIC, when a connection migration is initiated, from the current specification[0], "An endpoint can migrate a connection to a new local address by sending packets containing non-probing frames from that address." Sending a non-probing packet would trigger a PATH_CHALLENGE message from the other peer in an implementation following those specs to enter in path validation.

Essentially, it means that a client can, on-demand, trigger a PATH_CHALLENGE from the server. In the specification[1], the PATH_CHALLENGE is containing an unpredictable payload. More formally, it is meant to be indistinguishable from the output of a random function. In practice, there are different ways to compute this "unpredictable" payload, with critical differences.

The naive approach is to use unsecure randomness random(), and then putting it in the PATH_CHALLENGE.

Another approach would be to use secure randomness to satisfy the current specifications. That is, putting the output of SecureRandom() to the PATH_CHALLENGE.

Ironically, the second approach is worrying me. That is, the current protocol specifications may provide incentives to implement a protocol channel to allow any client to query the server's PRNG output on-demand, and with probably a consequent amount of randomness bandwidth.

We know from history that this is not a great idea, especially is the presence of potential weaknesses within the PRNG (intentional or not). The most appealing example is the backdoored Dual EC PRNG, for which the "backdoor holder" would only need to bruteforce 16 bits to recover the internal state and predict any new outcome. This is possible as soon as enough of the Duac EC output is observed. In the case of Dual EC, and assuming that the PATH_CHALLENGE contains 16 bytes of randomness, the attacker would only need to trigger 3 PATH_CHALLENGE to predict the PRNG's next outcomes (assuming P256 base points).

Of course, Dual EC is an extreme case, for which the randomness could still be available from other protocol materials (e.g., ConnIDs or keys), but we should anyway avoid specifying a protocol that would offer Secure Randomness easily, and not under the peer's control.

I suggest to specify what "unpredictability" means and how to compute it and iterate over multiple "unpredictable" payloads. A goal would be to make clear that we want as few as possible outputs from SecureRandom() to avoid for e.g., a client to have an oracle access to the Server's PRNG. A solution could be to use a simple Hash function and R, 16 bytes of randomness only known by the server and global to all sessions. Then a PATH_CHALLENGE could contain, with SESSION_SERCRET the shared derived secret of the session:

Challenge_1 = H(SESSION_SECRET || R)

Then, the next one:

Challenge_2 = H(Challenge_1 || R) ... Challenge_i = H(Challenge_i-1 || R)

This maintains unpredictability, assuming a cryptographic hash function and R secret. I guess that other constructions are possible as well, using HKDF for example.

I would also suggest looking into other places for which "unpredictability" is needed, and evaluate whether it could provide good oracle access to the other peer, and if we can be more exigent into the specifications to make sure to avoid those issues.

[0] https://github.com/quicwg/base-drafts/blob/master/draft-ietf-quic-transport.md#initiating-connection-migration-initiating-migration

[1] https://github.com/quicwg/base-drafts/blob/master/draft-ietf-quic-transport.md#initiating-path-validation

chris-wood commented 3 years ago

Interesting! Given that the TLS ServerHello random field is typically 32 bytes sampled from a secure source, is this really a new vector? I would expect the same code to be used to produce the challenge body.

frochet commented 3 years ago

Interesting! Given that the TLS ServerHello random field is typically 32 bytes sampled from a secure source, is this really a new vector? I would expect the same code to be used to produce the challenge body.

Yes, that's a good point! That's the reason why I said we should look into other places for which "unpredictability" is needed. For the 32 bytes random field of the server hello, a similar strategy could be employed to reduce the output of the PRNG.

The protocol security essentially depends on assumptions such as a good PRNG, the underlyng untractable problem behind the asymetric cryptography that enables the symmetric key exchange, and the cryptographic assumptions of the H function used. If we can reduce the PRNG outptut size, and fill what is missing from pseudorandom derivations based on cryptographic assumptions that we should anyway trust, then it seems as a win to me.

I understand the current approach "if PRNG bad, we're doomed, so better assume it good". I am not totally in favor of that mindset though. Weaknesses may appear and disappear in the PRNG used; it is not as static as the other security property we rely on.

martinthomson commented 3 years ago

I have very little sympathy for this viewpoint. As Chris says, we expose randomness that might be sourced from a "secure" PRNG in many other places in the protocol.

There are implementation strategies that can be used to isolate PRNG usage for keys from other uses (OpenSSL employs one such technique) and there are hardening methods, such as the one described in RFC 8937. But if the request is to highlight the risk, then we can do that. It's not like we've been parsimonious with security considerations up until now.

peteroupc commented 3 years ago

I prefer the term "cryptographic RNG" rather than "cryptographically secure PRNG" or "CSPRNG" (which suggests that only certain kinds of pseudorandom generators are allowed). For instance, I define "cryptographic RNGs" as follows:

A cryptographic RNG generates random bits that behave like independent uniform random bits, such that an outside party has no more than negligible advantage in correctly guessing prior or future unseen output bits of that RNG even after knowing how the RNG works and/or extremely many outputs of the RNG, or prior unseen output bits of that RNG after compromising its security, such as reading its internal state.

frochet commented 3 years ago

There are implementation strategies that can be used to isolate PRNG usage for keys from other uses (OpenSSL employs one such technique) and there are hardening methods, such as the one described in RFC 8937. But if the request is to highlight the risk, then we can do that. It's not like we've been parsimonious with security considerations up until now.

This is exactly what I suggest to discuss in the Security Considerations section. I was unaware of RFC 8937. It seems to cover the same concerns that I have and produce a similar recommendation.

I think it would be valuable to QUIC stacks to pay attention to this issue, and eventually apply RFC 8937 on every source of randomness within QUIC.

quicwg / base-drafts

Potential oracle access to the peer's secure randomness #4314