quinn-rs / quinn

Async-friendly QUIC implementation in Rust
Apache License 2.0
3.57k stars 364 forks source link

Accept connections with unrecognized address validation tokens #1790

Closed Ralith closed 2 months ago

Ralith commented 3 months ago

If we share an identity with a different QUIC implementation, e.g. across an update or through a load balancer, and retries are enabled, we might receive address validation tokens generated by it and fail to decode them. Restarting address validation with a fresh Retry packet allows the connection attempt to proceed.

This involves refactoring of the Retry-mediated address validation cryptosystem, for which particularly careful review is required. In particular, we ditch the random plaintext in favor of directly encrypting the client address, and instead use the (server-chosen) retry source CID to derive a unique token key.

The circumstances where this matters are niche, so no particular rush to get this in before 0.11.

gretchenfrage commented 2 months ago

Hm. Please tell me if I'm missing context, but my understanding is that the spec forbids this.

8.1.2. Address Validation Using Retry Packets

In response to processing an Initial packet containing a token that was provided in a Retry packet, a server cannot send another Retry packet; it can only refuse the connection or permit it to proceed.

If a server receives a client Initial that contains an invalid Retry token but is otherwise valid, it knows the client will not accept another Retry token. The server can discard such a packet and allow the client to time out to detect handshake failure, but that could impose a significant latency penalty on the client. Instead, the server SHOULD immediately close (Section 10.2) the connection with an INVALID_TOKEN error. Note that a server has not established any state for the connection at this point and so does not enter the closing period.

17.2.5.1. Sending a Retry Packet

A client MUST accept and process at most one Retry packet for each connection attempt. After the client has received and processed an Initial or Retry packet from the server, it MUST discard any subsequent Retry packets that it receives.

Also relevant context:

8.1.1. Token Construction

A token sent in a NEW_TOKEN frame or a Retry packet MUST be constructed in a way that allows the server to identify how it was provided to a client. These tokens are carried in the same field but require different handling from servers.

I need to look more into how exactly quinn is handling retries, but I think the appropriate way to facilitate clustering like this would be to make the important part of retry tokens be an encrypted and signed message containing the client's address, and allow the key which encrypts and signs that message to be configured so it can be set the same between different servers in a cluster, so that their retry tokens will be intercompatible

Ralith commented 2 months ago

My incorrect word choice in the PR/commit message may have made this needlessly confusing, and has been revised.

It is true that we must never perform a repeated Retry. However, we may receive address validation tokens that the peer obtained from a previous connection's NEW_TOKEN frame, i.e. not a Retry token. If we fail to decode those for any reason, it is still legal to perform a Retry. See 8.1.3:

If the [address validation] token is invalid, then the server SHOULD proceed as if the client did not have a validated address, including potentially sending a Retry packet.

Note: The rationale for treating the client as unvalidated rather than discarding the packet is that the client might have received the token in a previous connection using the NEW_TOKEN frame, and if the server has lost state, it might be unable to validate the token at all, leading to connection failure if the packet is discarded.

gretchenfrage commented 2 months ago

Hmmmm. Ah, ok, I guess that, in the phrase "If a server receives a client Initial that contains an invalid Retry token but is otherwise valid", they mean to imply by the phrase "Retry token" that it is an address validation token which is valid enough to be unambiguously identified as having been minted as for a retry rather than a new_token frame but which is otherwise in some way invalid? That makes sense, although in my opinion the RFC's language is a bit confusing there. Sounds good though.

Ralith commented 2 months ago

Yeah, this came up as an ambiguous case in the implementers' slack. Perhaps we'll get an errata for it.

Ralith commented 2 months ago

Rebased, and dropped the "factor out token decoding" commit since handle_first_packet is much cleaner after #1752.

Ralith commented 2 months ago

MSRV failure is not related.

djc commented 2 months ago

Disabled the flag that requires all status checks to pass before merging for a bit.