quicwg / base-drafts

Internet-Drafts that make up the base QUIC specification
https://quicwg.org

Probing and connection migration #880

Closed janaiyengar closed 6 years ago

janaiyengar commented 6 years ago

#161 and #732 assume a make-after-break model, where only one network is available at any given time for an endpoint to migrate to. This is generally too simplistic: it is common for mobile devices to have access to both WiFi and cellular. Connection migration needs to be more nuanced about probing and using the multiple available networks. Experience with GQUIC and MPTCP shows that having access to multiple networks is almost necessary for useful connection migration.

We should design migration carefully based on known best practices and experience. The current text is necessary but insufficient, and this issue is a placeholder for improving this set of mechanisms in the draft.

janaiyengar commented 6 years ago

Tommy Pauly, Eric Kinnear, and I worked through a design of connection migration that we have now written up in PR #1012. This PR describes a more complete connection migration design. Among other small bits, this design includes the following:

The new frames obviate the need for PING with data, so PING is now back to being what it used to be -- a frame with no data -- and PONG is gone as well. Note that this allows for a clean implementation and semantic separation. PING can be considered a "retransmittable" frame, with retransmission when not acknowledged, whereas PATH_CHALLENGE/RESPONSE are less onerous, since an implementation may choose to not retransmit (or limit retransmission attempts differently on) probing frames sent to alternate paths.
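
A minimal sketch of the retransmission split described above, not taken from the PR. The names `SentFrame`, `should_retransmit`, and the `MAX_PROBE_ATTEMPTS` budget are hypothetical, and treating PATH_RESPONSE as never retransmitted is an assumption made for illustration.

```python
from dataclasses import dataclass
from enum import Enum, auto


class FrameType(Enum):
    PING = auto()
    PATH_CHALLENGE = auto()
    PATH_RESPONSE = auto()


@dataclass
class SentFrame:
    frame_type: FrameType
    attempts: int = 1  # number of times this frame has been sent so far


# Hypothetical policy knob: total sends allowed for a probing frame.
MAX_PROBE_ATTEMPTS = 3


def should_retransmit(frame: SentFrame) -> bool:
    """Decide whether a frame that was declared lost should be sent again."""
    if frame.frame_type is FrameType.PING:
        # PING is "retransmittable": keep resending until acknowledged.
        return True
    if frame.frame_type is FrameType.PATH_CHALLENGE:
        # Probing frames get a bounded budget; an implementation could
        # equally choose not to retransmit them at all.
        return frame.attempts < MAX_PROBE_ATTEMPTS
    # Assumption: PATH_RESPONSE is only sent in reply to a challenge.
    return False
```

With this split, a lost PING always comes back, while a probe to an alternate path gives up once its budget is spent.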

Please keep discussion of this PR (modulo clarifications) on this issue, and not on the PR.

pravb commented 6 years ago

  1. Do we restrict path challenge / response to only one additional path at a time?
  2. Let's make it clear that there MUST/SHOULD be one congestion control state for both paths in use. I’d rather err on the side of being conservative about cc until we define multipathing.
  3. Let's try to define the "some number of PATH_CHALLENGE frames and/or after some time". One of the challenges with TCP is the variable behavior across implementations. If left undefined, the numbers each implementation picks here can also aid fingerprinting.
  4. "the server SHOULD NOT send more than an initial window's worth of bytes per estimated roundtrip period". Since the server should not send data until it gets a valid PATH_RESPONSE, is this limit for just PATH_CHALLENGE packets?
  5. Let's require both client and server to reset the cc and RTT state upon switching to the new address. Currently the text requires only the server to do so. Let's also mark this a MUST or add sufficient warning text that deviating from this may cause excessive loss on the network due to using a mismatched inflated cwnd.

janaiyengar commented 6 years ago

Thanks @pravb. Answers below:

  1. Not quite. A client can probe multiple paths at the same time, and a server can be validating multiple paths at the same time, but the server considers only one path the one it's committed to. Happy to clarify that further in the PR if you think that's useful.

  2. I would like to suggest that a single congestion controller is enough, but I would like to allow an impl to use multiple if it really wants to. That said, we don't specify how to use multiple controllers, so I'm fine with a SHOULD.

  3. Agreed. I'll come up with something here.

  4. The server can send data before validation is complete, which is why this limit of IW is applied. The limit is basically there to prevent an arbitrary amount of data from being dumped on a victim. We can tighten this to lower than IW, but our reasoning was that 0RTT gives you the same amount of amplification, and this is similar.

  5. Agreed. Will come up with something here.
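
Answers 4 and 5 can be sketched together. This is an illustration under assumptions, not the PR's mechanism: `PathSender` and its methods are hypothetical, and the value of `INITIAL_WINDOW_BYTES` is made up because the thread does not fix a number.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed initial window for illustration; the thread does not fix a number.
INITIAL_WINDOW_BYTES = 10 * 1200


@dataclass
class PathSender:
    """Send state for the path currently in use (hypothetical structure)."""
    cwnd: int = INITIAL_WINDOW_BYTES
    smoothed_rtt: Optional[float] = None  # seconds; None until a sample exists
    validated: bool = False
    sent_this_period: int = 0  # bytes sent in the current estimated RTT

    def switch_to_new_address(self) -> None:
        # Item 5: discard cc and RTT state learned on the old path.
        self.cwnd = INITIAL_WINDOW_BYTES
        self.smoothed_rtt = None
        self.validated = False
        self.sent_this_period = 0

    def can_send(self, size: int) -> bool:
        if self.validated:
            # The unvalidated-path cap no longer applies; regular congestion
            # control (not modeled here) takes over.
            return True
        # Item 4: at most one initial window per estimated round trip.
        return self.sent_this_period + size <= INITIAL_WINDOW_BYTES

    def on_sent(self, size: int) -> None:
        self.sent_this_period += size

    def on_rtt_period_elapsed(self) -> None:
        # A new estimated round trip starts; the budget is refilled.
        self.sent_this_period = 0

    def on_valid_path_response(self) -> None:
        self.validated = True
```

Under this sketch, data can flow on the new path before the PATH_RESPONSE arrives, but never faster than one initial window per estimated round trip.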

pravb commented 6 years ago

Re 1. Yes, let's clarify; it wasn't immediately clear in the text. What is the use case we are targeting by allowing more than one path to be probed at a time? This also becomes a possible attack vector, because a client can now generate challenges from thousands of addresses with the same connection ID. Depending on the implementation, this leads to costlier lookups, and the server needs to keep state for each new 4-tuple mapped to this connection ID. I would try to make connection migration simpler, not more complex, in v1.
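
To make the state concern concrete, here is a rough sketch of a server that validates several paths at once but caps how many it tracks and stays committed to one. Everything here is hypothetical: the `PathValidator` class, the `MAX_PENDING_PATHS` cap, the 8-byte challenge size, and the use of an (ip, port) pair in place of the full 4-tuple are assumptions for illustration.

```python
import secrets
from typing import Dict, Optional, Set, Tuple

Addr = Tuple[str, int]  # (ip, port) of the peer; stands in for the 4-tuple

MAX_PENDING_PATHS = 4  # hypothetical cap on paths under validation
CHALLENGE_LEN = 8      # assumed size of the challenge data, in bytes


class PathValidator:
    def __init__(self, active: Addr) -> None:
        self.active = active                  # the one path we are committed to
        self.pending: Dict[Addr, bytes] = {}  # challenge data per probed path
        self.validated: Set[Addr] = set()

    def start_validation(self, addr: Addr) -> Optional[bytes]:
        """Return challenge data to send, or None if the probe is ignored."""
        if addr in self.pending:
            return self.pending[addr]
        if len(self.pending) >= MAX_PENDING_PATHS:
            return None  # bound per-connection state against address floods
        data = secrets.token_bytes(CHALLENGE_LEN)
        self.pending[addr] = data
        return data

    def on_path_response(self, addr: Addr, data: bytes) -> bool:
        expected = self.pending.get(addr)
        if expected is None or not secrets.compare_digest(expected, data):
            return False
        del self.pending[addr]
        self.validated.add(addr)
        return True

    def commit(self, addr: Addr) -> bool:
        """Switch the committed path; only validated paths are eligible."""
        if addr not in self.validated:
            return False
        self.active = addr
        return True
```

The cap keeps the extra state per connection ID constant no matter how many source addresses show up, at the cost of ignoring some legitimate probes.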

MikeBishop commented 6 years ago

There are some nits in the text, but I see a couple larger design decisions we need to consider here on the issue:

janaiyengar commented 6 years ago

@MikeBishop: Thanks for the feedback. I think I might redo the PR as symmetric endpoints instead of server/client, which might resolve some issues... we did that earlier because it was easier to reason about in the earlier designs, but the current version actually lends itself nicely to symmetry, I think. At any rate, I'll give it a shot.

Separately, PMTU probing is kinda necessary before sending on the path -- we don't know what the path's MTU is. This is equivalent to the first client packet and server packets being padded to PMTU size.

I deliberately did not mention server address changes here because I did not want to conflate these two features and make this PR even bigger. It's not a big enough delta to be put off until v2, and it is super useful, but I figured I'd do that after getting client migration in. We do have #560 to track the server address change in the meantime.