janaiyengar commented 6 years ago

#161 and #732 assume a make-after-break model, where only one network is available at any given time for an endpoint to migrate to. This is generally too simplistic: it is common for mobile devices to have access to both WiFi and cellular. Connection migration needs to be more nuanced about probing and using the multiple available networks. Experience with GQUIC and MPTCP shows that having access to multiple networks is almost necessary for useful connection migration.

We should design migration carefully based on known best practices and experience. The current text is necessary but insufficient, and this issue is a placeholder for improving this set of mechanisms in the draft.

janaiyengar commented 6 years ago

Tommy Pauly, Eric Kinnear, and I worked through a design of connection migration that we have now written up in PR #1012. This PR describes a more complete connection migration design. Among other small bits, this design includes the following:

- Allow a client to probe alternate paths without having to use them for data. The PR introduces PATH_CHALLENGE and PATH_RESPONSE frames for this purpose.
- Allow a client to do PMTU verification on a new path.
- Ensure that a server validates client ownership of a new address. The server also uses the PATH_CHALLENGE and PATH_RESPONSE frames for this purpose.

The new frames obviate the need for PING with data, so PING is now back to being what it used to be -- a frame with no data -- and PONG is gone as well. Note that this allows for a clean implementation and semantic separation. PING can be considered a "retransmittable" frame, with retransmission when not acknowledged, whereas PATH_CHALLENGE/RESPONSE are less onerous, since an implementation may choose to not retransmit (or limit retransmission attempts differently on) probing frames sent to alternate paths.
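
To make the probing exchange concrete, here is a minimal sketch of how an endpoint might track outstanding challenges on several candidate paths at once. It is an illustration only, not the design in the PR: the `PathProber` class, the `(ip, port)` path key, and the 8-byte payload length are all assumptions introduced for this example.

```python
import hmac
import os
from typing import Dict, Set, Tuple

Path = Tuple[str, int]  # (peer IP, peer port); hypothetical path key


class PathProber:
    """Tracks outstanding PATH_CHALLENGE payloads per candidate path."""

    CHALLENGE_LEN = 8  # assumed payload size; not stated in this thread

    def __init__(self) -> None:
        self.pending: Dict[Path, bytes] = {}
        self.validated: Set[Path] = set()

    def challenge(self, path: Path) -> bytes:
        """Create the payload for a PATH_CHALLENGE to send on `path`."""
        data = os.urandom(self.CHALLENGE_LEN)
        self.pending[path] = data
        return data

    @staticmethod
    def respond(challenge_data: bytes) -> bytes:
        """Peer side: a PATH_RESPONSE echoes the challenge payload."""
        return challenge_data

    def on_response(self, path: Path, data: bytes) -> bool:
        """Mark `path` validated only if the echoed payload matches."""
        expected = self.pending.get(path)
        if expected is None or not hmac.compare_digest(expected, data):
            return False
        del self.pending[path]
        self.validated.add(path)
        return True
```

Because the payload is unpredictable, a matching response shows that the peer actually received the challenge on that path. The sketch never retransmits a challenge, which matches the point above that probing frames need not be retransmitted the way PING is.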

Please keep discussion of this PR (modulo clarifications) on this issue, and not on the PR.

pravb commented 6 years ago

0. Do we restrict path challenge / response to only one additional path at a time?
1. Let's make it clear that there MUST/SHOULD be one congestion control state for both paths in use. I’d rather err on the side of being conservative about cc until we define multipathing.
2. Let's try to define the "some number of PATH_CHALLENGE frames and/or after some time". One of the challenges with TCP is the variable behavior across implementations. Also, leaving these numbers unspecified can aid fingerprinting of the implementation.
3. "the server SHOULD NOT send more than an initial window's worth of bytes per estimated roundtrip period". Since the server should not send data until it gets a valid path_response, is this limit for just path_challenge packets?
4. Let's require both client and server to reset the cc and RTT state upon switching to the new address. Currently the text requires only the server to do so. Let's also mark this as a MUST or add sufficient warning text that deviating from this may cause excessive loss on the network due to using a mismatched inflated cwnd.
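
As a rough sketch of the last point, resetting the cc and RTT state on migration could look like the following. The `PathState` and `Connection` names and the initial window constant are hypothetical, chosen only for illustration; the draft does not define them.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

INITIAL_WINDOW = 10 * 1200  # bytes; illustrative value, not taken from the draft


@dataclass
class PathState:
    """Congestion and RTT state that is only meaningful for one path."""
    cwnd: int = INITIAL_WINDOW
    smoothed_rtt: Optional[float] = None  # None until the first RTT sample


class Connection:
    def __init__(self, peer_addr: Tuple[str, int]) -> None:
        self.peer_addr = peer_addr
        self.path = PathState()

    def migrate(self, new_addr: Tuple[str, int]) -> None:
        """Switch to `new_addr`, discarding state learned on the old path."""
        if new_addr == self.peer_addr:
            return  # same address: keep the existing estimates
        self.peer_addr = new_addr
        # A cwnd grown on the old path may be far too large for the new one,
        # so start again from the initial window with no RTT estimate.
        self.path = PathState()
```

Either endpoint could run the same `migrate` step, which is the symmetry being asked for here.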

janaiyengar commented 6 years ago

Thanks @pravb. Answers below:

0. Not quite. A client can probe multiple paths at the same time, and a server can be validating multiple paths at the same time, but the server considers only one path the one it's committed to. Happy to clarify that further in the PR if you think that's useful.
1. I would like to suggest that a single congestion controller is enough, but I would like to allow an impl to use multiple if it really wants to. That said, we don't specify how to use multiple controllers, so I'm fine with a SHOULD.
2. Agreed. I'll come up with something here.
3. The server can send data before validation is complete, which is why this limit of IW is applied. The limit is basically there to prevent an arbitrary amount of data from being dumped on a victim. We can tighten this to lower than IW, but our reasoning was that 0-RTT gives you the same amount of amplification, and this is similar.
4. Agreed. Will come up with something here.
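
For the IW limit, one way to picture "an initial window's worth of bytes per estimated roundtrip period" is a simple per-period byte budget. This is only a sketch under assumptions: the `UnvalidatedPathBudget` class, its fixed-period accounting, and the caller-supplied clock are not from the PR.

```python
class UnvalidatedPathBudget:
    """Caps bytes sent to a not-yet-validated address per estimated RTT."""

    def __init__(self, initial_window: int, rtt_estimate: float) -> None:
        self.initial_window = initial_window  # bytes allowed per period
        self.rtt_estimate = rtt_estimate      # seconds; length of one period
        self.period_start = 0.0
        self.sent_in_period = 0

    def try_send(self, size: int, now: float) -> bool:
        """Return True and charge the budget if `size` bytes may be sent."""
        if now - self.period_start >= self.rtt_estimate:
            # A new round-trip period has begun: refill the budget.
            self.period_start = now
            self.sent_in_period = 0
        if self.sent_in_period + size > self.initial_window:
            return False  # would exceed one IW within this period
        self.sent_in_period += size
        return True
```

Once the path is validated, the budget would simply be dropped and the normal congestion controller takes over. Until then, a spoofed source address can attract at most one IW per round trip.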

pravb commented 6 years ago

Re 0. Yes, let's clarify; it wasn't immediately clear in the text. What is the use case we are targeting by allowing more than one path to be probed at a time? This also becomes a possible attack vector, because a client can now generate challenges from thousands of addresses with the same connection ID. Depending on the implementation, this leads to costlier lookups, and the server needs to keep state for each new 4-tuple mapped to this connection ID. I would try to make connection migration simpler, not more complex, in v1.
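
To illustrate the state concern, a server could bound the number of 4-tuples it tracks per connection ID. The sketch below is hypothetical: the `PendingValidations` class and the default limit of 4 are assumptions for illustration, and nothing in the PR specifies such a cap.

```python
from typing import Dict, Tuple

# (local IP, local port, remote IP, remote port)
FourTuple = Tuple[str, int, str, int]


class PendingValidations:
    """Per-connection table of paths awaiting validation, with a hard cap."""

    def __init__(self, max_paths: int = 4) -> None:  # cap is an assumed value
        self.max_paths = max_paths
        self._pending: Dict[FourTuple, bytes] = {}

    def add(self, four_tuple: FourTuple, challenge: bytes) -> bool:
        """Track a new path; refuse once the cap is reached."""
        if four_tuple not in self._pending and len(self._pending) >= self.max_paths:
            return False  # ignore further addresses rather than grow state
        self._pending[four_tuple] = challenge
        return True

    def complete(self, four_tuple: FourTuple) -> None:
        """Forget a path once it is validated or abandoned."""
        self._pending.pop(four_tuple, None)

    def __len__(self) -> int:
        return len(self._pending)
```

With a cap like this, a client sending from thousands of addresses costs the server a fixed, small amount of memory per connection, while a handful of genuinely concurrent probes (for example WiFi and cellular) still works.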

MikeBishop commented 6 years ago

There are some nits in the text, but I see a couple larger design decisions we need to consider here on the issue:

- PMTU probing is getting conflated with validating the path. I'd suggest that the path be validated with packets of QUIC's minimum PMTU (1200, yes?), and client and server can probe for a higher PMTU after transitioning.
- You can make the logic of validating / responding to validation generic from client to server and just talk about in what circumstances you would validate.
- There's no mention of server address changes here. Whether it's a server handing off from an anycast handshake to a unicast address for the rest of the session, or it's a peer-to-peer HTTP connection, we should have a plan (even if that plan is to wait for v2). It seems like you'd need one additional frame, PATH_SUGGEST, which tells the other party it should attempt to validate the supplied address (since a server couldn't get traffic through a NAT if it just started sending packets from its new address).

janaiyengar commented 6 years ago

@MikeBishop: Thanks for the feedback. I think I might redo the PR as symmetric endpoints instead of server/client, which might resolve some issues... we did that earlier because it was easier to reason about in the earlier designs, but the current version actually lends itself nicely to symmetry, I think. At any rate, I'll give it a shot.

Separately, PMTU probing is kinda necessary before sending on the path -- we don't know what the path's MTU is. This is equivalent to the first client packet and server packets being padded to PMTU size.

I deliberately did not mention server address changes here because I did not want to conflate these two features and make this PR even bigger. It's not a big enough delta to be put off until v2 and it is super useful, but I figured I'd do that after getting client migration in. We do have #560 to track the server address change in the meanwhile.

quicwg / base-drafts

Probing and connection migration #880