Connection Migration Failure when Path MTU is asymmetric

marten-seemann commented 4 years ago

This issue is somewhat similar to the issue with asymmetric path MTUs that @kazuho discovered in #4183.

Consider the following situation: Client and server are communicating on one path. The client now probes a new path by sending a PATH_CHALLENGE frame on the new path. The server responds with a PATH_RESPONSE frame, but sends it on the old path. The client now considers the new path validated, and switches over to that path. If the PMTU in the direction from server to client is now smaller than 1200 bytes, the client will never receive a packet from the server and the connection will time out. Even worse, if only small packets (for example ACK-only packets) make it through, this might delay the idle timeout indefinitely.

While we could require the client to pad the packet containing the PATH_CHALLENGE to 1200 bytes, this still doesn't help us make sure that the return path also supports this packet size.
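A minimal sketch of the asymmetry described above, assuming a hypothetical `Path` model with a separate MTU per direction. The MTU values and the 40-byte ACK-only size are made-up illustration numbers, not measurements.

```python
from dataclasses import dataclass

MIN_QUIC_DATAGRAM = 1200  # bytes; the minimum size discussed in this thread


@dataclass(frozen=True)
class Path:
    """Hypothetical model: one network path with a separate MTU per direction."""
    name: str
    mtu_client_to_server: int
    mtu_server_to_client: int

    def delivers(self, direction: str, size: int) -> bool:
        """Return True if a datagram of `size` bytes fits in the given direction."""
        if direction == "c2s":
            return size <= self.mtu_client_to_server
        if direction == "s2c":
            return size <= self.mtu_server_to_client
        raise ValueError(f"unknown direction: {direction!r}")


# Assumed numbers, chosen only for illustration.
new_path = Path("new", mtu_client_to_server=1500, mtu_server_to_client=1000)

# A padded PATH_CHALLENGE from the client fits in the forward direction...
padded_challenge_arrives = new_path.delivers("c2s", MIN_QUIC_DATAGRAM)
# ...and a small ACK-only packet fits in the return direction...
ack_only_arrives = new_path.delivers("s2c", 40)
# ...but a full-sized packet from the server does not.
full_size_data_arrives = new_path.delivers("s2c", MIN_QUIC_DATAGRAM)

print(padded_challenge_arrives, ack_only_arrives, full_size_data_arrives)
```

In this model, padding the client's PATH_CHALLENGE only exercises the `c2s` direction, which is why it says nothing about the return path.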

kazuho commented 4 years ago

This is another interesting issue.

In a nutshell, post-handshake path validation should confirm that PMTU is no less than 1200 bytes in both directions. But currently we do not require (or recommend) that the probe packets should be padded. Moreover, having such a requirement does not address the problem, because an endpoint is not required to send PATH_RESPONSE on the path on which it received a PATH_CHALLENGE.

gloinul commented 4 years ago

So this issue is similar to #4183, with the added tweak that in migration each peer's verification is independent of the other's. As currently specified, the client will only know that its send path is working. Only after the server has performed its own PATH_CHALLENGE to the client for the new path will it know that the path is actually working.

Looking at how path validation is currently defined, with the addition of requiring padding to 1200 bytes, the result for this case is that the new path will not validate for the server, as none of the server's path challenges will arrive at the client and be echoed. Thus the server will declare the path as failed after max(3*PTO, 6*kInitialRtt). So the client will be able to use the new forward path, but unless the old server->client path works the QUIC connection will fail. So I think that with a PADDING requirement the slow or indefinite death of the connection will not occur.

Looking at how connection migration is defined, the client will verify the path and attempt to migrate the connection by sending non-probe traffic. If the server does not send any non-probe traffic on the new interface until it has done path verification with 1200 bytes, the connection migration will fail and the client will never see the server switch to using the new client source address for its communication.

Thus, I think the only thing needed to make this fail consistently is to require padding of the path probes. Possibly one needs to clarify that if one sends ACKs outside of path probes that are padded, then one can end up in this state if the return path is not MTU capable.
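As a worked example of the failure timer mentioned above, here is a small sketch that evaluates max(3*PTO, 6*kInitialRtt) in milliseconds. The function name is hypothetical, and the 333 ms default for kInitialRtt is an assumption taken from the recovery document's recommended value.

```python
K_INITIAL_RTT_MS = 333  # assumed default; recommended kInitialRtt in the recovery document


def path_validation_timeout_ms(pto_ms: int, k_initial_rtt_ms: int = K_INITIAL_RTT_MS) -> int:
    """Evaluate max(3*PTO, 6*kInitialRtt), the expression quoted in this comment."""
    return max(3 * pto_ms, 6 * k_initial_rtt_ms)


# With a short PTO the kInitialRtt term dominates; with a long PTO the 3*PTO term does.
short_pto_timeout = path_validation_timeout_ms(100)
long_pto_timeout = path_validation_timeout_ms(1000)
print(short_pto_timeout, long_pto_timeout)
```

Under these assumed numbers the server would give up on the new path after roughly two to three seconds, which bounds how long the broken state can last.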

ianswett commented 4 years ago

I don't believe this is a PathMTU specific issue. The path could work perfectly in one direction and not in the opposite direction if the PATH_RESPONSE is sent on the old path.

I thought this was a known issue for those implementing connection migration? @ekinnear

larseggert commented 4 years ago

I never understood why a PATH_RESPONSE was OK to send on the old path. Given that the client clearly wants the server to start using the new path (= a different IP address and port) it seems like the server has a really strong motivation to check that the client can actually receive stuff it sends there.

gloinul commented 4 years ago

@ianswett agreed, the MTU is only one reason. But the MTU can be verified to avoid failure here.

@larseggert I think sending it back on the return path would avoid some failure cases. However, the peer endpoint still needs to do its own path verification. So do you think one should pad the QUIC packet carrying the PATH_RESPONSE, to verify the MTU on the return path for the endpoint requesting the path validation?

larseggert commented 4 years ago

Why would the peer need to do another path validation? It is still sending to the same endpoint.

nibanks commented 4 years ago

I agree that it seems all PATH_CHALLENGE packets should be padded. I also cannot remember why it was allowed to respond with a PATH_RESPONSE on a different path than the one challenged. I assume because path challenge/response is just treated as a unidirectional validation mechanism.

One problem with padding PATH_CHALLENGE packets is in the case of rebinding. If the first/only packet you receive after rebinding is small enough, amplification protection would prevent you from sending a fully padded challenge packet back in response.
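The rebinding problem can be sketched with a simple budget check. This assumes the anti-amplification rule that an endpoint may send at most three times the bytes it has received from an unvalidated address; the function and parameter names are hypothetical.

```python
AMPLIFICATION_FACTOR = 3   # assumed limit: send at most 3x the bytes received
MIN_QUIC_DATAGRAM = 1200   # bytes; size of a fully padded challenge


def can_send_to_unvalidated(bytes_received: int, bytes_sent: int, datagram_size: int) -> bool:
    """Return True if sending `datagram_size` more bytes stays within the limit."""
    return bytes_sent + datagram_size <= AMPLIFICATION_FACTOR * bytes_received


# A single small packet after rebinding does not buy enough budget
# for a fully padded PATH_CHALLENGE in response.
after_small_packet = can_send_to_unvalidated(100, 0, MIN_QUIC_DATAGRAM)
# A 400-byte packet is the smallest that does, since 3 * 400 = 1200.
after_400_bytes = can_send_to_unvalidated(400, 0, MIN_QUIC_DATAGRAM)
print(after_small_packet, after_400_bytes)
```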

gloinul commented 4 years ago

@larseggert because otherwise a malicious client can fake it. If the client sends the server a PATH_CHALLENGE to probe the path, it can spoof the source address, so the server can't rely on it. Therefore the server needs to send its own PATH_CHALLENGE and get a matching response before it has validated that path for transmission. So that requirement will not go away.

larseggert commented 4 years ago

The migrating endpoint doesn't send a PATH_CHALLENGE, it just sends data (or whatever) on the new path. The other end sends the challenge, and the migrating endpoint sends the response. (In the case of a NAT, the migrating endpoint may not even know it is migrating, because a NAT rebinding might cause the path validation.)

gloinul commented 4 years ago

@lars that is not the only way. Your way will occur when one has a NAT rebinding, for example, but a client can also test a path prior to migrating to it. So apparently we are considering two different cases. And in your case the client-to-server path will not be probed by the client. Thus it becomes a question of whether the client needs to probe with a 1200+ byte packet, or can rely on the ACKing of the server's PATH_RESPONSE padded to 1200+ bytes for MTU verification.

larseggert commented 4 years ago

The test-before-migrate is a MAY (Section 9.1) and Section 9.2 is pretty clear that a migration is initiated by sending non-probing frames and that a migrating endpoint can do that without first needing to validate the path.

gloinul commented 4 years ago

I will note that the client is expected to do a path validation with the server when it acts on the server's preferred address. Section 9.6.1: https://quicwg.org/base-drafts/draft-ietf-quic-transport.html#name-communicating-a-preferred-a

A client might need to perform a connection migration before it has migrated to the server's preferred address. In this case, the client SHOULD perform path validation to both the original and preferred server address from the client's new address concurrently.

So that case exists here as well. The point is that we actually have two cases, and both need to be considered when discussing this.

larseggert commented 4 years ago

Yes, that migration to the preferred address trips me up, because it reverses the path validation. I don't actually know if anyone has implemented this at all yet.

martinthomson commented 4 years ago

I think that this is remedied in the same way as we have elsewhere: we can recommend that endpoints pad probing packets to at least 1200 bytes. If any packet containing PATH_CHALLENGE is never less than 1200 bytes, the path will never be valid without the MTU also being sufficient.
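A rough sketch of what padding a probing packet could look like at the frame level. It assumes the frame type values 0x1a for PATH_CHALLENGE and 0x00 for PADDING, and it takes the packet header plus AEAD tag size as a caller-supplied `overhead`, which is a simplification. `build_probe_payload` is a hypothetical helper, not an API of any QUIC library.

```python
import os
from typing import Optional

PATH_CHALLENGE_TYPE = 0x1A  # assumed frame type for PATH_CHALLENGE
PADDING_TYPE = 0x00         # assumed frame type for PADDING (one byte per frame)
MIN_QUIC_DATAGRAM = 1200    # bytes; target datagram size for probes


def build_probe_payload(overhead: int, data: Optional[bytes] = None) -> bytes:
    """Build a PATH_CHALLENGE frame followed by PADDING frames.

    `overhead` is the number of bytes the caller expects to add around the
    payload (packet header plus AEAD tag); it is an input here, not computed.
    """
    if data is None:
        data = os.urandom(8)  # PATH_CHALLENGE carries 8 bytes of unpredictable data
    if len(data) != 8:
        raise ValueError("PATH_CHALLENGE data must be exactly 8 bytes")
    frame = bytes([PATH_CHALLENGE_TYPE]) + data
    pad_len = max(0, MIN_QUIC_DATAGRAM - overhead - len(frame))
    return frame + bytes([PADDING_TYPE]) * pad_len


payload = build_probe_payload(overhead=40)
print(len(payload) + 40)  # total datagram size under the assumed overhead
```

The point of the design is that a challenge built this way can only be answered if the path carried at least 1200 bytes in the direction it was sent.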

Rebinding will allow moving to a path that is not validated, but I expect that not to be a significant risk. It seems equally likely that the path MTU would spontaneously reduce.

And then there is the period where an unvalidated path is used, separate from the rebinding case. Nothing much to be done for that, but there is a firm limit on that, because a path is abandoned when validation fails.

kazuho commented 4 years ago

@martinthomson I'm not sure if I agree. I would write down the steps that I think would lead to a problem even when both endpoints pad their probe packets. Would you mind pointing out where I'm wrong?

step 1: client sends a full-sized PATH_CHALLENGE on a new path
step 2: server receives that packet, sends a full-sized PATH_RESPONSE on the old path
step 3: client receives PATH_RESPONSE, and considers the path validated
step 4: (for some reason; e.g. client sending non-probe packets) server tries to validate the new path, by sending a full-sized PATH_CHALLENGE on the new path
step 5: client never receives the PATH_CHALLENGE because PMTU in the server -> client direction was sub-1200
step 6: server fails to validate the path
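The six steps above can be replayed with a small simulation. This is only a sketch: the MTU table is an assumed setup in which only the new path's server-to-client direction is below 1200 bytes, and `delivered` and `run_scenario` are hypothetical helpers.

```python
MIN_QUIC_DATAGRAM = 1200  # bytes; size of a "full-sized" probe in this sketch

# Assumed MTUs per (path, direction); only new server->client is below 1200.
MTU = {
    ("old", "c2s"): 1500,
    ("old", "s2c"): 1500,
    ("new", "c2s"): 1500,
    ("new", "s2c"): 1000,
}


def delivered(path: str, direction: str, size: int = MIN_QUIC_DATAGRAM) -> bool:
    """Return True if a datagram of `size` bytes fits on this path and direction."""
    return size <= MTU[(path, direction)]


def run_scenario() -> tuple:
    """Replay steps 1-6 and return (client_validated, server_validated)."""
    # Step 1: client sends a full-sized PATH_CHALLENGE on the new path.
    challenge_reaches_server = delivered("new", "c2s")
    # Step 2: server answers with a full-sized PATH_RESPONSE on the OLD path.
    response_reaches_client = challenge_reaches_server and delivered("old", "s2c")
    # Step 3: client treats the new path as validated.
    client_validated = response_reaches_client
    # Step 4: server sends its own full-sized PATH_CHALLENGE on the new path.
    # Step 5: it is dropped because the new server->client MTU is below 1200.
    server_challenge_reaches_client = delivered("new", "s2c")
    # Step 6: without an echoed response, the server cannot validate the path.
    server_validated = server_challenge_reaches_client and delivered("new", "c2s")
    return client_validated, server_validated


client_validated, server_validated = run_scenario()
print(f"client validated: {client_validated}, server validated: {server_validated}")
```

Under these assumptions the two endpoints end up disagreeing about the new path, which is the failure mode the steps describe.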

martinthomson commented 4 years ago

@janaiyengar and I were just discussing the same failure mode. Note also that the PATH_RESPONSE can be on the old path as well, so recommending padding for that won't work.

I think that the padding idea is helpful, but not completely good. I'll explain more in a longer posting.

martinthomson commented 4 years ago

The reason that padding doesn't work is that the PATH_RESPONSE can be small or follow a different path. At which point the client thinks that the path is good. If the client migrates, the server will attempt to validate that path. If the server pads to 1200, then its validation will fail.

I think that the server switches back to the old path at this point, which might have no effect. The client might have closed the socket by then.

At this point, the client continues in its use of the new, broken path. The server either keeps attempting to validate that new path unsuccessfully (at an interval determined by how often it retries path validation), or just gives up. If the server keeps going, you get into a weird state where the connection can stutter along. Either way, the connection is no longer useful.

This is slightly faster than the case where PATH_CHALLENGE is not padded in any way. There, you might end up with a path that appears to be good, until you need to send more data. That might take some time. At least this way you get a path validation failure within ~3 PTO.

Padding is also an improvement in that it detects paths that have an MTU that is uniformly less than 1200.

Without revisiting the underlying design, I think that we just have to accept this limitation. And I guess we should acknowledge it - I'll add some text on that point.

janaiyengar commented 4 years ago

I agree. I can't think of a way around the limitations here without redesigning the core design. I would suggest moving forward with padding PATH_CHALLENGEs and articulating the potential risks here.

Connection migration remains a gift that keeps on giving, and I would expect it to give more as we experiment with it in the field. Let's not lose ourselves in trying to perfect the design of a feature that we are going to learn a lot more about as we deploy QUICv1.

kazuho commented 4 years ago

+1 to not doing a redesign in v1.

That said, I tend to think that it might be possible to fix this issue with small tweaks. Specifically, I wonder if stating the following would address the problem:

- a sender MUST pad PATH_CHALLENGE
- when receiving PATH_CHALLENGE, an endpoint MUST send a padded PATH_RESPONSE using the path on which PATH_CHALLENGE was received
- receiving PATH_RESPONSE on any path is valid (this is an existing requirement)

The reason we have the last bullet point is to prevent issues caused by MOTS attackers racing packets. We have to keep it. However, that does not necessarily mean that the sender of PATH_RESPONSE should be entitled to send PATH_RESPONSE on any path. If we tighten the requirement on the sender of PATH_RESPONSE, I think that the issue might go away, without causing any regression?
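The three rules above could be sketched as follows. The `Datagram` type and the helper names are hypothetical, and packet sizes are modeled as plain integers rather than real encodings.

```python
from dataclasses import dataclass

MIN_QUIC_DATAGRAM = 1200  # bytes; padded size for both challenge and response


@dataclass(frozen=True)
class Datagram:
    """Hypothetical record of one sent datagram."""
    path: str    # label of the path the datagram is sent on
    frame: str   # "PATH_CHALLENGE" or "PATH_RESPONSE"
    data: bytes  # the 8-byte challenge payload being sent or echoed
    size: int    # datagram size in bytes


def send_challenge(path: str, data: bytes) -> Datagram:
    """Rule 1: the sender pads PATH_CHALLENGE."""
    return Datagram(path, "PATH_CHALLENGE", data, MIN_QUIC_DATAGRAM)


def respond(challenge: Datagram) -> Datagram:
    """Rule 2: reply with a padded PATH_RESPONSE on the path the challenge arrived on."""
    return Datagram(challenge.path, "PATH_RESPONSE", challenge.data, MIN_QUIC_DATAGRAM)


def accept_response(outstanding: set, response: Datagram) -> bool:
    """Rule 3: a PATH_RESPONSE is valid on any path, as long as the data matches."""
    return response.frame == "PATH_RESPONSE" and response.data in outstanding


challenge = send_challenge("new", b"\x01" * 8)
response = respond(challenge)
print(response.path, response.size, accept_response({challenge.data}, response))
```

Note that `accept_response` never inspects `response.path`, so the receiver cannot tell whether the sender followed rule 2; the sketch keeps that property of the proposal on purpose.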

janaiyengar commented 4 years ago

@kazuho: I was literally writing exactly this out, modulo that last point, which I think is a very good addition. This is a small enough tweak even to the text that I think we can pull this off with a minor change.

Specifically, the text that will need to change (in addition to requiring padding PATH_RESPONSE) is: “A PATH_RESPONSE frame does not need to be sent on the network path where the PATH_CHALLENGE was received; a PATH_RESPONSE can be sent on any network path.”

We changed that when we changed validation to be unidirectional. We can retain that property of path validation, but still require PATH_RESPONSE to be sent to the same address from which the PATH_CHALLENGE was received.

ianswett commented 4 years ago

I think @kazuho's suggestion is sensible and simple enough. I will note that the second MUST is unenforceable given the third point, but I think that's ok.

larseggert commented 4 years ago

Labeling this as "design" per #4241

gloinul commented 4 years ago

Simply noting that I think the proposal in #4241 appears to be an okay resolution. I would recommend that people also consider #4188, so that the two are aligned.

quicwg / base-drafts

Connection Migration Failure when Path MTU is asymmetric #4216