Looping with multiple Retry packets

martinthomson commented 6 years ago

Allowing multiple Retry packets creates a potential for regression of the address validation tokens.

Say that a client retransmits its first Initial packet without a token. The server responds to both with the same token. The second of these packets is delayed.

The client receives the first, sends another Initial packet and receives a second token in response.

Then the Retry that the server sent in response to the retransmission of the first Initial is received. The client switches to that token as though it were new, but it's back to the first token.

If the server relies on multiple Retry packets and the progressive validation of the address using those tokens, then this will revert any progress that was made. Because Retry can't be sent indefinitely (it has this arbitrary limit of 3 changes), this might cause that connection to fail.
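The sequence above can be replayed with a small sketch. It is an illustration only, not taken from any implementation: `NaiveClient`, `on_retry`, and the token values `T1`/`T2` are hypothetical names, and the limit of 3 is modeled simply as a counter of Retry packets processed.

```python
from dataclasses import dataclass, field
from typing import List

MAX_RETRIES = 3  # assumption: models the "arbitrary limit of 3" as a simple counter


@dataclass
class NaiveClient:
    """Hypothetical client that adopts the token from every Retry it receives."""
    token: bytes = b""
    retries_seen: int = 0
    history: List[bytes] = field(default_factory=list)

    def on_retry(self, token: bytes) -> bool:
        """Switch to the received token; return False once the limit is exceeded."""
        self.retries_seen += 1
        if self.retries_seen > MAX_RETRIES:
            return False  # the connection attempt would fail here
        self.token = token
        self.history.append(token)
        return True


# Arrival order from the scenario: the Retry for the first Initial (T1),
# the Retry for the Initial that carried T1 (T2), then the delayed Retry
# for the retransmitted first Initial (T1 again).
client = NaiveClient()
for arriving_token in (b"T1", b"T2", b"T1"):
    client.on_retry(arriving_token)

# The client made progress to T2 and then reverted to T1.
regressed = client.token == b"T1" and b"T2" in client.history
# All three permitted Retry packets are now used up.
budget_exhausted = client.retries_seen >= MAX_RETRIES
```

In this sketch the client ends up holding `T1` again with its Retry allowance spent, which is the regression described above.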

kazuho commented 6 years ago

@mikkelfj:

> If you are in a cloud hosted setup (like Azure or Digital Ocean) the servers are operated by customers while load balancers and DDoS mitigation devices might be operated by the cloud provider. Requiring shared secrets here is messy.

Thank you for pointing that out. I must say that it is a compelling argument. It's the use case where a network operator wants to protect a server administered by a different person or entity.

Though I am not sure about the security properties.

nibanks commented 6 years ago

The case that @mikkelfj points out is exactly the case we are trying to solve for Azure. Azure owns the DDoS mitigation devices and load balancers, while the 3rd party owns the QUIC server, which could be from any implementation.

I agree we should look at the security properties and enumerate all the threats/attacks this design could expose. Then it's a matter of weighing the impact of those threats vs the cost/complexity if we decided to fix them. Personally, I haven't seen an attack that would really benefit a middle box any more than any other handshake disruption tactic.

MikeBishop commented 6 years ago

While I can't claim it's a committed product plan to do exactly this, Akamai has a DDoS mitigation product that I could envision working this way when facing QUIC traffic. I don't know what the token format would look like in this case.

kazuho commented 6 years ago

@nibanks @MikeBishop It's good to know that we have interest in such deployments. Thank you.

> I agree we should look at the security properties and enumerate all the threats/attacks this design could expose. Then it's a matter of weighing the impact of those threats vs the cost/complexity if we decided to fix them. Personally, I haven't seen an attack that would really benefit a middle box any more than any other handshake disruption tactic.

The issue with simply allowing the existence of an uncoordinated middlebox is that it becomes impossible for any server to detect somebody on-path altering the handshake traffic.

For example, a middlebox can alter the server CID by sending a Retry, and the server will not notice the alteration if the middlebox also drops the token field of the 2nd Initial packet that the client sends through the middlebox to the server.

While I understand that you might not care about this issue in the deployments that you are interested in, I think that others would be worried about the possible impact on security as well as the ossification concern, including the one that I have described in https://github.com/quicwg/base-drafts/issues/1451#issuecomment-398982420.

Fortunately, there are ways to define a signal for detecting tampering that can be implemented by server operators who do not have uncoordinated DoS mitigation devices.

One way is to add an "Original_DCID" field to Transport Parameters, and state that "a server SHOULD check that the value of the Original_DCID field matches that of the packet that it saw in the first packet that belonged to the connection". Servers running behind an uncoordinated middlebox will turn this check off.
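A minimal sketch of that check, assuming the server remembers the DCID of the first packet it saw for the connection and that the client echoes its original DCID in a transport parameter. The function name `original_dcid_ok` and the `enforce` knob are hypothetical, used here only for illustration.

```python
def original_dcid_ok(first_seen_dcid: bytes,
                     original_dcid_param: bytes,
                     enforce: bool = True) -> bool:
    """Return True if the handshake may proceed.

    first_seen_dcid:     DCID of the first packet the server saw for this connection.
    original_dcid_param: value of the proposed Original_DCID transport parameter.
    enforce:             hypothetical knob; servers behind an uncoordinated
                         middlebox would set this to False.
    """
    if not enforce:
        return True
    return original_dcid_param == first_seen_dcid


client_original = bytes.fromhex("aa" * 8)   # DCID the client chose for its first Initial
middlebox_chosen = bytes.fromhex("bb" * 8)  # CID substituted via a middlebox Retry

# No middlebox: the server saw the client's original DCID.
direct = original_dcid_ok(client_original, client_original)

# A middlebox sent a Retry and stripped the token: the server first saw the
# substituted CID, so the echoed Original_DCID does not match.
tampered = original_dcid_ok(middlebox_chosen, client_original)

# Same situation with the check turned off by the operator.
tolerated = original_dcid_ok(middlebox_chosen, client_original, enforce=False)
```

Under these assumptions `direct` and `tolerated` pass, while `tampered` fails and signals the on-path alteration.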

Note that having a configuration knob is mandatory for servers running behind such a middlebox, even if we do not introduce the "Original_DCID" field. This is because Retry is version-specific (which means that uncoordinated DoS mitigation devices might need to send a Version Negotiation packet). To support that, the servers need to have a knob that changes how the downgrade protection logic works (FWIW, end-to-end version downgrade protection is currently a MUST; we need to change it as well to allow the existence of uncoordinated DoS mitigation devices).

martinthomson commented 6 years ago

Closed by #1498.
