quicwg / base-drafts

Internet-Drafts that make up the base QUIC specification
https://quicwg.org
1.63k stars 204 forks source link

Stream Flow Control on Stream 0 #1252

Closed martinduke closed 6 years ago

martinduke commented 6 years ago

Perhaps all this will be obviated by ekr, etc. work on the "Crypto Stream" PR, but here goes anyway...

It bothers me that Stream 0 is not subject to stream flow control limits throughout the handshake. Subjecting it to connection flow control could absolutely have unintentional consequences, but explicit flow control for the crypto handshake is no harder than what we manage with TCP every day.

One can imagine a fairly trivial attack where a client, after getting a request for a certificate, delivers stream 0 offsets N+10 to N+UINT64_MAX, and the server is required to store it all. It is certainly possible to write some "don't accept stupid stuff" code in the server, but it would be way better to allow this control within the protocol without sketchy heuristics.

Obviously, the Initial packet would still not be subject to these limits.

I would like this tweak to appear in the new proposal. If it is worthwhile to submit a PR against the current text, I am happy to do so.

MikeBishop commented 6 years ago

The reason we don't employ stream-level flow control until the handshake completes is because you can't authenticate the flow control messages you received. This means you have to land in one of three camps:

  1. Can't grant additional flow control credit until handshake finishes. But then if the handshake needs more flow control, it deadlocks.
  2. 1-RTT flow control messages overwrite 0-RTT flow control messages. But if you never send a flow control update under 1-RTT keys, the attacker-controlled value stands.
  3. Normalize the stream flow control after the handshake completes. Closing the stream is a special case of this.

We wound up with (1), and the flow control exemption to avert the deadlock. It's not the only reasonable solution, but it's a more nuanced problem than it first appears.

martinduke commented 6 years ago

@MikeBishop Thanks for the explanation.

When we look at possible harm to a fully compliant implementation, I don't think the current text holds up well: as described above, there is no limit to the data a server must store if it requests a certificate.

The attacker you suggest is annoying, but it's a nuisance rather than fatal: 1) If the attacker tries to lower the amount of credit the client has, the client will ignore it. 2) If the attacker tries to increase the credit, it's still limited by data the client has to send. Moreover, the server will definitely drop the out of bounds data. (In yet another instance of the "stream 0 is special" list, we'd better not set FLOW_CONTROL_ERROR in this case. I'm sure how we'd signal that some stuff was not received.)

But yes, I see the need for some correction mechanism once keys are established.

nibanks commented 6 years ago

I think I agree with @martinduke here. Why not allow the flow control frames to be sent during the handshake, and then recommend (require?) that each side send an updated flow control frame in 1-RTT after the handshake completes? Then, if things don't add up, we either self correct or kill the connection.

huitema commented 6 years ago

Be careful what you ask for, because ill-behave clients could use flow controls and force servers to stall, leading to a DOS attack.

martinduke commented 6 years ago

@huitema That's true, but this is a general problem with flow control no matter what we do with the handshake. In other words, a production server implementation will have to kill stalled connections regardless, stream 0 or not. The current stream 0 language introduces a new attack that is not possible on other streams.

huitema commented 6 years ago

@martinduke sure, but during the handshake the bulk of stream zero data is sent from server to client. The Client Initial can carry at most 1200 bytes, and the second client flight is typically only a few hundred bytes. The only purpose of flow control would be to limit the amount of data sent in the server first flight -- and we already have text about only sending three packets before requesting a path check. I really wonder where the attack might be.

martinduke commented 6 years ago

As I said, the issue only applies that servers that require client certificates. I don't see how the 3-packet limit is relevant to that attack at all.

huitema commented 6 years ago

So the attack happens when a server requires a client cert, and the client sends back something humongous. Not sure that flow control is the right tool there. Feels more like "don't send me a certificate larger than X".

Also, nasty clients could use certificate compression in interesting ways, despite the flow control limit.

martinduke commented 6 years ago

I assume that TLS has some defenses against garbage cert chains - or more precisely, this is not a problem for QUIC. The specific issue is when offsets are deliberately missing -- then there's just a bunch of junk sitting in the stream 0 receive buffer.

siyengar commented 6 years ago

I think crypto stream is actually special in this case and doesn’t need flow control. Flow control only makes sense in 2 cases

  1. The transport is going to buffer something and the application will choose not to read it
  2. The sender can make a different decision based on the flow control

Tls messages are sent in records which are 16kb large. As soo. As the transport receives a message on the crypto stream it is going to read it out with the highest priority and supply it to the tls layer. This means that buffering is done at the tls later and not at the transport later. Also the tls layer only buffers 16k at a time and not all the data. Another layer of buffering happens at the asn parser for certificates. If this layer gets too big tls implementations can just close the connection unilaterally. The tls layer has a much better idea of how much is buffered than the transport layer so the transport layer should not do flow control.

Secondly what is the receiver going to do if it has 1 cert but it cannot send it? Will it choose a different smaller cert dynamically? This would be bad because it’s not based on authenticated information. I would rate this as analogous choosing the cert based on sigalgs if the sigalgs were not authenticated. So if the sender cannot make a different choice dynamically, the user of the api has to analyze the data and make a config change later.

I’m inclined to say that each side maintains their own private limit which is reasonably large and they can change it, and it you’re sending uber large Certs, don’t do that.

We could possibly have the tls layer incidicate some informational aspect of what it’s buffereing limit is to the peer but not need to track it for sending but only for later analysis

siyengar commented 6 years ago

I think for the garbage sitting in stream 0 buffers we can also have a unilateral limit that each side enforces independently

mikkelfj commented 6 years ago

That makes a lot of sense, including the previously longer mail on TLS doing the policing.

How does that relate to crypto after handshake, notably client authentication?

siyengar commented 6 years ago

Post handshake client auth is not allowed with quic iirc. As for other data after the handshake on the handshake stream that is new session ticket has only variable length data which is limited by 2^16 bytes which is small enough to be a constant overhead

martinduke commented 6 years ago

@siyengar I think you missed an important case for flow control: to limit how much has to be stored in the receive buffer until holes in the sequence number space are filled. Indeed, that's the entire point of this issue. In most implementations, buffering at the TLS layer is irrelevant if these sequence number gaps exist.

If a client cert is too large, then a well-behaved server will not provide the credits to deliver the entire thing until the first chunk arrives in its entirety so that it can be assembled and passed up to TLS. This is how things work in TCP, and I don't see that it's a hard problem modulo @MikeBishop's point about spoofed MAX_STREAM_DATA frames.

What is the point of maintaining private, independent limits? This is just flow control without explicit signaling! It can lead to many difficult questions about whether or not to ack such a packet, which implies that the data was processed when it wasn't.

siyengar commented 6 years ago

I addressed that with my second comment. I think that limits can be independent of each other even in that case.

I don’t understand what acking a packet has to do with it. I was suggesting closing the connection with an error code :) if your limit is exceeded

martinduke commented 6 years ago

OK, the independent limit is a hidden tripwire to tear down the connection. Why is that superior to explicitly using the existing flow control message to tell the client not to exceed that?

mikkelfj commented 6 years ago

OK, the independent limit is a hidden tripwire to tear down the connection. Why is that superior to explicitly using the existing flow control message to tell the client not to exceed that?

Stream 0 is not entirely trusted and you can a lot abuse before anything has been negotiated.

martinduke commented 6 years ago

We're going in circles. Please see the third message in this thread.

siyengar commented 6 years ago

Sorry i should have probably been more clear in my last comment. My point was that all these cases are related. A receiver uses flow control to limit the amount of buffering it does without processing the bytes. Let’s assume we did have a way to do dynamic flow control in cleartext. I still believe that there is no use of doing so. In the absence of an attacker there are 2 cases:

  1. Reciever can buffer more bytes than sender has. No problem, flow control doesn’t come in use here
  2. Reciever can buffer less bytes at once than the sender has to send. In this case normally a reciever would use flow control. However in the case of a large cert the receiver is baisclally buffering these bytes just in the tls buffers. At this point flow control was really facetious. So a rational receiver would set the limit to be the limit that the tls layer would be willing to buffer anyway. Ok now that the limit is the tls layers limit, if this limit is less than what the sender wants to send, the sender is screwed anyway regardless of whether there is flow control. Also since the limit is the size of the tls buffer, that is the same limit the quic stream later should enforce for out of order data. At this point there is no use of explicitly signaled flow control

What is really the problem here is that there is really a hard limit that the reciever has that if the sender cannot satisfy there is no point of negotiation. So we have to assume that people set reasonable limits and call it a day

pravb commented 6 years ago

If there are reasonable limits that are not communicated or negotiated, should there be an error code to reset the connection if that limit is exceeded?

I see that the recovery draft talks about a dedicated alarm and retransmissions for handshake data but doesn't explicitly say anything about congestion control for stream 0. Do we need a separate issue for whether congestion control should be applied to stream 0?

pravb commented 6 years ago

I suppose it is implicit that congestion control is applied to all streams since it is at the packet level, but we have special cased handshake data.

martinduke commented 6 years ago

@siyengar The specific case I am worried about is when there are missing bytes in the transport and the TLS buffers don't come into play. According to the spec, there is no bound as to what the transport must buffer on stream 0.

I agree that QUIC flow control is irrelevant when we're talking about TLS buffers.

Allowing flow control during the handshake would allow the server to limit its memory usage until the client fills in the missing bytes.

siyengar commented 6 years ago

@martinduke What I'm suggesting is that even if there is flow control, a rational receiver will never set flow control < tls buffer size at which point it doesn't really matter whether there are missing bytes in the transport buffer out of order or in order, which means that flow control is not useful.

RyanTheOptimist commented 6 years ago

I agree with martin duke. Instead of an implicit limit, we should have explicit limits. If there is no flow control at the QUIC layer, then a malicious peer will be able to consume unlimited memory in the receive buffer by sending all but the first byte on stream 0. This is clearly a non-starter so an endpoint is definitely going to impose a limit. I can't see any upside to having an implicit limit here, and only downsides.

siyengar commented 6 years ago

@RyanAtGoogle don't understand what you mean, the sender would set a limit. Setting a limit below the limit that tls imposes is useless. I think communicating that limit is fine, but flow control is the wrong tool for that job because the limit is not an independent limit but due to the reasoning I laid out it's facetious to communicate any limit < tls limit and also once the bytes are consumed only the tls layer actually knows what the limit is so the quic layer cannot reasonably update that limit

RyanTheOptimist commented 6 years ago

Not imposing a limit smaller than TLS seems reasonable. But that does not seem to me to be an argument AGAINST flow control.

siyengar commented 6 years ago

We don't need flow control to impose a limit >= TLS. If the TLS layer legitimately needed more bytes to complete, an implementation would increase the "flow control" limit to match it even for out of order bytes, at which point what use is flow control?

In the case where bytes arrived in order, flow control is giving us a facetious sense of limit, because if TLS is actually buffering the bytes and then we would advertise a larger number of bytes which potentially could go beyond the bounds of what TLS would accept.

My point is that flow control in the sense of the way it is defined for other streams is inappropriate for the crypto stream.

martinduke commented 6 years ago

Hi Subodh,

I can see from this thread that we are talking past each other. Someone (probably me) is being dense about what to worry and not worry about. If Ekr doesn't obviate the whole discussion with crypto streams, let's talk about this face-to-face in Stockholm.

siyengar commented 6 years ago

sounds great, @martinduke