moq-wg / moq-transport

draft-ietf-moq-transport

Trial parsing is not the most efficient #474

Open huitema opened 1 week ago

huitema commented 1 week ago

MoQ transport messages are encoded as a combination of varints, bit strings, and TLV objects. In the absence of a "header length" field, applications have to resort to trial parsing: accumulate some incoming bytes from the stream, try to parse those bytes as a message, and if that fails, accumulate more bytes and retry. This may not be a huge problem in normal operation: the peer endpoint will typically compose a message and send its bytes all at once, and the QUIC stack will deliver them all at once. Even so, there are cases where the QUIC stack cannot send them in a single STREAM frame, so the "accumulate and try" logic has to be present.
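
As a minimal sketch of that accumulate-and-retry loop (hypothetical types and a placeholder `parse_message`, just to illustrate the shape, not any particular implementation):

```rust
// Sketch of trial parsing: buffer stream bytes and retry the full
// parse whenever more data arrives. Names are placeholders.
enum ParseOutcome {
    Done(usize),   // a full message was parsed, consuming this many bytes
    NeedMoreData,  // ran out of bytes mid-message; buffer and retry later
    Error,         // malformed message
}

fn on_stream_data(buf: &mut Vec<u8>, incoming: &[u8]) -> ParseOutcome {
    buf.extend_from_slice(incoming);
    // Re-parse the accumulated buffer from the start on every arrival;
    // this repeated work is what this issue is about.
    parse_message(buf)
}

// Placeholder standing in for the actual MoQT message parser.
fn parse_message(_buf: &[u8]) -> ParseOutcome {
    ParseOutcome::NeedMoreData
}
```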

Suppose a peer sends its data one byte at a time, because it is either malicious or buggy. The length of the message cannot be computed before all components are parsed: the first byte of each varint defines its length, the length field of bit or byte strings needs to be parsed in full, and the number of parameters or versions needs to be known. To obtain the length of a component, all components before it need to be parsed, which means the number of trial parses is typically a function of the number of components -- maybe an O(N^2) function, since the cost of each trial depends on the number of components before it. That number can be very large for messages including parameters, because the number of parameters cannot be known in advance -- the peer could send an arbitrary number of them under the guise of "greasing". Parsing is a somewhat CPU-intensive operation, which implies that trial parsing can be exploited for a mild DoS attack.
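
To make the dependency concrete, a small sketch of the QUIC-style varint length rule (illustrative only, not draft text):

```rust
// QUIC-style variable-length integers: the two most significant bits of
// the first byte give the total encoded length (1, 2, 4, or 8 bytes).
fn varint_len(first_byte: u8) -> usize {
    1usize << (first_byte >> 6)
}

fn main() {
    // Until the first byte of a field arrives, even that field's own
    // length is unknown, so each extra component can force another
    // round of trial parsing on a trickled stream.
    assert_eq!(varint_len(0x25), 1); // 0b00...... => 1-byte varint
    assert_eq!(varint_len(0x7b), 2); // 0b01...... => 2-byte varint
    assert_eq!(varint_len(0x9d), 4); // 0b10...... => 4-byte varint
    assert_eq!(varint_len(0xc2), 8); // 0b11...... => 8-byte varint
}
```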

Prepending a length field to messages would mitigate trial parsing, but I am not sure that we need to fix this. Limits on message size would impose a de-facto limit on the processing cost of each message, but a malicious attacker could still send a series of short messages such as "subscribe updates" to trigger the same CPU consumption. In typical QUIC stacks, the main CPU load comes from passing messages through the OS/app boundary in socket calls, and an attacker could DoS the target by sending a series of short messages. The next largest component is the cost of encryption and decryption, compared to which the cost of trial parsing is probably small. We may decide to do nothing, but in that case it would be nice to document the reasoning (as in this issue), and maybe discuss in the security section why the attack does not matter.

Addressing the "limits" issue (#473) would also somewhat mitigate the problem, as limiting the "number of parameters" appears to be the main way to increase the number of components in messages.

kixelated commented 6 days ago

MoQ's message encoding is similar to QUIC's, but the difference is that QUIC datagrams have a finite size and are not susceptible to a trickle attack.

I've implemented two types of MoQ parsers:

  1. Async state machine.
  2. Trial parser.

The async parser is easy in some languages (JavaScript, Go). There could be excessive wake-ups if the stream is written one byte at a time, but I think that's a fundamental QUIC problem (userspace wakeups).

My trial parser (Rust) returns a More(min_bytes_needed) error to help mitigate this attack. You can return the remaining size of a partially parsed field (e.g. a string or varint whose length is already known) in addition to the minimum size of any remaining fields (usually 1 byte).
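
A sketch of that shape (hypothetical types, not the actual parser):

```rust
// Trial-parse result that reports how many more bytes are needed
// before a retry is worthwhile. Names are placeholders.
enum Decode<T> {
    Ok(T, usize), // decoded value plus bytes consumed
    More(usize),  // need at least this many additional bytes
    Err,          // malformed input
}

fn decode_varint(buf: &[u8]) -> Decode<u64> {
    let Some(&first) = buf.first() else {
        return Decode::More(1); // minimum: one byte to learn the length
    };
    let len = 1usize << (first >> 6);
    if buf.len() < len {
        // once the first byte is seen, the exact shortfall is known
        return Decode::More(len - buf.len());
    }
    let mut value = (first & 0x3f) as u64;
    for byte in &buf[1..len] {
        value = (value << 8) | *byte as u64;
    }
    Decode::Ok(value, len)
}
```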

You could implement a state machine manually too, but a trial parser will be fine if you don't allocate on the heap.

I'm on the fence about whether we should length-prefix or not. I've implemented both and think streaming is easier in general across multiple languages, but I do think this decoding gotcha is real.

huitema commented 6 days ago

I wrote a trial parser in C that also returns the min number of additional bytes needed, like your Rust parser, but there can be a big difference between the min number and the max number: for a varint, 1 byte versus 8 bytes; for a TLV, 2 bytes for 2 varints versus 8+8+length. That's how I came to the estimate O(number_of_components^2).

I am also on the fence, as I said in the top post. The simplest mitigation would be to limit the number of parameters allowed in Setup or Subscribe messages. Currently, those lists contain one or two values, but with greasing the max is effectively unlimited.

LPardue commented 6 days ago

This sounds like what HTTP/3 frame and stream parsing already has to deal with, or am I overlooking something unique to MoQT?

suhasHere commented 6 days ago

I would like to put a limit on the number of parameters per version. For draft-04 it is 2. If future drafts add more, we can increase the max as needed. Anyone sending more than the max for a given version would trigger a protocol error.
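
Something like this check before parsing any parameters (a sketch; the constant name and error text are placeholders, not draft text):

```rust
// Hypothetical per-version cap on the parameter count, enforced before
// any parameters are parsed.
const MAX_PARAMS_DRAFT_04: u64 = 2; // the limit suggested above for draft-04

fn check_param_count(num_params: u64) -> Result<(), &'static str> {
    if num_params > MAX_PARAMS_DRAFT_04 {
        // exceeding the per-version max is treated as a protocol error
        return Err("protocol violation: too many parameters");
    }
    Ok(())
}
```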

afrind commented 6 days ago

> This sounds like what HTTP/3 frame and stream parsing already has to deal with, or am I overlooking something unique to MoQT?

H3 has an upfront frame length, so you can avoid dribbling byte/reparsing attacks

LPardue commented 6 days ago

OK, I think this is a real problem then. +1 to fixing it