tlswg / tls13-spec

TLS 1.3 Specification
562 stars 157 forks source link

Require checking CT in encrypted data #1145

Closed ekr closed 6 years ago

ekr commented 6 years ago

We don't require checking the TLSCipherText version and we don't say anything about the CT.

kaduk commented 6 years ago

I'm not 100% sure I understand the intent, but I do remember some previous discussions where the conclusion seemed to be that

ProtocolVersion legacy_record_version = 0x0303; /* TLS v1.2 */

implies that it is permitted for a peer receiving the structure to enforce that value, but not required for it to do so. And that there were valid reasons why different implementors might want to go one way or the other.

davidben commented 6 years ago

Enforcing it on the content type might be worth requiring since it's somewhat semantically meaningful, when doing the trial "decryption" step for 0-RTT reject with HRR.

cjpatton commented 6 years ago

The need to enforce checking the values of _opaquetype and _legacy_recordversion arises in a stronger-than-usual security model [1]. Allow me to explain.

Let's call the ciphertext stream the sequence of bytes read from the socket. (So we're abstracting the transport layer protocol.). TLS specifies a mechanism for parsing the ciphertext stream into a sequence of ciphertexts. Specifically, each ciphertext is delimited by a string (23, 0x0303, length), where length is the number of bytes of the next ciphertext. The process by which the stream is parsed into discrete ciphertexts is usually ignored by cryptographers; the proofs, say, for AES-GCM, say nothing about how this process affects security of the protocol that uses it. Recent papers (e.g. [1,2] ) have sought to bridge this gap by accounting for parsing of the ciphertext stream in the security model. The adversary controls delivery of the ciphertext stream; It may deliver the stream in pieces (bit by bit if it likes) or out-of-order, or it may attempt to mangle a ciphertext, or just the delimiter bits, inject a ciphertext, etc.

I recently finished an analysis of the TLS 1.3 record layer (draft 23). The paper is in submission, so I'm not sure how much I can say about it without violating anonymity. But one thing we show is that the precise details of how the stream is parsed into discrete ciphertexts is irrelevant for security as long as the delimiter bits are treated as associated data for AEAD encryption/decryption. Otherwise, security crucially depends on the details of this mechanism. Authenticating the _opaquetype, _legacy_recordversion, and length fields would allow for appealing to our general result for the record layer. However, discussions with @ekr have pointed out that this means the length of the plaintext needs to be known before encrypting, and that this may be a problem for some implementations. Since mangling the length bits necessarily leads to different inputs to AEAD decryption (and thus an inauthentic ciphertext), we recommend the following:

However, we also show that, if one leaves the associated data as is (empty), then it suffices to check that the _opaquetype and _legacy_recordversion fields match the values mandated by the spec (23 and 0x0303 respectively). Currently, checking the latter is optional, and the spec is silent about checking the former. This leads to another option:

I'm not sure what the right language is for option (2). Essentially, mangling these fields should be treated as an inauthentic ciphertext.

Barring a compelling reason not to authenticate one or both of these fields, we recommend option (1) being adopted. It makes the record layer more flexible and robust to changes to the spec. This permits a certain amount of "future-proofing" and resilience to "creative" implementations. More fundamentally, this is the security conservative choice, since these fields are, semantically, associated data [3]. However, option (2) should be adopted at a minimum.

  1. A. Boldyreva, J.P. Degabriele, K. Paterson, and M. Stam, 2013. Security of Symmetric Encryption in the Presence of Ciphertext Fragmentation. (link)
  2. M. Fischlin, F. Günther, G.A. Marson, and K. Paterson, 2017. Data Is a Stream: Security of Stream-Based Channels. (link)
  3. P. Rogaway, 2002. Authenticated-Encryption with Associated-Data. (link)
martinthomson commented 6 years ago

I believe that option 2 is in fact fairly common. It saves a little in complexity and allows us to do things to the record header that would be difficult otherwise (see DTLS 1.3 here for the variations on record header).

I look forward to seeing the paper. I'm interested in whether it addresses the problem for DTLS, where failure to remove protection results in discard and not connection termination.

davidben commented 6 years ago

I'm generally in favor of checking things on principle, and our implementation does indeed check, but this security model is puzzling to me. Is the issue that something concretely goes wrong when the value is not checked (but is ignored) or just that your security models and tools have a hard time with ignored fields.

(Not that we shouldn't try to ease verification. Like I said, I'm generally in favor of checking fields anyway. But I wish to understand your motivation clearly.)

kaduk commented 6 years ago

I'd also like to better understand the motivation. Option 2 feels a little more natural to me if we do opt to adopt one of these proposals, but they do seem "equivalent" in an intuitive sense.

cjpatton commented 6 years ago

@martinthomson, the conventional wisdom in this line of research is that the receiver should close the connection as soon as an attack is detected. Loosely speaking, our definition of ciphertext stream integrity demands that as soon as the receiver consumes a record that was not sent by the peer, it should never output a valid plaintext. However, I think that our syntax is flexible enough to account for the packet dropping semantics of DTLS, but we didn't study this. Seems interesting ... I will look into it.

cjpatton commented 6 years ago

@davidben and @kaduk, the attack is, in a sense, an artifact of the model. To define ciphertext stream integrity, one defines the channel as being in sync if, loosely, the sequence of bits consumed by the receiver is a prefix of the sequence of bits produced by the sender. The adversary wins if the channel goes out of sync and the receiver ever outputs a plaintext thereafter. The attack is as follows. Say the sender outputs a string A || Y, where A = (23, 0x303, |Y|), where "||" means string concatenation, and |Y| denotes the length of Y. The adversary transmits A' || Y to the receiver where A' = (22, 0x303, |Y|). Since A' and A differ by a bit, the channel will be deemed out-of-sync, but Y will be decrypted as normal and the receiver will output a plaintext.

Whether this constitutes a "real" attack is debatable; mangling the delimiter bits does not affect decryption, and so doesn't seem to violate integrity or privacy in an intuitive sense. But this is true only if down stream processing of the plaintext is independent of these values. Whether this holds or not depends on the implementation, and since this behavior is out-of-scope of the spec, it seems to me that this is a danger that should be considered.

ekr commented 6 years ago

I'm actually starting to warm to option 1. DTLS, unlike TLS, has diversity of headers, and so an attacker could for instance, switch between the short and long headers. I don't know of an attack on that, but it seems like it would be good to defend against it. Is there a reason other than elegance/minimal change that people would prefer #2?

On Fri, Feb 16, 2018 at 6:38 AM, Christopher Patton < notifications@github.com> wrote:

That being said, it seems to me that folks are in favor of option (2). With your permission, I'd like to put together a pull request and implement this change.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/tlswg/tls13-spec/issues/1145#issuecomment-366252443, or mute the thread https://github.com/notifications/unsubscribe-auth/ABD1oa0OnqpK_6ULBZzmgbMVLdwJeEnAks5tVZLKgaJpZM4SBHHj .

davidben commented 6 years ago

I'm basically ambivalent and only talked about (2) since I assumed (1) was already rejected due to the length issue. On a reread, I misread and missed that option (1) specifically excludes the length from there.

Even including the length would not be catastrophic. We don't actually have variable-overhead AEADs we care about, and if we use the actual record header rather than the simulated plaintext record header as in TLS 1.2 (i.e. the length of the ciphertext, not the plaintext), I find it hard to imagine it blocking even a hypothetical future variable-overhead AEAD. Such a scheme presumably would still size the output length purely as a function of input lengths and not the contents.

Though I suppose (2) means BoringSSL doesn't have to do anything because we already enforce those fields... :-)

cjpatton commented 6 years ago

The bigger (potential) issue is if the plaintext length isn't known prior to calling the AEAD encryption function. But perhaps this isn't actually a problem? If it isn't (and barring any other issues), I would favor authenticating all three.

kaduk commented 6 years ago

Is there a reason other than elegance/minimal change that people would prefer #2?

Nope. And I was thinking only of TLS and not DTLS when I formed my opinion...

davidben commented 6 years ago

If you want the plaintext length, that's probably sufficiently a nuisance that I don't think it justifies working around a limitation of the model. It's not available in, say CBC, and then you need to worry about underflow on your subtraction and everything.

chris-wood commented 6 years ago

FWIW, I would vote for option #1, despite #2 being an "easier" path forward.

cjpatton commented 6 years ago

Could you clarify, @davidben? I think you mean that the implementation SHOULD know the length of the plaintext before encrypting, so authenticating the length field is actually not a problem. Did I understand you right?

davidben commented 6 years ago

No, when encrypting, the implementation of course knows the length of the plaintext. When decrypting, recovering the length of the plaintext is a nuisance if one ever adds an AEAD that, say, rounds inputs up to some block size.

cjpatton commented 6 years ago

Oh I see what you're getting at. The problem we discussed with @ekr was more about encryption. Let me state the original proposal as a third option:

The problem is that one would need to know length before calling AEAD encryption, since length is one of its inputs; hence, one would need to know how much plaintext is being encrypted before beginning to encrypt. I can imagine option (1) would be useful if the plaintext were huge, and keeping it all in memory were impossible. But TLS records are short (2^14), so perhaps this particular use case may be ignored.

ilaril commented 6 years ago

With option (1) some implementation could just do something like (since the AD is always fixed):

let ciphertext_size = self.key.encrypt2(&_iv, &[23,3,3], InputOutputBuffer::Single(&mut buf[range.start..], opktsize + 1), 0).map_err(|_|())?;

And:

self.key.decrypt2(&_iv, &[23,3,3], InputOutputBuffer::Single(buffer, bufferlen)).map_err(|_|())?

Which obviously does nothing useful for security...

cjpatton commented 6 years ago

It sounds like folks would go for either option (1) or (3). I don't see a compelling reason to not authenticate the length field, especially in light of @davidben's points. Unless there are objections (let me know if there are!), I'll be putting together a PR for option (3). It'll be up on Friday.

davidben commented 6 years ago

It sounds like we are trading off a trivial but mildly annoying piece of implementation complexity to reduce what seems to be a fairly equally trivial but mildly annoying piece of analysis complexity, right? If something this irrelevant is actually a huge analysis problem, it honestly seems to me like the analysis tools here are still immature. We seem to all agree that "obviously" irrelevant bits of framing are indeed irrelevant, and we do indeed act on that belief everywhere else. For instance, there's TCP framing that's not authenticated. When TLS is embedded in QUIC, we don't authenticate QUIC framing within TLS. But because this framing happens to be specified in the TLS document, the requirements change?

They're both trivial, but it seems to me the multiplicative cost to implementation complexity is higher than the analysis complexity. There are many implementations of a protocol and that code must continue to be modified and worked on in the future.

davidben commented 6 years ago

More explicitly: I am not in favor of (1) or (3). It's not a strong preference, but I think I favor do nothing over them, and I don't hugely care between (2) and doing nothing.

cjpatton commented 6 years ago

Noted, thanks for clarifying :) I wouldn't characterize this has a "huge analysis problem;" something needs to be done, but option (2) suffices, and I would definitely go for minimizing implementation complexity.

davidben commented 6 years ago

Incidentally, AES-GCM, ChaCha20-Poly1305, and AES-CCM already encode the input length into the authenticated portion anyway. :-)

ekr commented 6 years ago

I think I'm still leaning towards (1), because it probably is useful to be able to check non-tampering with the DTLS framing and DTLS and TLS should be the same. It's a pretty small amount of implementation complexity IMO.

On Mon, Feb 19, 2018 at 12:52 PM, David Benjamin notifications@github.com wrote:

Incidentally, AES-GCM, ChaCha20-Poly1305, and AES-CCM already encode the input length into the authenticated portion anyway. :-)

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/tlswg/tls13-spec/issues/1145#issuecomment-366800169, or mute the thread https://github.com/notifications/unsubscribe-auth/ABD1oToQNKi4hW5p4_E8Y1GZhEDIF5k3ks5tWd8rgaJpZM4SBHHj .

martinthomson commented 6 years ago

@ekr, you mean (3), right? (1) doesn't add the length into the AAD, and it seems like if we are adding some AD, we might as well include the entire record header.