jedisct1 / libsodium

A modern, portable, easy to use crypto library.
https://libsodium.org
Other
12.26k stars 1.74k forks source link

All implementations of `stream_ietf_ref_xor_ic` may produce incorrect output for large `ic` #753

Closed stouset closed 6 years ago

stouset commented 6 years ago

I discovered this while independently trying to implement vectorized ChaCha20, running into a conceptual problem, and looking to see how libsodium's implementations handle this. To my great surprise, they don't!

The core problem is that DJB ChaCha20 uses a 64-bit counter and the IETF version uses a 32-bit counter. All implementations (ref, dolbeau) implicitly treat this counter as 64 bits.

The docs do say that the IETF variant is limited to 256GB (equivalent to 2^32 64-bit blocks). In the crypto_stream_chacha20_ietf case, this limit is not enforced (it's compared to crypto_stream_chacha20_MESSAGEBYTES_MAX, which is generally set to UINT64_MAX). In the crypto_stream_chacha20_ietf_xor_ic case, there's a much easier opportunity for misuse: if ic + (mlen / 64) exceeds 2^32 (even if mlen by itself is well below the documented limit of 256GB), the counter will overflow and be treated as if it were 64-bits, producing incorrect output.

First, the documentation should be improved to point out the limitation in the stream_ietf_ref_xor_ic call. Second, the code implementing the IETF versions should explicitly check that ic + (mlen / 64) does not overflow the 32-bit counter.

jedisct1 commented 6 years ago

Hi Stefen,

As the name implies _ic functions set an internal counter. That value is not additional data, not a secret nonce, not a message id as in libhydrogen either. It should only be used to seek within the limits of what stream_xor() would safely encrypt.

stouset commented 6 years ago

I agree! It's used to set the initial value of the counter, so, you can e.g., start decrypting a message somewhere other than the beginning.

However, it needs to be clearer in the docs that mlen / 64 + ic must never exceed 2^32 or incorrect output will be returned. And it would be significantly better if the _ietf variants would explicitly indicate an overflow of ic to the user through a call to sodium_misuse(). Quietly emitting incorrectly-encrypted data is at best an interoperability problem, and at worst can result in catastrophically leaking plaintexts.

The latter can happen if users use counter-based nonces. Given two messages encrypted with nonces nonce_0 and nonce_1, if stream_ietf_ref_xor_ic is called with nonce_0 and ic, mlen such that an overflow happens, the keystream from nonce_1 is repeated.

This isn't necessarily surprising if you understand how it's implemented under the covers, and the documentation warns against exceeding 256GB of plaintext, but doesn't warn that the maximum acceptable value of mlen is decreased as ic increases.

To make matters worse, the docs indicate that ic is a 64-bit counter, when it should be (and thankfully is, in code) a 32-bit counter.

image

jedisct1 commented 6 years ago

Prototype fixed in the documentation, thanks!

It's perfectly fine to overflow the IV in the xchacha20 and secretstream constructions, since the probability of a collision in the scenario you describe is negligible.

So, really, if you are using the _ic functions, you are on your own. This is even lower-level than the low-level API, and these functions should probably completely go away at some point, at least from the public interface.

[edit: are -> is]

stouset commented 6 years ago

Even then, the non-ic _ietf functions should call sodium_misuse() when called with mlen > 512 * 2^32 (actually, slightly less: 247,877,906,880 bytes according to the RFC).

And I don't think it's fine to overflow the IC in those situations, since it will cause silent future interoperability problems.

jedisct1 commented 6 years ago

Would you like to take over the project?

(serious offer)

jedisct1 commented 6 years ago

I tried to make the changes, and it opened a can of worm.

Unless we want to make the code significantly more shitty than it currently is, I'd rather document that the counter overflows into the IV.

jedisct1 commented 6 years ago

Fixed but I utterly hate that change.

stouset commented 6 years ago

Sorry, I was away from my laptop for the weekend. I would have actually been more than happy to help, apologies I wasn't able to do so before you got around to it.

I sadly agree that the current implementation makes implementing this change super shitty. :(

stouset commented 6 years ago

One possible alternate approach to the current one is to allow overflowing the counter, but change chacha20_encrypt_bytes so that it returns a boolean to indicate whether or not overflow happened. The IETF variant can then call sodium_misuse() after the fact, which may be a less disruptive change.

jedisct1 commented 6 years ago

But we would still need to bubble up this information to crypto_aead and crypto_secretstream, and can't really change the public crypto_stream API.

The current code is okay, the distinction between ietf_ext and ietf somewhat improves clarity.

Code duplication in crypto_aead is not great, but <insert shrug kaomoji here>.