Open adinapoli opened 9 years ago
Hey @paragonie-scott !
The reason is not implemented is simply because the current API this library provides is a streaming one: it means the encrypted data is read in chunks and in constant memory. This makes the library capable of handling arbitrarly-large files with a very small memory footprint.
This came with a cost though: due to the fact the HMAC validation needs to be computed on the whole input, doing so would defeat the purpose of doing streaming IO.
I have no clue whether is possible to do hmac validation by reading the input as you go, and I will accept pointers in that direction.
An easier option would be to expose another function that performs the decryption by reading the whole input into memory (like the Python implementation does), so we can easily implement hmac validation in there.
EDIT: ah, that's exactly what 'decrypt' already is, we just need hmac validation for that.
@soatok Ah! So it turns out is possible, after all, to do an HMAC validation on a stream!
Sounds good to me. What I will do in order to fix this ticket:
decrypt
, as that functions operate on the whole content in memory.streamVerify
as reference implementation and "port it" to Haskell.Excellent. Pay careful attention to what Halite does though, because it's kind of important.
The motivation of this construction is to remove the possibility for TOCTOU attacks. If someone modifies the source of your stream (i.e. the file), one of several things will happen:
In either case, it's a bit more complicated than simply Verify then Decrypt, but the extra tinfoil is worth it.
@soatok Thanks a lot for the detailed explanation! It will help a lot during the implementation :wink:
BTW, the Swift impl is streaming. Take a look at OverflowingBuffer. It takes a stream and returns a stream that ends N bytes early, by reading into buffer. When you get to the end, you can ask for the leftover bytes.
@soatok, the problem with the approach you're describing (if I'm reading it correctly) is that it is unbounded in memory. You can't rewind a network stream unless you're going to write it to disk first. Even if it were written to disk, since it accumulates partial HMACs in memory, it's still unbounded in memory so subject to DoS. This all may be reasonable for decrypting files, but it seems bad for both performance and security on network streams.
While I get the TOCTOU concern you're describing, this isn't something that can be correctly solved inside the decryptor in my opinion. The correct solution depends on the caller's use case. In some cases, this solution makes sense; in others it creates its own security problems (memory exhaustion). And in others it is simply inefficient without providing security (if I decrypt the whole message before processing any of it, so I can discard it if it turns out in the end to be corrupt).
Am I missing something more subtle?
I've been thinking about the DoS concerns I raised, and while it's true that the HMAC storage is unbounded, it is pretty trivial to scale it such that they could never actually become a problem. 32-bytes (plus a little overhead) per megabyte would require truly enormous amounts of traffic to exploit. So I still believe the large performance and storage cost is not worth it for many use cases, I shouldn't say it reduces security.
While I get the TOCTOU concern you're describing, this isn't something that can be correctly solved inside the decryptor in my opinion.
The implementation Halite follows is literally specific for files. Network streams are their own beast. (Poly1305 and GCM are great here!)
Agreed about Authenticated Encryption like GCM. Wish I could use it. RNCryptor intentionally only uses modes that are widely available so we can use OS-provided crypto. Apple (and pretty much every other OS provider) gives us no AE options.
As it currently stands, you're discarding the MAC. This means the library is vulnerable to chosen-ciphertext attacks. :(