keybase / triplesec

Triple Security for the browser and Node.js
https://keybase.io/triplesec
MIT License
399 stars, 48 forks

move hashes to end of cipher #40

Open calvinmetcalf opened 9 years ago

calvinmetcalf commented 9 years ago

Moving the hashes to the end would make it easier to stream the message (this will likely require a version bump).

calvinmetcalf commented 9 years ago

Or break it up into sub-messages that are authenticated separately:

instead of

<header><salt><iv1><iv2><iv3><hmac><hmac><msg>

something like

<header><salt><iv1><iv2><iv3><hmac><hmac><msg><hmac><hmac><msg>...

SparkDustJoe commented 9 years ago

I would agree that having the macs at the end makes sense for better streaming, although Calvin has the actual pieces a bit mixed up.

We know the MACs are of a fixed size (64 bytes each) and are calculated over the same content: HMAC(<header><salt><iv_aes><ciphertext>, <key>). Having the HMACs calculated over the data in one order but stored in another is a little awkward.

So instead of laying out the file like this (currently): <header><salt><hmac><hmac><iv_aes><ciphertext> (edited for correctness)

... storing them like this makes for better flow during encryption output and decryption input: <header><salt><iv_aes><ciphertext><hmac><hmac> ... and the MACs are still easy to find without needing a pointer in the header to tell us the starting byte (they start at length - 128 bytes).
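As a rough illustration of the trailing layout (not TripleSec's actual code), the two MACs can be located purely from the total length. The function names, the separate keys `key1` and `key2`, the order of the two tags, and the use of Python's `hashlib.sha3_512` as a stand-in for the Keccak/SHA-3 MAC are all assumptions made for this sketch.

```python
import hashlib
import hmac

MAC_LEN = 64  # bytes per MAC, as stated above


def split_trailing_macs(blob: bytes):
    """Split <header><salt><iv_aes><ciphertext><hmac><hmac> into (body, mac1, mac2)."""
    if len(blob) < 2 * MAC_LEN:
        raise ValueError("message too short to contain two MACs")
    body = blob[:-2 * MAC_LEN]           # everything the MACs were computed over
    mac1 = blob[-2 * MAC_LEN:-MAC_LEN]   # first tag starts at length - 128
    mac2 = blob[-MAC_LEN:]               # second tag starts at length - 64
    return body, mac1, mac2


def verify_trailing_macs(blob: bytes, key1: bytes, key2: bytes) -> bool:
    """Check both tags before anything is decrypted (hypothetical helper)."""
    body, mac1, mac2 = split_trailing_macs(blob)
    expected1 = hmac.new(key1, body, hashlib.sha512).digest()
    expected2 = hmac.new(key2, body, hashlib.sha3_512).digest()  # stand-in hash
    ok1 = hmac.compare_digest(expected1, mac1)
    ok2 = hmac.compare_digest(expected2, mac2)
    return ok1 and ok2
```

Because the MACs are stored after the data they cover, the stored order matches the order in which the HMAC input is consumed.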

-OR-

<header><hmac><hmac><salt><iv_aes><ciphertext> ... because <header> is always a constant 8 bytes, and we can do the HMACs in one continuous read of the rest of the data after the actual HMAC values (picking up the salt and the AES IV along the way for processing after making sure the ciphertext is "intact").

calvinmetcalf commented 9 years ago

@SparkDustJoe <header><hmac><hmac><salt><iv_aes><ciphertext> would prevent streaming as well ...

The problem I realized with putting them at the end for streaming is that you consume the data, verify it only at the end, and, if it turns out to be invalid, have to hope you can roll back whatever the invalid data caused.

In the second comment I was suggesting an incremental HMAC, so that you can check the integrity of the first part of the message without having access to the last part.
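A minimal sketch of that idea, assuming a record format that is not part of TripleSec: each chunk carries its own HMAC-SHA512 tag, and the tag also covers the chunk index and a "final chunk" flag so that reordering or truncation is detected. The names `seal_chunks` and `open_chunks` and the record layout are hypothetical.

```python
import hashlib
import hmac
import struct


def _tag(key: bytes, index: int, is_last: bool, chunk: bytes) -> bytes:
    # Bind the tag to the chunk's position and to whether it ends the stream.
    prefix = struct.pack(">QB", index, 1 if is_last else 0)
    return hmac.new(key, prefix + chunk, hashlib.sha512).digest()


def seal_chunks(key: bytes, chunks):
    """Return a list of (tag, is_last, chunk) records."""
    chunks = list(chunks)
    return [
        (_tag(key, i, i == len(chunks) - 1, c), i == len(chunks) - 1, c)
        for i, c in enumerate(chunks)
    ]


def open_chunks(key: bytes, records):
    """Yield each chunk only after its own tag has been verified."""
    seen_last = False
    for index, (tag, is_last, chunk) in enumerate(records):
        if seen_last:
            raise ValueError("data found after the final chunk")
        if not hmac.compare_digest(tag, _tag(key, index, is_last, chunk)):
            raise ValueError(f"chunk {index} failed authentication")
        seen_last = is_last
        yield chunk
    if not seen_last:
        raise ValueError("stream ended before the final chunk")
```

With this shape, a consumer never acts on a chunk that has not been authenticated, at the cost of one tag per chunk.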

SparkDustJoe commented 9 years ago

If you are reading the data from the stream to verify it, then don't decrypt until you have verified it; but then you have to read the stream twice if you don't have a lot of memory. It's not ideal in either case. If you do HMACs for each piece, like a BitTorrent file, then your ciphertext increases in size (in this case by 64 bytes per piece, or twice that if you still do both SHA-512 and Keccak/SHA-3).

miniLock does this with XSalsa20-Poly1305: every message piece (capped at 1 MB) is authenticated at the time it is decrypted, and one bad piece kills the whole thing. BUT that adds 16 bytes per piece to the overall size of the message.
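To put rough numbers on the size trade-off, here is a small calculation. It assumes 1 MiB pieces and a 100 MiB message purely for illustration; the helper name `tag_overhead` is made up for this sketch.

```python
def tag_overhead(message_len: int, piece_len: int, tag_len: int) -> int:
    """Total bytes of authentication tags for a message split into pieces."""
    pieces = max(1, -(-message_len // piece_len))  # ceiling division
    return pieces * tag_len


MIB = 1024 * 1024
size = 100 * MIB  # hypothetical 100 MiB message, 100 pieces of 1 MiB

print(tag_overhead(size, MIB, 16))   # one Poly1305 tag per piece   -> 1600
print(tag_overhead(size, MIB, 64))   # one 64-byte HMAC per piece   -> 6400
print(tag_overhead(size, MIB, 128))  # two 64-byte HMACs per piece  -> 12800
```

Even with two HMACs per piece, the overhead at this piece size is about 12.5 KiB on 100 MiB, so the cost is mostly a concern for small pieces.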

Since TripleSec encrypts data three times and XSalsa20 is the innermost layer, you would have to decrypt two layers first to reach it, or else wrap the Poly1305 scheme around AES. This would effectively eliminate the need to do two HMACs at the end for integrity (as each piece has its own).

calvinmetcalf commented 9 years ago

Using Poly1305 would probably eliminate the need for only one of the HMACs, if we want to continue the idea of cryptographic redundancy.

gburtini commented 7 years ago

@calvinmetcalf do you have a use-case for streaming that does not involve consuming the stream in an online fashion?

Consuming the message before validating the HMAC would defeat the purpose of authenticity/integrity checking, so having the hash at the end should not benefit you.

With that in mind, I suggest we close this issue unless Calvin or another user has another use case. This isn't currently a streaming-appropriate cipher, and #15 is a duplicate for discussion of making it streaming-appropriate.

SparkDustJoe commented 7 years ago

To validate the MACs in this case, the entire ciphertext has to be read (the keys can be calculated from the header). The ciphertext doesn't have to be decrypted until it is validated, but it all has to be fed into the hash algorithms first, and only then can you go back to whatever storage medium holds it (memory, disk, cache, etc.), use the keys, and actually start decrypting.

When writing the file, the first parts of the header and the ciphertext are known as they are being processed, but the MACs are not. So the whole thing has to be buffered, then the header fully created once the MACs have been calculated, and then the ciphertext written. When reading the file, the order is less critical, because buffering is needed anyway if validation has to happen first; the order is much more critical on write.
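The write-side difference can be sketched as follows, assuming the MACs are moved to the end. Each piece is fed into both HMACs and written out immediately, so nothing has to be buffered; the tags are appended once the last piece has gone out. The function name, the `prefix` argument (standing for `<header><salt><iv_aes>`), the two keys, and `hashlib.sha3_512` as the second hash are assumptions for illustration, not TripleSec's API.

```python
import hashlib
import hmac
import io
import itertools


def write_with_trailing_macs(out, prefix: bytes, ciphertext_chunks, key1: bytes, key2: bytes) -> None:
    """Stream prefix and ciphertext to `out`, then append both MACs."""
    mac1 = hmac.new(key1, digestmod=hashlib.sha512)
    mac2 = hmac.new(key2, digestmod=hashlib.sha3_512)  # stand-in for Keccak/SHA-3
    for piece in itertools.chain([prefix], ciphertext_chunks):
        mac1.update(piece)  # MAC state advances as data is produced
        mac2.update(piece)
        out.write(piece)    # piece can be flushed right away, no buffering
    out.write(mac1.digest())
    out.write(mac2.digest())


# Usage with an in-memory sink; a file or socket would work the same way.
sink = io.BytesIO()
write_with_trailing_macs(sink, b"HEADER00" + b"\x00" * 32, (b"chunk%d" % i for i in range(3)), b"a" * 32, b"b" * 32)
```

With the MACs in the header instead, `out.write` could not be called for any ciphertext until every chunk had already been hashed.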

SparkDustJoe commented 5 years ago

This didn't make it into V4 (#51), so I'm going to bump it for V5 (#72). I still think this is a valid point to explore.