keybase / triplesec

Triple Security for the browser and Node.js
https://keybase.io/triplesec
MIT License

Streaming interface #15

Open maxtaco opened 11 years ago

maxtaco commented 11 years ago

For a command-line file encryptor.

andrew-d commented 10 years ago

@maxtaco I'm interested in getting this working. I need a JavaScript encryption library with a streaming interface, and I like the security model behind TripleSec. Are there any plans to implement this? I'd also be okay with trying to implement it myself, but I'd rather see what your thoughts are first.

maxtaco commented 10 years ago

I've been slammed, so I don't think I'll get to it soon. But PRs are welcome. Thank you!

andrew-d commented 10 years ago

Thanks for the quick reply! Any preferences on how you'd like this implemented, before I start anything?

maxtaco commented 10 years ago

No strong preferences. A possible suggestion is to follow the node Transform stream pattern, which should "just work" via browserify, but I haven't tested it. Thanks Andrew.
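A minimal sketch of the Transform stream pattern suggested above, assuming Node's built-in `crypto` and `stream` modules. AES-256-CTR stands in for a single cipher layer; the class name `CtrEncryptStream` and the helper `runStream` are hypothetical and are not part of TripleSec's API.

```javascript
const crypto = require('crypto');
const { Transform } = require('stream');

// Hypothetical Transform that encrypts each chunk as it arrives.
// AES-256-CTR is a stand-in for one cipher layer; this is not TripleSec's API.
class CtrEncryptStream extends Transform {
  constructor(key, iv) {
    super();
    this.cipher = crypto.createCipheriv('aes-256-ctr', key, iv);
  }

  _transform(chunk, _encoding, callback) {
    // CTR mode behaves like a stream cipher: output length equals input length.
    callback(null, this.cipher.update(chunk));
  }

  _flush(callback) {
    // final() yields no extra bytes in CTR mode, but it closes the cipher.
    callback(null, this.cipher.final());
  }
}

// Write the given chunks to a stream and collect everything it emits.
async function runStream(stream, chunks) {
  const out = [];
  stream.on('data', (d) => out.push(d));
  const done = new Promise((resolve, reject) => {
    stream.on('end', resolve);
    stream.on('error', reject);
  });
  for (const c of chunks) stream.write(c);
  stream.end();
  await done;
  return Buffer.concat(out);
}
```

In a file encryptor the same class could sit between `fs.createReadStream` and `fs.createWriteStream` via `pipe`.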

calvinmetcalf commented 10 years ago

The current output of version.header, salt, signature, data can't be turned into a stream in an identical way (i.e. the same output as the callback version, but delivered as a stream), because you need all of the encrypted data before you can calculate the signature. It might make more sense for a stream to output the data first and then output the headers at the very end.
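The "data first, signature at the end" layout can be sketched as a Transform that emits ciphertext immediately and appends the MAC as a trailer in `_flush`. This is an illustration only: the class name `EncryptWithTrailer`, the single AES-256-CTR layer, the single HMAC-SHA512 tag, and the byte layout (IV, ciphertext, 64-byte tag) are all assumptions, not TripleSec's format.

```javascript
const crypto = require('crypto');
const { Transform } = require('stream');

// Hypothetical encrypt-then-MAC stream. Output layout (assumed):
//   16-byte IV | ciphertext | 64-byte HMAC-SHA512 tag
class EncryptWithTrailer extends Transform {
  constructor(encKey, macKey, iv) {
    super();
    this.cipher = crypto.createCipheriv('aes-256-ctr', encKey, iv);
    this.hmac = crypto.createHmac('sha512', macKey);
    this.hmac.update(iv); // authenticate the IV as well as the ciphertext
    this.push(iv);        // only values known up front go at the start
  }

  _transform(chunk, _encoding, callback) {
    const ct = this.cipher.update(chunk);
    this.hmac.update(ct); // the MAC is updated incrementally, no rewind needed
    callback(null, ct);
  }

  _flush(callback) {
    // The tag is only known once all ciphertext has passed through.
    callback(null, this.hmac.digest());
  }
}
```

A reader of this layout has to hold back the last 64 bytes of the stream, since it cannot tell ciphertext from trailer until the input ends.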

maxtaco commented 10 years ago

Agreed. Encrypting needs rewind but decryption doesn't.

calvinmetcalf commented 10 years ago

Though I'm not sure how feasible an actual rewind will be without putting it all in memory.

dominictarr commented 9 years ago

Or you could use multiple headers, i.e. a framed protocol, for example @calvinmetcalf's https://github.com/calvinmetcalf/hmac-stream/ or my https://github.com/dominictarr/pull-box-stream
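A rough sketch of what a framed protocol could look like, using only Node's `crypto` module. The frame layout, the function names `encodeFrame` and `decodeFrames`, and the choice of HMAC-SHA256 are assumptions made for illustration; this is not the wire format of hmac-stream or pull-box-stream.

```javascript
const crypto = require('crypto');

const TAG_LEN = 32; // HMAC-SHA256 output size in bytes

// Frame layout (assumed): 4-byte big-endian length | payload | 32-byte tag.
// The tag also covers the frame index, so reordered frames fail verification.
function encodeFrame(macKey, index, payload) {
  const header = Buffer.alloc(4);
  header.writeUInt32BE(payload.length, 0);
  const counter = Buffer.alloc(8);
  counter.writeBigUInt64BE(BigInt(index), 0);
  const tag = crypto.createHmac('sha256', macKey)
    .update(counter).update(header).update(payload).digest();
  return Buffer.concat([header, payload, tag]);
}

// Verify and return the payload of every complete frame in `buf`.
// Throws on the first frame whose tag does not match.
function decodeFrames(macKey, buf) {
  const payloads = [];
  let offset = 0;
  let index = 0;
  while (offset + 4 <= buf.length) {
    const len = buf.readUInt32BE(offset);
    const end = offset + 4 + len + TAG_LEN;
    if (end > buf.length) break; // incomplete frame: wait for more data
    const payload = buf.subarray(offset + 4, offset + 4 + len);
    const expected = encodeFrame(macKey, index, payload).subarray(4 + len);
    if (!crypto.timingSafeEqual(expected, buf.subarray(offset + 4 + len, end))) {
      throw new Error('frame ' + index + ' failed authentication');
    }
    payloads.push(payload);
    offset = end;
    index += 1;
  }
  return payloads;
}
```

In practice each payload would be ciphertext, and a real protocol would also need an explicit final frame so that truncation at a frame boundary is detected; this sketch leaves that out.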

SparkDustJoe commented 9 years ago

miniLock suffers from the same problem: the header at the top of the file contains data that isn't known until the ciphertext has been created "somewhere" in full. Changing the layout to "magic bytes" + version + salt + AES IV + CT, with the signatures at the end, in either a raw byte stream or JSON, would let TripleSec stream its output on the encryption side. For decryption, having the keys up front means you could do some calculations as the file comes down, start decrypting once enough has been downloaded (since this protocol is layered and the IVs are posted up front), and calculate the signatures as a last step, which would be faster too.

SparkDustJoe commented 9 years ago

Think of it like putting down three layers of tape. The bottom-most layer doesn't have to be fully placed for the next layer to start being applied on top, and then the next layer on top of that as the lower two are placed ahead. All of the IVs are known ahead of time, so as one layer is encrypted, past a certain point the next layer with the IV affixed to the beginning can be encrypted, and finally the last. At the end the signatures are generated on the last layer of "tape" (ciphertext) and sent as the last output, but you can start that process at the point where the three layers of "tape" are stuck together (before you reach the end of the line). In this manner, with clever coding, you can buffer the whole process without loading the entire file into memory!
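The tape analogy can be sketched by pushing each chunk through all three layers before reading the next chunk. Node's `crypto` module ships neither Twofish nor XSalsa20, so AES-256-CTR with three independent keys stands in for every layer here; the helper names are hypothetical, and the sketch omits the step of prefixing each layer's IV to its output.

```javascript
const crypto = require('crypto');

// Three independent CTR layers. AES-256-CTR is a stand-in for all three,
// because Node's crypto module does not provide Twofish or XSalsa20.
function makeLayers(keys, ivs) {
  return keys.map((key, i) => crypto.createCipheriv('aes-256-ctr', key, ivs[i]));
}

// Push one chunk through every layer before touching the next chunk, so
// memory use is bounded by the chunk size rather than by the file size.
function encryptChunk(layers, chunk) {
  return layers.reduce((data, layer) => layer.update(data), chunk);
}
```

Because each cipher object keeps its own counter state between `update` calls, encrypting chunk by chunk gives the same bytes as encrypting the whole buffer at once.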

SparkDustJoe commented 9 years ago

AES and Twofish have a 128-bit block width, but XSalsa20 has a 512-bit block width (exactly 4 times that of the other two, but you have to factor in the 128-bit IV for Twofish and the 192-bit IV for XSalsa20). Decrypting the first 7 blocks of data from the beginning (which strips off the AES layer) gives you enough to start the process.

calvinmetcalf commented 9 years ago

AES in CTR mode, XSalsa20, and Twofish are all streaming, so the block size is irrelevant. The bigger issue is that you don't want to decrypt unverified data, so buffering is necessary to collect all the ciphertext and verify it before decryption. Streaming decryption is therefore pretty much unnecessary, since you already have everything in memory.
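A small sketch of the verify-before-decrypt ordering described above: ciphertext chunks are fed to the MAC as they arrive and held in memory, and nothing is decrypted until the tag checks out. The function name, the single AES-256-CTR layer, and the single HMAC-SHA512 tag over the ciphertext are assumptions for illustration, not TripleSec's actual construction.

```javascript
const crypto = require('crypto');

// Buffer ciphertext chunks while feeding them to the HMAC, and decrypt
// only after the tag has been verified. The tag is assumed to cover the
// ciphertext only and to arrive separately from the chunks.
function verifyThenDecrypt({ encKey, macKey, iv, chunks, tag }) {
  const hmac = crypto.createHmac('sha512', macKey);
  const buffered = [];
  for (const chunk of chunks) {
    hmac.update(chunk);   // incremental: no second pass over the data
    buffered.push(chunk); // nothing is decrypted yet
  }
  const expected = hmac.digest();
  if (tag.length !== expected.length || !crypto.timingSafeEqual(tag, expected)) {
    // One generic error for every failure, so callers learn nothing extra.
    throw new Error('authentication failed');
  }
  const decipher = crypto.createDecipheriv('aes-256-ctr', encKey, iv);
  return Buffer.concat([decipher.update(Buffer.concat(buffered)), decipher.final()]);
}
```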

That being said, a framing approach where you chunk your message and encrypt each chunk separately would be useful, except there is a tremendous amount of per-message scrypt overhead, since you do (password + salt) -> key 5 separate times. Having a method to reuse the keys and create new messages quickly, without having to re-derive the keys, might be more practical for streaming. Reusing the salt might be one way to do it.

SparkDustJoe commented 9 years ago

It's not that <(password + salt) -> key> is run 5 times; the same run produces a long byte array that CONTAINS 5 keys (3 for encryption, 2 for signatures, all done at once). But I see your point about trying to decrypt something that might be invalid to begin with: no sense in doing all that work if the stream is corrupt. But if the signatures are at the end, you could still at least feed the ciphertext stream into the hashes as it comes down, verify, and THEN decrypt from memory. Not sure how much time this saves with smaller files, but with larger ones it could make a difference.
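The "one run, five keys" point can be illustrated with Node's `crypto.scryptSync`: a single derivation produces one long buffer that is then sliced into sub-keys. The key names and sizes below are assumptions chosen for the example, and the call uses Node's default scrypt cost parameters rather than whatever TripleSec uses.

```javascript
const crypto = require('crypto');

// Sub-key sizes in bytes. These names and sizes are assumptions for
// illustration, not TripleSec's documented key layout.
const KEY_SIZES = { mac1: 48, mac2: 48, cipher1: 32, cipher2: 32, cipher3: 32 };

function deriveKeys(password, salt) {
  const total = Object.values(KEY_SIZES).reduce((a, b) => a + b, 0);
  // One scrypt run (Node's default cost parameters) yields all key material.
  const material = crypto.scryptSync(password, salt, total);
  const keys = {};
  let offset = 0;
  for (const [name, size] of Object.entries(KEY_SIZES)) {
    keys[name] = material.subarray(offset, offset + size);
    offset += size;
  }
  return keys;
}
```

The expensive step runs once per (password, salt) pair, so reusing the derived keys across many chunks avoids repeating it.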

In miniLock (which only uses one encryption pass), a chunk number is worked into the nonce of each chunk as the file is encrypted with XSalsa20-Poly1305 (the Poly1305 part isn't used here). Something to that effect could be used here: the MASTER keys are generated from <password + salt> but are then HMAC'd with <chunk number + IV> or <chunk number XOR IV>, so that the keys for each chunk depend on which chunk you are talking about and are still tied to the scheme, so forgery is still incredibly difficult. Each chunk doesn't need authentication per se if the WHOLE of the ciphertext is authenticated on download. miniLock caps each chunk at 1MB.
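One way to read the per-chunk key idea, as a sketch: derive each chunk's key as HMAC(master key, chunk number || IV). The function names, the 8-byte big-endian chunk counter, HMAC-SHA256, and the AES-256-CTR stand-in cipher are all assumptions; neither miniLock nor TripleSec is claimed to work this way.

```javascript
const crypto = require('crypto');

// Derive a per-chunk key as HMAC-SHA256(masterKey, chunkNumber || iv).
// The 8-byte big-endian counter encoding is an assumption of this sketch.
function chunkKey(masterKey, chunkNumber, iv) {
  const counter = Buffer.alloc(8);
  counter.writeBigUInt64BE(BigInt(chunkNumber), 0);
  return crypto.createHmac('sha256', masterKey).update(counter).update(iv).digest();
}

// Encrypt one chunk under its own derived key (AES-256-CTR as a stand-in).
function encryptNumberedChunk(masterKey, chunkNumber, iv, plaintext) {
  const key = chunkKey(masterKey, chunkNumber, iv); // 32 bytes, fits AES-256
  const cipher = crypto.createCipheriv('aes-256-ctr', key, iv);
  return Buffer.concat([cipher.update(plaintext), cipher.final()]);
}
```

Because every chunk number yields a different key, a chunk moved to another position decrypts to garbage, although detecting that still relies on the whole-ciphertext authentication mentioned above.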

As far as the CTR mode goes, there is a certain amount of information you have to have first to get the process started, so maybe it's less than the 7 blocks that I stated above, but there is still some work to be done peeling away layers to get to the IVs of the "lower" functions on decrypt if you wanted to do them concurrently. The block size only matters in the sense that you'll have different length "strips" of bytes coming out of the functions, so some will be run more times than others.

calvinmetcalf commented 9 years ago

> But I see your point with trying to decrypt something that might be invalid to begin with, no sense in doing all that work if the stream is corrupt.

More than that, there are attacks that work by passing in invalid data and, depending on the error returned, gaining information.

SparkDustJoe commented 9 years ago

Since CTR mode has no authentication inherent to it (unlike with Poly1305), the system probably wouldn't know until the very end (or if the signatures failed) unless it was doing some kind of "header" or "leader" check on the decrypted data, so it would depend on how TripleSec was used in a larger system, but yeah, that's also a very valid point.

calvinmetcalf commented 9 years ago

Well, if TripleSec does the authentication, then there is no reason to add multiple levels of it later in the scheme.


FlorianWendelborn commented 8 years ago

Any news on this? It would be amazing to have a secure, streamable cipher.

maxtaco commented 8 years ago

We are working on one now with saltpack! See our work over in node-saltpack.


FlorianWendelborn commented 8 years ago

@maxtaco If I understand this correctly, it's using asymmetric cryptography. That's nice for messaging, but unfortunately it isn't good for encrypting and decrypting files without sharing them.

maxtaco commented 8 years ago

You're right of course! It's not currently on our road map. A hack is to take a 32-byte secret key and to treat that as a Curve25519 private key, and then encrypt to yourself. But it's inelegant and adds an additional assumption/dependence, that Curve25519 isn't broken....