dominictarr / private-stream

Cipher choice #4

Open calvinmetcalf opened 9 years ago

calvinmetcalf commented 9 years ago

I'd recommend my chacha library over my salsa20, as it's better tested and includes an authentication mechanism. The node aes-gcm might also fit your requirements.

calvinmetcalf commented 9 years ago

So aes is going to be faster than chacha if it's implemented at a lower level, i.e. in C (like in node) or in hardware (node on some chips). When both are implemented at the same level (like in the browser), chacha is unlikely to be slower.

If you just need a fully streaming cipher, aes in ctr mode could do the trick; the node and browserify versions do not do any block size buffering.

If you want to use a cipher which also makes sure the message wasn't tampered with, then aes-gcm or chacha20/poly1305 are both good choices.

Both are standards. gcm is going to be faster when the underlying hash function (ghash) is implemented in hardware (Intel chips); otherwise poly1305 will be faster. That being said, you'd have to restructure the lib to take advantage of an authenticated cipher.
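As a rough illustration of what an authenticated cipher buys you, here is a minimal sketch using node's built-in `aes-256-gcm`. The helper names `sealGcm` and `openGcm` are made up for this example, and the key and nonce are random demo values. Note that the tag is only checked when `final()` is called, so detection happens per message rather than per byte.

```javascript
const crypto = require('crypto');

// Hypothetical helper: encrypt one message and return its auth tag.
function sealGcm(key, nonce, plaintext) {
  const cipher = crypto.createCipheriv('aes-256-gcm', key, nonce);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return { ciphertext, tag: cipher.getAuthTag() };
}

// Hypothetical helper: decrypt one message, throwing if it was modified.
function openGcm(key, nonce, ciphertext, tag) {
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, nonce);
  decipher.setAuthTag(tag);
  const head = decipher.update(ciphertext);
  // final() throws if the tag does not match the ciphertext.
  return Buffer.concat([head, decipher.final()]);
}

const key = crypto.randomBytes(32);
const nonce = crypto.randomBytes(12); // must never repeat for the same key
const sealed = sealGcm(key, nonce, Buffer.from('hello stream'));

const roundTrip = openGcm(key, nonce, sealed.ciphertext, sealed.tag).toString();

// Flip one bit of the ciphertext and try to open it again.
const tampered = Buffer.from(sealed.ciphertext);
tampered[0] ^= 1;
let tamperDetected = false;
try {
  openGcm(key, nonce, tampered, sealed.tag);
} catch (err) {
  tamperDetected = true;
}

console.log(roundTrip, tamperDetected);
```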

dominictarr commented 9 years ago

Ah I didn't see that, because it's combined with poly1305. I often hear these mentioned in the same breath but I don't understand why they are so closely related. I don't think I need authenticated encryption. My understanding: if we have the same key, and someone inserts some noise into my signal, instead of decrypting nonsense, you'll detect that it was tampered with. Question: can you detect that at any point in the stream, or only when the stream has completed? (Only the former is useful for realtime.)

p2p protocols generally have their own authentication at the application level, so an attacker cannot tamper with even plain text communication. This is necessary in order to make a p2p protocol work, but also means constraints in higher layers can be relaxed.

dominictarr commented 9 years ago

I had used aes previously, but it didn't work because of blocking, but I was using cbc, and not gcm, so maybe that was it?

calvinmetcalf commented 9 years ago

Yes cbc blocks.
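A small sketch of the difference, using node's built-in `crypto` with random demo values for the key and iv: a 5-byte write to a ctr cipher comes straight out the other side, while cbc holds it back until it has a full 16-byte block (or until `final()` pads it).

```javascript
const crypto = require('crypto');

const key = crypto.randomBytes(32);
const iv = crypto.randomBytes(16);
const chunk = Buffer.from('hello'); // 5 bytes, less than one 16-byte AES block

const ctr = crypto.createCipheriv('aes-256-ctr', key, iv);
const cbc = crypto.createCipheriv('aes-256-cbc', key, iv);

// ctr emits as many bytes as it was given.
const ctrOut = ctr.update(chunk);
// cbc buffers the partial block and emits nothing yet.
const cbcOut = cbc.update(chunk);
// Only final() flushes the padded block.
const cbcTail = cbc.final();

console.log('ctr bytes out:', ctrOut.length);
console.log('cbc bytes out:', cbcOut.length);
console.log('cbc bytes after final():', cbcTail.length);
```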

Now, without authenticated encryption the issue is that a flipped bit in the ciphertext results in a flipped bit in the decrypted text, so you wouldn't necessarily be able to tell if the data was tampered with.
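To make that threat concrete, here is a minimal sketch with `aes-256-ctr` (the message and the byte index are invented for the demo). The attacker never learns the key; they only XOR a chosen difference into the ciphertext, and the same difference shows up in the decrypted text.

```javascript
const crypto = require('crypto');

const key = crypto.randomBytes(32);
const iv = crypto.randomBytes(16);
const plaintext = Buffer.from('pay 100');

const cipher = crypto.createCipheriv('aes-256-ctr', key, iv);
const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);

// Attacker: XOR the difference between '1' and '9' into byte 4,
// without knowing the key or the keystream.
const forged = Buffer.from(ciphertext);
forged[4] ^= '1'.charCodeAt(0) ^ '9'.charCodeAt(0);

const decipher = crypto.createDecipheriv('aes-256-ctr', key, iv);
const result = Buffer.concat([decipher.update(forged), decipher.final()]).toString();

console.log(result); // decrypts cleanly, but to a different amount
```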

Since authentication is per message, converting to that method would be non-trivial, so just be aware of the threat. I have done some experimenting with basically treating each write call as a message made up of a specified length, a message authentication code and the message itself, and that seems promising.

dominictarr commented 9 years ago

Ah, the authenticated stream seems like it should/could be another layer. Authentication is the glass vault: security, but not privacy. Then you paint over that, which is the cipher: adding privacy with another layer.

A network layer that preserves the write boundary (a framed transport) can be considered a special case of one that does not (sometimes an unframed transport will happen to emit the same chunks as you wrote, anyway). So anything that works over an unframed transport (i.e. something that adds its own framing, a pretty good idea) will also work over a framed transport (there will just be redundant framing).
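A minimal sketch of a layer that adds its own framing, assuming a 4-byte big-endian length prefix (the helper names `encodeFrame` and `createFrameDecoder` are invented for this example). The decoder gets the original messages back no matter how the transport happens to chunk the bytes.

```javascript
// Hypothetical framing: [4-byte big-endian length][payload]
function encodeFrame(payload) {
  const header = Buffer.alloc(4);
  header.writeUInt32BE(payload.length, 0);
  return Buffer.concat([header, payload]);
}

// Returns a push(chunk) function that calls onMessage once per whole frame.
function createFrameDecoder(onMessage) {
  let pending = Buffer.alloc(0);
  return function push(chunk) {
    pending = Buffer.concat([pending, chunk]);
    while (pending.length >= 4) {
      const length = pending.readUInt32BE(0);
      if (pending.length < 4 + length) break; // wait for the rest of the frame
      onMessage(pending.subarray(4, 4 + length));
      pending = pending.subarray(4 + length);
    }
  };
}

const wire = Buffer.concat([
  encodeFrame(Buffer.from('first')),
  encodeFrame(Buffer.from('second message'))
]);

const received = [];
const push = createFrameDecoder((msg) => received.push(msg.toString()));

// Simulate an unframed transport by delivering arbitrary 3-byte slices.
for (let i = 0; i < wire.length; i += 3) {
  push(wire.subarray(i, i + 3));
}

console.log(received);
```

Feeding the decoder one whole frame per call (a framed transport) works just as well; the length prefix is then simply redundant.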

calvinmetcalf commented 9 years ago

I wrote something like that (https://github.com/calvinmetcalf/hmac-stream/pull/3). I need to finish simplifying it and documenting the changes, but it basically just adds some simple framing, a length and an hmac, and checks them on the other side. It just treats any call to write as a message (though I wrote https://github.com/calvinmetcalf/SBS for some simple synchronous buffering).

The idea being you'd pass the crypto stream through this and probably just encrypt a zero buffer and use the result as the key.
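Roughly, the per-write framing could look like the sketch below. This is not the actual hmac-stream wire format: the layout (4-byte length, 32-byte HMAC-SHA256 tag, payload) and the names `sealMessage` and `openMessage` are assumptions for illustration, and the sketch does nothing about replayed or reordered messages.

```javascript
const crypto = require('crypto');

const TAG_LENGTH = 32; // HMAC-SHA256 output size

// Assumed frame layout: [4-byte length][32-byte tag][payload]
function sealMessage(macKey, payload) {
  const header = Buffer.alloc(4);
  header.writeUInt32BE(payload.length, 0);
  const tag = crypto.createHmac('sha256', macKey)
    .update(header).update(payload).digest();
  return Buffer.concat([header, tag, payload]);
}

// Verifies one complete frame and returns its payload, or throws.
function openMessage(macKey, frame) {
  const header = frame.subarray(0, 4);
  const tag = frame.subarray(4, 4 + TAG_LENGTH);
  const payload = frame.subarray(4 + TAG_LENGTH);
  if (frame.length < 4 + TAG_LENGTH || header.readUInt32BE(0) !== payload.length) {
    throw new Error('malformed frame');
  }
  const expected = crypto.createHmac('sha256', macKey)
    .update(header).update(payload).digest();
  if (!crypto.timingSafeEqual(tag, expected)) {
    throw new Error('authentication failed');
  }
  return payload;
}

const macKey = crypto.randomBytes(32); // demo key; see the thread for how to derive one
const frame = sealMessage(macKey, Buffer.from('one write call'));
const opened = openMessage(macKey, frame).toString();

// Flip a bit in the payload and check that verification fails.
const corrupted = Buffer.from(frame);
corrupted[corrupted.length - 1] ^= 1;
let rejected = false;
try {
  openMessage(macKey, corrupted);
} catch (err) {
  rejected = true;
}

console.log(opened, rejected);
```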

dominictarr commented 9 years ago

Where would you get the secret for the hmac? I figure you could use the hash of the private stream secret? If this was actually signed (and a signed auth is used in ssb) then it would protect against mitm. The mitm could create two sessions with two dh exchanges, but since each end will derive different dh keys, authentication will fail.

We just have to solve key management, but since a peer id is usually hash(pubkey), that isn't a problem for p2p.
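A sketch of why a signature tied to the session catches the mitm, under some assumptions not in the thread: ECDH on `prime256v1` stands in for the dh exchange, an Ed25519 key pair stands in for the peer's long-term identity, and the verifying side is assumed to already know that public key (e.g. because the peer id is hash(pubkey)).

```javascript
const crypto = require('crypto');

function newSessionKey() {
  const ecdh = crypto.createECDH('prime256v1');
  ecdh.generateKeys();
  return ecdh;
}

// Long-term identity of the signing peer (assumed known to the verifier).
const identity = crypto.generateKeyPairSync('ed25519');

// Sign / verify a hash of the session secret each side derived.
function sessionDigest(sharedSecret) {
  return crypto.createHash('sha256').update(sharedSecret).digest();
}
function signSession(sharedSecret) {
  return crypto.sign(null, sessionDigest(sharedSecret), identity.privateKey);
}
function verifySession(sharedSecret, signature) {
  return crypto.verify(null, sessionDigest(sharedSecret), identity.publicKey, signature);
}

// Direct exchange: both ends derive the same secret, so the signature verifies.
const alice = newSessionKey();
const bob = newSessionKey();
const direct = verifySession(
  bob.computeSecret(alice.getPublicKey()),
  signSession(alice.computeSecret(bob.getPublicKey()))
);

// Mitm: two separate exchanges, so each end derives a different secret.
const mitmFacingAlice = newSessionKey();
const mitmFacingBob = newSessionKey();
const intercepted = verifySession(
  bob.computeSecret(mitmFacingBob.getPublicKey()),
  signSession(alice.computeSecret(mitmFacingAlice.getPublicKey()))
);

console.log(direct, intercepted);
```

The mitm can relay the signature but cannot produce a new one for the other session without the identity private key.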

calvinmetcalf commented 9 years ago

aes-gcm and chacha20/poly1305 just encrypt a block of zeros first thing and use the result as the mac key, which makes guessing the key as hard as breaking the cipher. aes-gcm encrypts 128 bits of zeros; chacha20/poly1305 actually generates a 512-bit keystream block, uses the first 256 bits as the poly1305 key and discards the other 256 (I don't remember why off the top of my head).

dominictarr commented 9 years ago

Right, so it uses the first N bits of the ciphertext as the hmac key.

If you separated out the poly1305 and used it as a plaintext authenticated stream, it would effectively send the key as plain text. The key would be known to an eavesdropper, so they could generate valid messages (which they could attack with, assuming they could inject discrete messages).

So, you wouldn't really call an hmac with a public secret secure. However, if you encrypt that, the attacker doesn't know where the messages are, or what the message boundaries are, or what the secret to the encryption is. All they can do is randomly flip bits, in which case it does provide security if it's used inside a layer that adds privacy.

If you wanted an hmac stream that added security but not privacy (hypothetically*) you'd have to make the key a secret, with DH exchange, or signatures etc.

* most p2p protocols without encryption (bitcoin, bittorrent, ssb until encryption is implemented) do have something like this

calvinmetcalf commented 9 years ago

I don't know if we're just talking about things in different ways, but it's not the first n bits of ciphertext, it's the first n bits of keystream. So you take those first n bits and, instead of XORing them with the first bits of the plain text to get the ciphertext, you use them as the key to the hmac.

> So, you wouldn't really call an hmac with a public secret secure. However, if you encrypt that, the attacker doesn't know where the messages are, or what the message boundaries are, or what the secret to the encryption is. All they can do is randomly flip bits, in which case it does provide security if it's used inside a layer that adds privacy.

An attacker will likely be able to guess message boundaries based on context, e.g. an encrypted chunk of data is likely to be one or more whole messages (because we probably don't want to buffer it too much, as we don't want to hold up the stream) with either a header or a trailer or both. If they know anything about the protocol (which is likely, as you need to communicate details about it to the receiver) the attacker can probably make an educated guess about where the message starts.

Now, you hope that flipped bits aren't going to make any difference, but certain errors are only exploitable when you decrypt before authenticating (e.g. padding errors), and while ctr or salsa don't have these errors, who knows what the next one will be.

dominictarr commented 9 years ago

oh sorry, I meant the keystream. Though, if you are encrypting 0's and XORing, the ciphertext and keystream will be the same at that place.

Good point about the message boundaries. Let's assume that the attacker knows the message boundaries.

One question: how is it not better to pick a random number for the hmac key? I have a feeling that it's no better, but I haven't wrapped my brain around it yet. If you use the start of the keystream, then you are leaning on the RNG which generated the key for the secret, but not using the key directly. It seems that an attacker can probably guess the hmac key, since the start of the plain text might be guessable (common headers etc). So assuming they know the hmac key and the message boundaries, the thing that stops them from being able to inject anything is that they do not know the encryption key, so they cannot create a valid encrypted message. So the privacy layer is part of the security layer. But if the hmac key was secret, then the security layer could stand on its own. So, you could use the secret key for the hmacs as well, or hash(key) instead so that the key itself is not passed around.

That would make the layers less coupled, which seems like a good design to me. Is there a good reason for doing it the way it is, or is it some optimization of something?
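The keystream observations above can be checked with a short sketch (random demo key and iv; the same key and iv are reused here only to make the comparison possible, which you would never do in practice). Encrypting zeros yields the raw keystream, and anyone who can guess a stretch of plaintext can XOR it with the ciphertext to recover the same bytes, which is why keystream used as a mac key must not also be used to encrypt data.

```javascript
const crypto = require('crypto');

const key = crypto.randomBytes(32);
const iv = crypto.randomBytes(16);

function ctrEncrypt(data) {
  const cipher = crypto.createCipheriv('aes-256-ctr', key, iv);
  return Buffer.concat([cipher.update(data), cipher.final()]);
}

// Encrypting zeros: ciphertext == keystream at that position.
const keystream = ctrEncrypt(Buffer.alloc(16));

// A guessable 16-byte header encrypted with the same keystream bytes.
const plaintext = Buffer.from('GET / HTTP/1.1\r\n');
const ciphertext = ctrEncrypt(plaintext);

// An eavesdropper who guesses the header recovers the keystream by XOR.
const recovered = Buffer.alloc(16);
for (let i = 0; i < 16; i++) {
  recovered[i] = ciphertext[i] ^ plaintext[i];
}

const keystreamLeaked = recovered.equals(keystream);
console.log(keystreamLeaked);
```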

calvinmetcalf commented 9 years ago

You'd discard the keystream you use for making the hmac key and not use it to encrypt anything, the idea being that you don't want to use the exact same key and you don't want to send the hmac key in any way. You could always just use whatever method you're using to generate the key to generate twice as much data (e.g. by just putting it through pbkdf2) and use the other half for the hmac key.
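The "generate twice as much data" option might look like this sketch. The secret, salt, iteration count and the name `deriveKeys` are placeholder choices for the example, not recommendations.

```javascript
const crypto = require('crypto');

// Hypothetical helper: stretch one shared secret into 64 bytes and split it.
function deriveKeys(secret, salt) {
  const material = crypto.pbkdf2Sync(secret, salt, 10000, 64, 'sha256');
  return {
    encryptionKey: material.subarray(0, 32), // first half: cipher key
    macKey: material.subarray(32, 64)        // second half: hmac key
  };
}

const salt = crypto.randomBytes(16);
const keys = deriveKeys('shared secret', salt);

// The other side derives the same pair from the same secret and salt,
// so the hmac key never has to be sent.
const remote = deriveKeys('shared secret', salt);

console.log(keys.macKey.equals(remote.macKey));
```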

dominictarr commented 9 years ago

Hmm, there are now two keystreams? But one you encrypt some value with, then take the output, and use that as the hmac key... So (I think I get it now) you are using the cipher as a hash, to turn the secret into an hmac key, which doesn't need to be sent because the remote side can derive the correct hash too. I'm guessing it only uses the cipher to do this so it can avoid adding another dependency?

can you direct me to a good document for learning how/why chacha20/poly1305 works? I think I need to understand this properly.

calvinmetcalf commented 9 years ago

The spec is pretty straightforward: https://tools.ietf.org/html/draft-irtf-cfrg-chacha20-poly1305-10

calvinmetcalf commented 9 years ago

so an approximate and non-streaming example would be something like

var crypto = require('crypto');

function encrypt(key, iv, plainText) {
    var cipher = crypto.createCipheriv('aes-256-ctr', key, iv);
    // Encrypting 32 zero bytes yields the first 32 bytes of keystream.
    // They become the hmac key and are never used to encrypt data.
    var hmacKey = cipher.update(Buffer.alloc(32));
    var hmac = crypto.createHmac('sha256', hmacKey);
    var ciphertext = cipher.update(plainText);
    // Authenticate the ciphertext, not the plain text.
    var authTag = hmac.update(ciphertext).digest();
    return {
        ciphertext: ciphertext,
        authTag: authTag
    };
}

dominictarr commented 9 years ago

oh, so it uses the cipher to get the hmac key, and then the rest of the cipher to actually encrypt. So this is an optimization to avoid allocating memory for another hash? It's a bit more difficult to understand because the authenticator and cipher are tightly coupled.

calvinmetcalf commented 9 years ago

In the case of poly1305 (or other non-hmac-based macs) it allows you to avoid using a hash at all. In this example you could have hashed the key and iv together and had a similar level of security (the hash function must be broken to compromise the mac), but with poly1305 or a ghash-based mac, using a hash to derive the key would mean a break in either the hash or the mac could be a compromise.

If you were paranoid you could take the hash of the key and iv, and then encrypt that instead of zeros.
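A sketch of that paranoid variant, together with the receiving side, which the earlier example left out. The function names are illustrative, and it assumes `aes-256-ctr` with HMAC-SHA256 as in that example. Because ctr mode XORs with the keystream in both directions, the same derivation works on a cipher or a decipher object.

```javascript
const crypto = require('crypto');

// Encrypt sha256(key || iv) instead of zeros to get the hmac key.
// This consumes the first 32 bytes of keystream on either side.
function deriveMacKey(ctr, key, iv) {
  const seed = crypto.createHash('sha256').update(key).update(iv).digest();
  return ctr.update(seed);
}

function encrypt(key, iv, plainText) {
  const cipher = crypto.createCipheriv('aes-256-ctr', key, iv);
  const macKey = deriveMacKey(cipher, key, iv);
  const ciphertext = cipher.update(plainText);
  const authTag = crypto.createHmac('sha256', macKey).update(ciphertext).digest();
  return { ciphertext, authTag };
}

function decrypt(key, iv, ciphertext, authTag) {
  const decipher = crypto.createDecipheriv('aes-256-ctr', key, iv);
  const macKey = deriveMacKey(decipher, key, iv);
  const expected = crypto.createHmac('sha256', macKey).update(ciphertext).digest();
  // Authenticate first; only decrypt if the tag matches.
  if (authTag.length !== expected.length || !crypto.timingSafeEqual(authTag, expected)) {
    throw new Error('authentication failed');
  }
  return decipher.update(ciphertext);
}

const key = crypto.randomBytes(32);
const iv = crypto.randomBytes(16);
const sealed = encrypt(key, iv, Buffer.from('private stream'));
const opened = decrypt(key, iv, sealed.ciphertext, sealed.authTag).toString();

// A modified ciphertext should be rejected before any decryption happens.
const modified = Buffer.from(sealed.ciphertext);
modified[0] ^= 1;
let rejected = false;
try {
  decrypt(key, iv, modified, sealed.authTag);
} catch (err) {
  rejected = true;
}

console.log(opened, rejected);
```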
