ghost commented 7 years ago

Unreliable crypto channels

I've researched the cryptographic channels that work over datagrams in the past few days, and so far there are four options, none of which has a really good implementation in Go :)

The reason we need an unreliable crypto channel is that we have strong use cases for being flexible when it comes to modeling reliability. For many cases, reliability at the bottom of the stack (as in: TCP et al.) is not an option. Instead, we want datagram transports, and to layer reliability on top only where needed. Read more about the motivation in ipfs/notes#143.

Definitions

Initiator: the node initiating the handshake by sending a request.
Responder: the node receiving the handshake request and sending a response.

Desired properties

1. Handshake responses MUST NOT be bigger than handshake requests. This way the nodes can't be used for amplifying a DDoS attack.
2. Handshake requests MUST NOT generate state on the remote. This prevents resource exhaustion DoS attacks.
3. Invalid packets SHOULD be dropped without any response. This makes debugging harder, but adds the kind of stealth known from "port knocking" in front of SSH.
4. Handshake requests SHOULD include an identifier capable of preventing the teardown of legitimate existing sessions. Wireguard uses a timestamp for this and discards requests which are older than the current session's handshake.
5. There MUST be a means of protecting against replayed handshake and data packets. One means is a sequence-id and a sliding window.
6. There MAY be a means of reordering packets that arrived out of order.
7. For encryption schemes which require a nonce, the packet's respective nonce MUST be included with the packet itself.
8. The primitives used SHOULD be supported by WebCrypto.
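
To make the "sequence-id and a sliding window" idea from the replay-protection property concrete, here is a minimal sketch in Go. The type name `ReplayWindow`, the 64-entry window size and the convention that sequence numbers start at 1 are assumptions made for illustration; none of the candidate protocols is claimed to work exactly like this.

```go
package main

import "fmt"

// windowSize is how many sequence numbers behind the highest one are tracked.
// 64 is an illustrative choice that fits in one machine word.
const windowSize = 64

// ReplayWindow remembers which sequence numbers were already accepted.
// The zero value is ready to use; sequence numbers are assumed to start at 1.
type ReplayWindow struct {
	highest uint64 // highest sequence number accepted so far
	bitmap  uint64 // bit i set means (highest - i) was accepted
}

// Accept reports whether seq is new and records it if so.
// It should only be called after the packet has been authenticated.
func (w *ReplayWindow) Accept(seq uint64) bool {
	switch {
	case seq == 0:
		return false // reserved, so that the zero value needs no setup
	case seq > w.highest:
		// Newer than anything seen: slide the window forward.
		shift := seq - w.highest
		if shift >= windowSize {
			w.bitmap = 0
		} else {
			w.bitmap <<= shift
		}
		w.bitmap |= 1
		w.highest = seq
		return true
	case w.highest-seq >= windowSize:
		return false // too old to be tracked, drop it
	default:
		bit := uint64(1) << (w.highest - seq)
		if w.bitmap&bit != 0 {
			return false // already seen: a replay
		}
		w.bitmap |= bit
		return true
	}
}

func main() {
	var w ReplayWindow
	for _, seq := range []uint64{1, 3, 2, 3, 100, 36, 37} {
		fmt.Printf("seq %d accepted: %v\n", seq, w.Accept(seq))
	}
}
```

Packets inside the window are accepted in any order but only once, so reordering itself can be left to a higher layer.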

Open questions:

Should we allow piggybacking data on top of handshake packets?

Candidates

DTLS
- There's a prototyped implementation in quic-go which is functional enough for a working QUIC server. (cc @lucas-clemente)
- Apart from that there are DTLS-capable Go bindings to OpenSSL, which is obviously less than ideal.
- Good explanation of the differences to TLS: http://security.stackexchange.com/a/29179
- DTLS needs to support RSA keys and thus came up with its own fragmentation mechanism, since RSA keys hardly fit into regular-sized UDP packets. This appears like needless complexity. We still want to eventually support DTLS though, as it unlocks QUIC and WebRTC.
Noise Protocol Framework
- Wireguard is a kernel-space VPN which came up with a crypto channel over datagrams, based on the Noise framework.
- Its handshake protocol is very promising. One weakness I noticed though is that its replay protection relies on monotonically increasing time. Introducing the concept of time in a distributed system screams for subtle issues.
- Wireguard has excellent documentation: https://www.wireguard.io/protocol
SHS / Secure Handshake
- "[S]ecure key exchange protocol designed with capability systems in mind", used by secure-scuttlebutt.
- SHS requires a reliable transport underneath, but could be adapted to work over an unreliable transport. The changes would probably be similar to the changes between TLS and DTLS. (cc @dominictarr)
- https://dominictarr.github.io/secret-handshake-paper/shs.pdf
CryptoAuth
- Used by cjdns.
- Handshake is not protected from replays.
- Probably also counts as a capability-based system, as it comes with various schemes for authenticating the initiator with the responder.
- Slightly outdated whitepaper: https://github.com/cjdelisle/cjdns/blob/master/doc/Whitepaper.md#the-cryptoauth

dominictarr commented 7 years ago

my shs implementation uses nacl algorithms, which aren't in webcrypto, but the design could be adapted to use other algorithms. also, over the lifecycle of a connection, asymmetric crypto is only needed in the initial handshake and the rest is symmetric, so you could use aes (instead of chacha), hmac, and sha256 from webcrypto, but leave the signatures as ed25519.

oh, btw, I was talking to felix from ethereum about this the other day... I don't know his github handle, but @wanderer can probably point him here

Kubuxu commented 7 years ago

TweetNaCl offers Salsa20 and Poly1305 in JS, which should be as fast as or even faster than WebCrypto AES.

wanderer commented 7 years ago

calling @fjl

fjl commented 7 years ago

Yes, I'm looking into the same topic for Ethereum.

ghost commented 7 years ago

@dominictarr @fjl wanna meet some time next week? Or any other week for that matter, since I live in Berlin :)

jbenet commented 7 years ago

Whatever we do, we have to support at least one IETF recommended construction (Probably DTLS 1.3 or 1.2) and ideally one djb recommended construction.

@dominictarr have you gotten djb to audit shs? Also, what does the suggested "datagramification" look like? may be useful to draft that up.

Kubuxu commented 7 years ago

IETF is supporting ChaCha20+Poly1305 (RFC 7539), which is closely related to Salsa20+Poly1305 created by djb. ChaCha20 is a descendant of Salsa20, also created by djb.

jbenet commented 7 years ago

@Kubuxu of course. It's beyond just choosing the ciphers -- what I mean is a construction that is a djb original. The whole TLS and DTLS construction bundles in a lot of other concerns -- usually djb boils protocols down to the essentials in a very good way.

Kubuxu commented 7 years ago

I agree, (D)TLS constructions carry a lot of baggage from features that we won't use, like re-authentication with a new identity, which makes them much more complex.

Lack of complexity is good. If we end up crafting something (which I am not really for, but we might have no other option), I would like it to be as simple as possible without any security trade-offs.

Bugs and exploits can't hide in code you don't have.

Kubuxu commented 7 years ago

re 4. Shouldn't be hard if you prevent a session from interfering with other sessions until it is fully authenticated, which is a very good property.

re 5. A sliding window is a must-have. I think doing it similarly to how cjdns does it will be good (TCP over it is more reliable than over raw connections when it comes to bad and/or bufferbloated links).

re 6. Reordering should be done by a higher layer, IMO. It means buffering and so on.

re 8. I would include "or be fast in software". Salsa20 from TweetNaCl-fast is faster than AES from WebCrypto, and there is probably some magic with web workers that can be done to off-load the main thread. Even in the main thread it is still able to reach 128 MB/s on x86 and 43 MB/s on ARM (my bare-metal tests on ARM showed that AES is at least 2x slower than Salsa20).

Also, I would prefer if we didn't have to craft it ourselves, but that might not be an option. I will be very happy if we can do it as a collaboration with Ethereum.

keks commented 7 years ago

I'd like to suggest Signal's/OMEMO's double ratchet. I don't mean the initial handshake with an intermediate server (X3DH), just the logic to constantly change the key material and mix in new entropy.

some thoughts:

- there is a formal security proof
- a Google search shows a Go implementation, don't know its quality though
- message based
- when we don't use x3dh we still need a handshake protocol (shs?)

edit: note the formal security proof is valid only in conjunction with x3dh

Kubuxu commented 7 years ago

Double ratchet uses the ratchet because it stores long-term session state and doesn't want a leak of the session state at one point to compromise the security of future communication.

In my opinion it is not necessary for our use case, as the session keys won't be stored long term; they are purely ephemeral. And even though X25519 and X448 operations are much cheaper than RSA, they still are not as cheap as pure symmetric encryption after the handshake.

It gets even more complex when you add the fact that packets can arrive out of order.

@keks I might not see it but do you think using the ratchet protocol would have some benefits?

X3DH is quite an interesting and novel concept: https://whispersystems.org/docs/specifications/x3dh but our use case is quite different from theirs.

keks commented 7 years ago

Most of the previous work done on transport security considers a connection. In TLS with forward secrecy, the keys for the connection are deleted after the connection has terminated.

In message-based communication there is not really a connection. Maybe there is a request-response style thing, but doing a new n-way handshake for sending one encrypted message and receiving one seems bad performance-wise. I'm sure every protocol has its own answer to this situation (I don't know about DTLS or the Noise Framework, and shs and cryptoauth haven't been analysed yet). Signal's answer is to have a very long-running session and constantly update the key material to have forward secrecy within one session. Additionally they do a (piggy-backed) DH key exchange on each round trip, both to allow post-compromise security (security recovers after some session state has been leaked) and to add further entropy to the session.

X3DH is mainly interesting when you have an intermediate server that stores ephemeral keys for you. When you have direct communication, I think you could probably also just use shs. That is not formally analysed yet, but is very very similar to the case without the OPK. Also, maybe formal verification is in the pipeline hint hint.

Kubuxu commented 7 years ago

The session exists as long as the other party hasn't authenticated another session, or until some timeout (which can be long, since the session state is very small: shared key, nonce counter, sliding window).

Signal has very long-running sessions because messages are sent very rarely and the other party may be offline, meaning it can't negotiate a new session. This isn't a problem in our case, as the other party has to be online; there is no server to act as a mailbox.

The main use case of X3DH is establishing an encrypted session when the other party is offline; otherwise it is quite easy to do a normal DH, shs or something cjdns-like.

keks commented 7 years ago

> Signal has very long running sessions because the messages are sent very rarely and there is option that the other party is offline, meaning it can't negotiate new session.

Doing a 4-way handshake to do one request and get one response seems wasteful. If we talk to a host more often, the session can live really long. The Signal state is also quite small (root key, chain key + some message keys for dropped messages, which can be deleted after some timeout). Speaking of dropped messages: that could be a reason against the double ratchet. While I believe having dropped packets should work, this has not been tested very much, because it just doesn't happen in Signal, WhatsApp and OMEMO.

Kubuxu commented 7 years ago

> The session exists as long as other party hasn't authenticated other session or till some timeout (long but session state is very small: shared key, nonce counter, sliding window).

UDP allows you to keep sessions quite long, as there is no connection to keep alive. You don't have to do DH to send data, just use the session's symmetric key with a new nonce; this is how cjdns and DTLS work.
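
A minimal sketch of that pattern in Go, using AES-GCM from the standard library (a primitive WebCrypto also offers). The packet layout (8-byte counter followed by the ciphertext), the `Session` type and its methods are assumptions for illustration, not the wire format of cjdns or DTLS. It also assumes each direction of a session has its own key, so a (key, nonce) pair is never reused.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/binary"
	"errors"
	"fmt"
)

// counterLen is the size of the per-packet counter sent in the clear.
const counterLen = 8

// Session holds the symmetric state for one direction of a session.
type Session struct {
	aead    cipher.AEAD
	sendCtr uint64 // last counter used for sending
}

// NewSession builds an AES-GCM session from a 16-, 24- or 32-byte key.
func NewSession(key []byte) (*Session, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	return &Session{aead: aead}, nil
}

// nonceFor expands a counter into the 12-byte nonce GCM expects.
func nonceFor(ctr uint64) []byte {
	nonce := make([]byte, 12)
	binary.BigEndian.PutUint64(nonce[4:], ctr)
	return nonce
}

// Seal returns counter || ciphertext, so the receiver can rebuild the nonce
// from the packet alone, even if earlier packets were lost or reordered.
func (s *Session) Seal(plaintext []byte) []byte {
	s.sendCtr++
	packet := make([]byte, counterLen, counterLen+len(plaintext)+s.aead.Overhead())
	binary.BigEndian.PutUint64(packet, s.sendCtr)
	return s.aead.Seal(packet, nonceFor(s.sendCtr), plaintext, nil)
}

// Open authenticates and decrypts a packet and returns its counter, which the
// caller would then feed into its replay protection.
func (s *Session) Open(packet []byte) (uint64, []byte, error) {
	if len(packet) < counterLen+s.aead.Overhead() {
		return 0, nil, errors.New("packet too short")
	}
	ctr := binary.BigEndian.Uint64(packet[:counterLen])
	plaintext, err := s.aead.Open(nil, nonceFor(ctr), packet[counterLen:], nil)
	if err != nil {
		return 0, nil, err
	}
	return ctr, plaintext, nil
}

func main() {
	key := make([]byte, 32) // stands in for the key agreed during the handshake
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	sender, err := NewSession(key)
	if err != nil {
		panic(err)
	}
	receiver, err := NewSession(key)
	if err != nil {
		panic(err)
	}
	packet := sender.Seal([]byte("hello over UDP"))
	ctr, plaintext, err := receiver.Open(packet)
	if err != nil {
		panic(err)
	}
	fmt.Printf("counter=%d payload=%q\n", ctr, plaintext)
}
```

No per-packet asymmetric operation is involved; the only state that changes while sending is the counter.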

Kubuxu commented 7 years ago

Something else to consider is that the smaller the handshake packets, the higher the probability that they will get through.

dominictarr commented 7 years ago

@jbenet I haven't shown it to him. Given my requirements for ssb, I don't personally have a need for an unreliable protocol, and have too much to do anyway.

@keks of course, the tradeoff with the long lived session is that you then need persistent state, so your network protocol needs access to disk also.

keks commented 7 years ago

> The session exists as long as other party hasn't authenticated other session or after some timeout (long but session state is very small: shared key, nonce counter, sliding window).

Yeah but that's not less than Signal's state: A root key, a chain key and a sliding window of message keys.

> the tradeoff with the long lived session is that you then need persistent state, so your network protocol needs access to disk also.

During a program run I'd keep it in memory and garbage collect if the list of open sessions grows too large. If one of the peers loses state, a new handshake can be initiated.

ghost commented 7 years ago

A talk and slides on WireGuard have recently been published:

My talk on WireGuard is finally online!

Video: https://www.youtube.com/watch?v=eYztYCbV_8U Slides: https://www.wireguard.io/talks/codeblue2016-slides-en.pdf

This has a general overview of NoiseIK in addition to WireGuard.

jbenet commented 7 years ago

@lgierth WG looks promising

Kubuxu commented 7 years ago

Looking at WireGuard's timestamp requirements, it only needs monotonicity per initiator. Is that still a problem?

Just finished reading the whitepaper on WireGuard [0]. Very interesting protocol, simple and still robust in security and DoS prevention.

I don't know how I feel about some of the higher-layer decisions, like rekeying every 120s or the other timeouts, but these could be changed if it comes to it.

[0]: https://www.wireguard.io/papers/wireguard.pdf

ghost commented 7 years ago

@fjl let's chat at the meetup next monday?

fjl commented 7 years ago

Sounds good. I'll be there. Which meetup though?

ghost commented 7 years ago

@fjl oops didn't notice your edit -- today at the ethereum office in waldemarstrasse, at 7pm: https://www.meetup.com/Berlin-Ethereum-Meetup/events/236898828/

I'll be there by 9, we got the IPFS sprint calls from 6 to 8.

fjl commented 7 years ago

I'm there, see you then.
