discv5 packet signatures / handshake

fjl commented 5 years ago

In discv5, we aim to support multiple cryptosystems for node identity. These are called 'identity schemes' in the ENR EIP. In the v4 wire protocol all packets are signed by the node's secp256k1 identity key. If we can't assume secp256k1 identity anymore, how do we create and verify packet signatures?

A couple options below, please submit more:

1. We can leave off the signature most of the time if the 'conversation nonce' is strong enough. The signature would only be needed on packets that mess with the conversation nonce, like ping, iAm, ...
2. Specify something like a 'signature type' and define the verification such that it can be traced back to the node id. For the "v4" identity scheme (which is secp256k1/keccak256 as used in v4) this would be verify(sig, id) -> keccak256(recover(sig)) == id. It's unclear to me how well this works with ed25519 identities because recovering the public key from the signature isn't a commonly exposed operation for this curve. If signature recovery isn't possible (it likely isn't), we'd need to know the sender node id for every packet to even verify the signature, something we could avoid so far. Maybe whoareyou/iAm packets can help with this.
3. Maybe sender node id and id scheme should be part of the packet header.
4. Defer the whole thing for later and require secp256k1.

fjl commented 5 years ago

Note that there is the additional requirement to avoid, as much as possible, storing session state (i.e. state related to communication with individual nodes). I'm trying to avoid establishing a 'session key' in whoareyou/iAm because we'd need to store it indefinitely or renegotiate all the time.

FrankSzendzielarz commented 5 years ago

On the above suggestions

1. I think leaving off the signature might allow eavesdroppers to hijack a conversation. For example, if the FindNode / WhoAreYou / IAm / Neighbours flow was eavesdropped, then the IAm could be faked.
2. I guess it's healthiest to avoid assuming that pubkeys will always be recoverable. This means that one way or another the pubkey will need to be known...
3. ...then yes, but only for those message types that initiate a conversation; after that, the conversation nonce can temporarily identify the node.
4. Last resort, IMO.

So I think because the FindNode, Ping and then the topic discovery requests are the only messages currently able to start a conversation, they can contain the sender pubkey and scheme. If the sender is initially unknown to the recipient, the subsequent WhoAreYou will result in an IAm ENR that will need to be consistent with that initial message.

I think there will always need to be conversation state in the sense that messages in a conversation will need to be correlated. Eg: (FindNode->), (<-WhoAreYou?), (IAm->),(<-Neighbors) . WhoAreYou as a standalone call to an unknown node returns a large IAm and is therefore an amplification vector, which is why it must be encapsulated in the FindNode / TopicQuery conversation. I think it's also desirable to log 'conversations' as 'things' with their own rules and timeouts, and simple request-reply timeouts represent the same thing. Having said that though, there might be ways of implementing conversation nonce that involve some function of the previous message....

fjl commented 5 years ago

It's fine to keep temporary (conversation-scoped) state, just want to avoid a dependency on long-term state we must keep on disk.

fjl commented 5 years ago

Will add a note about this to the requirements document when I'm done reformatting it.

FrankSzendzielarz commented 5 years ago

Sure? I can add a PR for it now. If you are in progress reformatting I will wait though.

fjl commented 5 years ago

After some thinking, here is a proposal for a signature-less scheme using one-time asymmetric encryption. This is vaguely similar to the RLPx handshake but uses simple key-exchange instead of ECDH:

When B wants to send a FINDNODE packet to A, it first checks whether it has secrets for conversation with A. If secrets are present, the packet uses those for authentication/encryption. Otherwise the first packet sent is TALKTOME. In either case A may ask for identification using WHOAREYOU.
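The decision described above (use stored secrets if present, otherwise start with TALKTOME) could be sketched as follows. This is only an illustration: the `Secrets` and `SecretCache` names, the in-memory dictionary, and the 600-second lifetime are assumptions, not part of the proposal. The cache is deliberately memory-only, in line with the requirement to avoid long-term state on disk.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Secrets:
    """Key material agreed in the handshake (field names follow the proposal)."""
    mac_key: bytes
    enc_key_up: bytes
    enc_key_down: bytes
    created: float = field(default_factory=time.monotonic)


class SecretCache:
    """Hypothetical in-memory store of per-node secrets; nothing is persisted."""

    def __init__(self, max_age_seconds: float = 600.0):  # lifetime is an assumption
        self._by_node: Dict[bytes, Secrets] = {}
        self._max_age = max_age_seconds

    def put(self, node_id: bytes, secrets: Secrets) -> None:
        self._by_node[node_id] = secrets

    def get(self, node_id: bytes) -> Optional[Secrets]:
        entry = self._by_node.get(node_id)
        if entry is not None and time.monotonic() - entry.created > self._max_age:
            # Expired secrets are dropped; the next packet triggers a new handshake.
            del self._by_node[node_id]
            return None
        return entry


def first_packet_type(cache: SecretCache, dest_node_id: bytes,
                      wanted: str = "FINDNODE") -> str:
    """Return the packet B sends first: the real request, or TALKTOME."""
    return wanted if cache.get(dest_node_id) is not None else "TALKTOME"
```

If A has forgotten the secrets in the meantime, it can still answer with WHOAREYOU, so an expired or missing cache entry on either side only costs one extra round trip.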

A's WHOAREYOU includes a nonce value (id_nonce) to be signed by B and the highest known sequence number of B's record (enr_seq_b). B sends back an asymmetrically encrypted response (auth_response) containing mac_key, enc_key_up, enc_key_down as well as the current version of its node record if enr_seq_b is lower than the current sequence number. auth_response is encrypted to one of A's public keys listed in A's ENR.

A <-- B    TALKTOME [node_id_b]

A --> B    WHOAREYOU [enr_seq_b, id_nonce]

A <-- B    FINDNODE [[enc_scheme_name, auth_response, auth], sym_encrypt(enc_key_up, ...)]
   where enc_scheme_name = name of encryption scheme used for auth_response & further comms
         auth_response = encrypt(auth_data)
         auth_data = [id_nonce_sig, mac_key, enc_key_up, enc_key_down, record_of_b]
         id_nonce_sig = signature over id_nonce using identity scheme of record
         record_of_b = current ENR of B or empty list if enr_seq_b is current seq
         mac_key, enc_key_up, enc_key_down = random values
         auth = HMAC(mac_key, packet_hash)

A decrypts auth_response and checks id_nonce_sig against node_id_b using identity scheme of B's record. It can then authenticate/decrypt the actual packet.
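A minimal sketch of the symmetric part of this exchange, covering only the random key generation and auth = HMAC(mac_key, packet_hash). The proposal does not name the hash functions or key sizes, so SHA-256 for both packet_hash and the HMAC, and the 32/16-byte key lengths, are assumptions. The asymmetric encryption of auth_data and the id_nonce signature are left out.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def new_session_keys() -> Tuple[bytes, bytes, bytes]:
    """Generate mac_key, enc_key_up, enc_key_down as random values.

    Key sizes are assumptions; the proposal only says 'random values'.
    """
    return secrets.token_bytes(32), secrets.token_bytes(16), secrets.token_bytes(16)


def packet_auth(mac_key: bytes, packet_body: bytes) -> bytes:
    """auth = HMAC(mac_key, packet_hash), assuming SHA-256 throughout."""
    packet_hash = hashlib.sha256(packet_body).digest()
    return hmac.new(mac_key, packet_hash, hashlib.sha256).digest()


def check_packet_auth(mac_key: bytes, packet_body: bytes, auth: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(packet_auth(mac_key, packet_body), auth)
```

The receiver can only run `check_packet_auth` after it has decrypted auth_response and learned mac_key, which is why A decrypts first and authenticates the packet second.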

TALKTOME and WHOAREYOU can be replayed. Replaying TALKTOME doesn't really buy the attacker anything:

WHOAREYOU is very small: amplification attacks are infeasible
id_nonce can't be signed without possession of the node key

Replaying WHOAREYOU cannot lead to impersonation because the attacker won't be able to decrypt the reply, but it can be used for DoS purposes. Maybe some of that risk could be avoided by transmitting another nonce in TALKTOME.

All packets following TALKTOME/WHOAREYOU from B to A are mutually authenticated and encrypted. They can be replay-protected using a simple sequence number.
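The "simple sequence number" could look roughly like this on the receiving side. The `ReplayGuard` name and the strictly-increasing rule are assumptions; the sequence number would also have to be covered by the packet authentication, otherwise an attacker could rewrite it.

```python
class ReplayGuard:
    """Hypothetical per-peer replay check: accept only increasing sequence numbers."""

    def __init__(self) -> None:
        self._highest_seen = -1

    def accept(self, seq: int) -> bool:
        """Return True if the packet is fresh, False if it is a replay or stale."""
        if seq <= self._highest_seen:
            return False
        self._highest_seen = seq
        return True
```

A strictly increasing counter also rejects legitimately reordered UDP packets. If that turns out to matter, a sliding window of recently seen numbers would be one alternative.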

A --> B    NEIGHBORS [auth, sym_encrypt(enc_key_down, ...)]
   where auth = HMAC(mac_key, packet_hash)

A <-- B    PING [auth, sym_encrypt(enc_key_up, ...)]
   where auth = HMAC(mac_key, packet_hash)
...

There are a couple nice things about this idea:

We can change encryption/authentication scheme at any time because they are identified by name. Support on A's side can be announced through ENR because B has to know A's record before attempting communication.
Initial encryption/authentication scheme can be HMAC + AES-CTR, but we could also go with AES-GCM which provides both and is HW-accelerated.
Obfuscation and all the complexity associated with it is not needed because we can use real encryption.
Either side can rekey at any time by sending WHOAREYOU

There are some downsides to this:

Asymmetric encryption is expensive.
We need to store and recall key material for every packet sent.
Need to be really careful about id_nonce because it shouldn't open up attack vectors where any data can be signed with node key.

FrankSzendzielarz commented 5 years ago

OK so I spent some time looking at this and right now it's too vague for me to get my head around:

1) TalkToMe advertises IP + node id combination to eavesdroppers or any kind of logger.
2) Still requires obfuscation as it is easy for censors or traffic monitors to look for this unobfuscated packet and then block all subsequent traffic.
3) Needs several detailed explanations of differing scenarios. Eg: where is the correlator between TalkToMe and WhoAreYou messages? Is WhoAreYou a mandatory response to TalkToMe or not?
4) I have been under the impression that encryption for Discovery was something we wanted to avoid or make optional, as we've been talking about that for literally months now and that was supposedly overkill, hence the entire Obfuscation use case.
5) Why not DTLS or encrypted transports if we are going this route?
6) I think we can and should explore this further, why not, but I am getting increasingly conscious of the fact that solving the original issue is fairly trivial and that time is passing by. In its current presentation it seems that what you are proposing is a radically different protocol that needs another round of specification, review and discussion.

My response is that I think because you are implementing a prototype of this and refining your spec, let's wait until you have this in a more concrete form and review next week. In the meantime I would consider the original solution of just adding the key/scheme in conversation-starting messages, and then pitch that against your alternative protocol when it's clearer.

fjl commented 5 years ago

I have implemented this to confirm that the idea is viable. It works, but several details still need work. Let me describe what I have and how it influences the spec so far:

Encoding of TALKTOME:

message          = magic || [src-node-id]
magic            = sha256("TALKTOME" || dest-node-id)

Encoding of WHOAREYOU:

message          = magic || [src-node-id, id-nonce, enr-seq]
magic            = sha256("WHOAREYOU" || dest-node-id)
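As an illustration of the two magic values, here is how a recipient might recognise TALKTOME and WHOAREYOU by their prefix. The string is assumed to be ASCII-encoded and concatenated directly with the raw node id; the `classify` helper is hypothetical and the list payload after the magic is treated as opaque.

```python
import hashlib

MAGIC_SIZE = 32  # length of a SHA-256 digest


def magic(packet_name: str, dest_node_id: bytes) -> bytes:
    """magic = sha256(packet_name || dest-node-id)."""
    return hashlib.sha256(packet_name.encode("ascii") + dest_node_id).digest()


def classify(packet: bytes, local_node_id: bytes) -> str:
    """Hypothetical helper: decide how to parse an incoming packet."""
    prefix = packet[:MAGIC_SIZE]
    for name in ("TALKTOME", "WHOAREYOU"):
        if prefix == magic(name, local_node_id):
            return name
    # Anything else is assumed to be a regular, encrypted message.
    return "OTHER"
```

Because the destination node id is hashed in, the prefix differs for every recipient, so there is no single fixed byte pattern shared by all handshake packets.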

All other messages:

message          = src-node-id || message-auth || message-body
message-body     = encrypt_aesgcm(write-key, ptype || rlp(message))
message-auth     = {auth-tag, [auth-tag, auth-scheme-name, auth-response]}
id-tag           = identifier assigned in handshake
auth-tag         = AES-GCM nonce
auth-scheme-name = "gcm"
auth-response    = ecies_encrypt(dest-node-pubkey, [id-nonce-sig, record, read-key, write-key])
id-nonce-sig     = sign(sha256("discovery-id-nonce" || id-nonce))
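The input to id-nonce-sig can be sketched as below. Only the digest computation is shown; the actual signing depends on the identity scheme of the record and is omitted. The 32-byte nonce size is an assumption.

```python
import hashlib
import secrets

ID_NONCE_PREFIX = b"discovery-id-nonce"


def new_id_nonce(size: int = 32) -> bytes:
    """Random challenge for WHOAREYOU (size is an assumption)."""
    return secrets.token_bytes(size)


def id_nonce_signing_input(id_nonce: bytes) -> bytes:
    """Digest that gets signed: sha256("discovery-id-nonce" || id-nonce)."""
    return hashlib.sha256(ID_NONCE_PREFIX + id_nonce).digest()
```

The fixed prefix acts as domain separation: the challenger chooses id-nonce but never the digest itself, which speaks to the earlier concern that the challenge must not let an attacker get arbitrary data signed with the node key.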

The handshake works as described above (scenario: B wants to talk to A)

B sends TALKTOME
A sends WHOAREYOU including id-nonce challenge
B sends its first real message (e.g. FINDNODE) and includes encrypted auth-response in the header.
A first verifies auth-response, then auth-tag and then replies to the message if it's valid.

About the spec changes needed: The spec would include the new packets and a longer description of the handshake. The asymmetric encryption scheme is already described in rlpx.md, we could just link it. Info about obfuscation can be removed from the spec. The IAM packet would also be removed.

Addressing your concerns one by one:

TalkToMe advertises IP + node id combination to eavesdroppers or any kind of logger.

I'm still trying to find a solution, but communication must include the source node ID in order to be readable by the recipient. My solution above doesn't address this yet, but I'm aware I need to look into this more.

Still requires obfuscation as it is easy for censors or traffic monitors to look for this unobfuscated packet and then block all subsequent traffic.

This is resolved now because plaintext packets have a unique prefix per destination node.

Needs several detailed explanations of differing scenarios. Eg: where is the correlator between TalkToMe and WhoAreYou messages? Is WhoAreYou a mandatory response to TalkToMe or not?

Yes, in this scheme WHOAREYOU is a mandatory response to TALKTOME. Correlating the two is simple because there cannot be any other communication between the two participants until the handshake is over.
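One way to picture that correlation on A's side is a table with at most one outstanding challenge per source node id. This is a sketch under assumptions: the `PendingChallenges` name, the 32-byte nonce, and the 5-second timeout are not from the prototype.

```python
import secrets
import time
from typing import Dict, Optional, Tuple


class PendingChallenges:
    """Hypothetical table of outstanding WHOAREYOU challenges, keyed by node id."""

    def __init__(self, timeout_seconds: float = 5.0):  # timeout is an assumption
        self._pending: Dict[bytes, Tuple[bytes, float]] = {}
        self._timeout = timeout_seconds

    def on_talktome(self, src_node_id: bytes) -> bytes:
        """Record a fresh id-nonce for this node and return it for WHOAREYOU."""
        id_nonce = secrets.token_bytes(32)
        # A newer TALKTOME from the same node replaces the older challenge.
        self._pending[src_node_id] = (id_nonce, time.monotonic())
        return id_nonce

    def take(self, src_node_id: bytes) -> Optional[bytes]:
        """Return the id-nonce to verify auth-response against, at most once."""
        entry = self._pending.pop(src_node_id, None)
        if entry is None:
            return None
        id_nonce, issued_at = entry
        if time.monotonic() - issued_at > self._timeout:
            return None
        return id_nonce
```

Since this state is created by an unauthenticated packet, a real implementation would presumably also cap the table size.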

I have been under the impression that encryption for Discovery was something we wanted to avoid or make optional, as we've been talking about that for literally months now and that was supposedly overkill, hence the entire Obfuscation use case.

It depends on how cheap it is. I would certainly prefer standard encryption over a weird custom obfuscation scheme if the encryption is light on resources.

Since AES-GCM is a lot cheaper than verifying an ECDSA signature, packet authentication using this scheme is more efficient than v4. The asymmetric encryption step is more expensive than signature verification though. This means the remaining question is how quickly we can offset this one-time cost. I will determine this with a benchmark.
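The break-even question can be stated as simple arithmetic: the new scheme wins once the per-packet savings add up to more than the handshake cost. The helper below is a sketch only; it takes costs as parameters in any consistent unit and contains no measured numbers.

```python
import math
from typing import Optional


def break_even_packets(handshake_cost: float,
                       per_packet_sym_cost: float,
                       per_packet_sig_cost: float) -> Optional[int]:
    """Smallest packet count n for which
    handshake_cost + n * per_packet_sym_cost < n * per_packet_sig_cost.

    Returns None if symmetric authentication is not cheaper per packet.
    """
    saving = per_packet_sig_cost - per_packet_sym_cost
    if saving <= 0:
        return None
    return math.floor(handshake_cost / saving) + 1
```

For example, with placeholder costs of 300 for the handshake, 1 per AES-GCM packet and 101 per signature check, the function returns 4: at three packets both approaches cost 303, and from the fourth packet on the session-based scheme is cheaper. These values are purely illustrative and say nothing about real hardware.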

Why not DTLS or encrypted transports if we are going this route

If you take a deeper look at DTLS, you'll find that it is very complicated and not widely used/implemented.

I think we can and should explore this further why not, but I am getting increasingly conscious of the fact that solving the original issue is fairly trivial and that time is passing by.....

I can understand you are getting impatient. It has taken me one week to explore this idea and I think it is worth thinking this scheme through because it solves packet authentication and encryption/obfuscation in a fundamental way, with the minimum number of round-trips needed. It also bakes ENR exchange into the protocol in a way that makes sharing your record mandatory if you want to talk to anyone.

in its current presentation it seems that what you are proposing is a radically different protocol that needs another round of specification, review and discussion.

I was never really certain about how the whole WHOAREYOU / IAM flow would work. This scheme is my attempt to define how it can work. (It turns out IAM is not needed)

FrankSzendzielarz commented 5 years ago

Maybe the TalkToMe can just be XORd with the destid. If the destination node is trying to evade DPI, then once its own node id is identified at that IP, it is already compromised.
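A sketch of what XORing with the destination id could mean in practice. Repeating the node id to cover packets longer than the id itself is an assumption; the suggestion above does not say how longer packets would be handled.

```python
from itertools import cycle


def xor_mask(data: bytes, dest_node_id: bytes) -> bytes:
    """XOR every byte of data with the (repeated) destination node id.

    Applying the function twice with the same id restores the original bytes.
    """
    if not dest_node_id:
        raise ValueError("dest_node_id must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(dest_node_id)))
```

This is obfuscation, not encryption: anyone who knows the destination node id can undo it, which matches the argument that a node whose id is already known at that IP is compromised anyway.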

FrankSzendzielarz commented 5 years ago

Also, can we please try to still incorporate a byte or something over the encrypted channel describing the conversation? I do think that this might still evolve to allow multiple concurrent communications between nodes and some conversation correlator allows for extensibility.

fjl commented 5 years ago

Yes, we still need a request/response correlator.
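A request/response correlator of the "a byte or something" kind might look like this. The one-byte id space and the `Correlator` name are assumptions made for illustration; the thread does not fix a format.

```python
from typing import Dict, Optional


class Correlator:
    """Hypothetical per-peer tracker matching responses to outstanding requests."""

    ID_SPACE = 256  # one byte, as loosely suggested above

    def __init__(self) -> None:
        self._next_id = 0
        self._pending: Dict[int, str] = {}

    def new_request(self, kind: str) -> int:
        """Allocate a free request id and remember which request it belongs to."""
        for _ in range(self.ID_SPACE):
            request_id = self._next_id
            self._next_id = (self._next_id + 1) % self.ID_SPACE
            if request_id not in self._pending:
                self._pending[request_id] = kind
                return request_id
        raise RuntimeError("too many outstanding requests")

    def match_response(self, request_id: int) -> Optional[str]:
        """Return the kind of request this response answers, or None if unknown."""
        return self._pending.pop(request_id, None)
```

With an id carried inside the encrypted body, several conversations between the same two nodes can run concurrently without relying on packet ordering.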

ethereum / devp2p

discv5 packet signatures / handshake #60