Use an IK-CCA encryption scheme

AntoineRondelet commented 5 years ago

For now we are using RSA-OAEP to encrypt the notes. That's fine for the PoC, but we will need to fix that at some point and switch to an IK-CCA scheme.

Some reading on the topic: https://iacr.org/archive/asiacrypt2001/22480568.pdf and https://github.com/zcash/zcash/issues/558

HarryR commented 5 years ago

What about NaCl or libsodium?

rrtoledo commented 5 years ago

You mean DH encryption on Curve25519 (X25519)?

HarryR commented 5 years ago

Let's break this down and make it a useful ticket rather than a vague one, because I think this ticket is lacking details (a lot of tickets on Clearmatics repos seem to be lacking details), so let's add some.

There are two arbitrary length ciphertext fields used in the Groth16 Mixer contract:

ciphertext1
ciphertext2

These are encrypted using PKCS#1 OAEP by splitting the input into n chunks of 62 bytes each and outputting n ciphertexts of 128 bytes each, e.g. at: https://github.com/clearmatics/zeth/blob/e0b4ed4403962fb6fbbabe3c8d4678317865bf88/pyClient/zethUtils.py#L35

These are passed, encoded as base64 strings, as inputs to the mix function: https://github.com/clearmatics/zeth/blob/e0b4ed4403962fb6fbbabe3c8d4678317865bf88/zeth-contracts/contracts/Groth16Mixer.sol#L17

These are then emitted as events: https://github.com/clearmatics/zeth/blob/e0b4ed4403962fb6fbbabe3c8d4678317865bf88/zeth-contracts/contracts/BaseMixer.sol#L184
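To make the size overhead of the chunking described above concrete, here is a minimal sketch of the arithmetic. It only uses the numbers quoted in this comment (62-byte chunks, 128-byte blocks for a 1024-bit modulus); the function names are hypothetical and this is not the code in zethUtils.py.

```python
import base64
import math
from typing import List

CHUNK_BYTES = 62   # plaintext bytes per RSA-OAEP block, as quoted in the thread
BLOCK_BYTES = 128  # ciphertext bytes per block for a 1024-bit RSA modulus


def split_chunks(plaintext: bytes, size: int = CHUNK_BYTES) -> List[bytes]:
    """Split the plaintext into consecutive chunks of at most `size` bytes."""
    return [plaintext[i:i + size] for i in range(0, len(plaintext), size)]


def ciphertext_size(plaintext_len: int) -> int:
    """Raw ciphertext length: one 128-byte block per 62-byte chunk."""
    return math.ceil(plaintext_len / CHUNK_BYTES) * BLOCK_BYTES


def base64_size(raw_len: int) -> int:
    """Length of the padded base64 encoding of `raw_len` bytes."""
    return 4 * math.ceil(raw_len / 3)
```

Under these assumptions a 63-byte note already needs two blocks (256 bytes), and base64 adds roughly another third on top of that before the value reaches the contract.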

The values are JSON blobs which contain enough info for the recipient to spend the coin, e.g.

  return {
    aPK: recipientApk,
    value: value,
    rho: randomness.rho,
    trapR: randomness.trapR
  };

Initial thoughts:

There is no verifiable encryption, nor is the value of the note JSON bound in any way to the zkSNARK circuit, it's possible to encrypt arbitrary crap
JSON... why?
RSA-OAEP... why?
Base64 encoding of data when we have a binary safe protocol... why?
1024-bit RSA keys... why?
etc. etc.

The values of ciphertext1 and ciphertext2 are absolutely essential for the recipient(s) to spend the notes, but the RSA keys seem to be entirely separate to the protocol - just used so the recipient can decrypt the result to recover the spending keys/entropy for the note.

The ciphertexts aren't in a SNARK friendly format (Base64, JSON, hex or UTF-8 encoded strings in the JSON etc.).

There is no verifiable encryption of the ciphertext, which introduces malleability in the protocol, in the sense that neither the SNARK circuit nor the smart contracts verify that the ciphertext fields are legitimately encrypted and will be accessible by the recipient.

I think there are really 3 problems here, which are more systemic than just a choice of encryption scheme.

1) Verifiable encryption, which only the recipient can open, of the information the recipient needs to spend the outputs.
2) Using RSA keys for out-of-band communication.
3) Using JSON and Base64 encoding, which probably isn't necessary and isn't friendly for SNARKs or Solidity contracts.

If we start looking into a scheme which solves these problems you end up with something closer to the original ZCash Sprout circuit, and if you do it well you'll end up with something much closer to the ZCash Sapling circuit.

Is this worth pursuing?

AntoineRondelet commented 5 years ago

Hi Harry, thanks for your comment!

Many things here:

There is no verifiable encryption, nor is the value of the note JSON bound in any way to the zkSNARK circuit, it's possible to encrypt arbitrary crap

There is no verifiable encryption of the ciphertext, this introduces malleability in the protocol

Exactly! This is why I always put the emphasis on the fact that the current version of zeth is a PoC and nowhere close to being a product. Furthermore, this issue is:

mentioned in the paper
mentioned in the issue: https://github.com/clearmatics/zeth/issues/7

JSON... why? RSA-OAEP... why? Base64 encoding of data when we have a binary safe protocol... why? 1024-bit RSA keys... why?

No specific reason here. The current state of the repo showcases a proof of concept, so we can use whatever encryption scheme. I picked RSA arbitrarily (and used small keys for the PoC). This issue is here especially because we know that RSA is definitely not a good choice (alright for PoCing but not for anything serious), hence why we need to switch to another scheme, which btw needs to be IK-CCA as mentioned in the issue. I also chose the format (json), encoding and so on completely arbitrarily so we could get a PoC out. The purpose here is to propose a protocol, now we need to refine all of this and polish everything, but we're not there yet. We'll come to it as we make progress.

And btw

I think this ticket is lacking details

That's true indeed. I opened a bunch of tickets a while ago so we have a "Todo list" and a bunch of issues we know need to be tackled somewhere. Further details wouldn't be a luxury though I agree; especially if people want to contribute - which would be very cool. Thanks for the feedback.

HarryR commented 5 years ago

Ok, let me write this down, because I think it'll be helpful to explore the problem. <crypto_rambling>

We want to make the protocol 'rigid', which means that there is no way for the protocol to be altered by any party in a way which deviates from 'correct' - where correctness implies that all of the things we could have considered have been accounted for. This is why IK-CCA is very important right? But, I'm having a hard time with these acronyms, e.g. there are many more than just:

indistinguishability of keys against adaptive chosen ciphertext attacks (IK-CCA)
indistinguishability against adaptive chosen ciphertext attacks (IND-CCA)

So, let's focus on something more intuitive and we can figure out which terms apply later? What matters is making it impossible for an honest participant to stray from 'the protocol', and a malicious participant would have to break some underlying algorithm such as finding a pre-image to a hash or the discrete-log of a group element, meaning we end up relying on the security proofs of the fundamentals with no way to bypass them.

The main problem at the moment are the two fields, or N fields if we want to generalise, which contain cipher-text encrypted only for the recipient, that allows them to spend the coin. But, is this really a problem with the current protocol... It allows somebody to burn their own cash, but does it convince the recipient that they could spend it?

You can encrypt false data for the recipient, and they can successfully decrypt it, but it may not let them spend the coin - not just that but because it's an essentially arbitrary string you could potentially crash their client or make it do unintended things because the transfer of information between the sender and receiver is an implicit protocol.

However, there are advantages to this approach: it allows the algorithms for the implicit protocol to change and adapt, and it doesn't force you to implement crypto algorithms in the zkSNARK circuit, which is significantly more difficult to change.

But, we do need to certify that the coin is spendable by the recipient, regardless of however the information they need is communicated to them, and we assume that however they receive that information doesn't open them up to exploits or vulnerabilities.

I guess, next, I need to look at whether or not it's possible to make a recipient think a coin you've sent them is spendable, but in reality it's actually not (or, is impossible). I'll get onto that in a bit.

</crypto_rambling>

TL;DR having the out-of-band communication use an arbitrary protocol is not a bad thing. It could be made more efficient than the current choices (which is literally just a case of using shorter keys, like ed25519, and bit-packing the fields so they take up less space on the blockchain), but I think this is a good balance, and next I need to focus on the more in-depth side of the circuit.
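As a rough illustration of the bit-packing point, the sketch below packs the four note fields into a fixed-width binary layout and compares it with a JSON encoding. The field widths (32 bytes for aPK, rho and trapR, 8 bytes for value) and the helper names are assumptions made for the example, not the sizes zeth actually uses.

```python
import json
import struct

# Assumed layout (hypothetical widths): aPK | value | rho | trapR, big-endian, no padding.
NOTE_FORMAT = ">32sQ32s32s"
NOTE_SIZE = struct.calcsize(NOTE_FORMAT)


def pack_note(a_pk: bytes, value: int, rho: bytes, trap_r: bytes) -> bytes:
    """Serialize a note into a fixed-width binary blob."""
    if not (len(a_pk) == len(rho) == len(trap_r) == 32):
        raise ValueError("aPK, rho and trapR must be 32 bytes in this sketch")
    return struct.pack(NOTE_FORMAT, a_pk, value, rho, trap_r)


def unpack_note(blob: bytes) -> dict:
    """Inverse of pack_note."""
    a_pk, value, rho, trap_r = struct.unpack(NOTE_FORMAT, blob)
    return {"aPK": a_pk, "value": value, "rho": rho, "trapR": trap_r}


def json_note(a_pk: bytes, value: int, rho: bytes, trap_r: bytes) -> bytes:
    """The same note as a JSON blob with hex-encoded fields, for comparison."""
    return json.dumps(
        {"aPK": a_pk.hex(), "value": value, "rho": rho.hex(), "trapR": trap_r.hex()}
    ).encode()
```

With these assumed widths the packed note is 104 bytes, while the hex-in-JSON form is more than twice that before any base64 or RSA chunking overhead is added.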

HarryR commented 5 years ago

So we have some problems, let's look at the disputes which could arise:

Proof that I could not decrypt the payment ciphertext
Proof that I could decrypt the payment ciphertext, but I couldn't spend it
Proof that I encrypted it correctly for you, and you're lying
Proof that I constructed the coin correctly, and encrypted it correctly for you, and you're lying

Can you think of any more types of disputes?

I think the protocol should handle these, and be able to prove (with zero-knowledge) that we've done what we should have and are 'honestly' following the protocol.

AntoineRondelet commented 5 years ago

I am not sure I follow this part:

We want to make the protocol 'rigid', which means that there is no way for the protocol to be altered by any party in a way which deviates from 'correct' - where correctness implies that all of the things we could have considered have been accounted for. This is why IK-CCA is very important right?

IK-CCA (or "key private") is necessary here to make sure that ciphertexts cannot be linked to the public key used to encrypt them (the one of the recipient of the payment), or to other ciphertexts which were produced with the same public key. This basically aims to "keep the relationship between sender and recipient hidden".
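To show why key privacy is a separate property from confidentiality, here is a toy sketch using textbook RSA (no OAEP padding) with tiny, hypothetical moduli. It is not the scheme used in zeth and not a security analysis; it only illustrates the general idea that a ciphertext can leak which public key produced it, here simply because a ciphertext that is larger than the smaller modulus can only have come from the larger key.

```python
import random


def toy_rsa_encrypt(m: int, e: int, n: int) -> int:
    """Textbook RSA encryption: c = m^e mod n (toy parameters, no padding)."""
    return pow(m, e, n)


def guess_key(c: int, n_small: int) -> int:
    """Adversary's guess: a ciphertext >= the smaller modulus must belong to key 1."""
    return 1 if c >= n_small else 0


def distinguishing_rate(n0: int, n1: int, e: int, trials: int, seed: int = 0) -> float:
    """Fraction of trials in which the adversary identifies the recipient key.

    Assumes n0 < n1. A key-private scheme would keep this close to 0.5.
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        b = rng.randrange(2)            # secretly pick the recipient key
        n = (n0, n1)[b]
        m = rng.randrange(2, n0)        # plaintext valid under both keys
        c = toy_rsa_encrypt(m, e, n)
        correct += guess_key(c, n0) == b
    return correct / trials


# Hypothetical toy keys: n0 = 61 * 53, n1 = 89 * 97, shared public exponent 17.
RATE = distinguishing_rate(3233, 8633, 17, trials=2000)
```

With these toy keys the guess is right noticeably more often than a coin flip, which is exactly the kind of linkage between ciphertext and recipient that an IK-CCA scheme is meant to rule out.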

The main problem at the moment are the two fields, or N fields if we want to generalise, which contain cipher-text encrypted only for the recipient, that allows them to spend the coin. But, is this really a problem with the current protocol... It allows somebody to burn their own cash, but does it convince the recipient that they could spend it?

This is only one step in the receiving process. (See section 3.4.5 Payment reception https://arxiv.org/pdf/1904.00905.pdf) In addition to "successfully decrypt a ciphertext", Alice needs to make sure that the agreement she had with Bob has been respected (Bob could just send 1ETH to Alice while they agreed that Bob would send 10ETH for instance). To that end, once the plaintexts are recovered, Alice needs to recompute the associated commitments and check that the recovered commitments c'_i actually match some of the commitments that had been appended by Bob in his tx. Alice also needs to check that the sum of the values of the recovered notes equals the value Bob was supposed to transfer to Alice. If so, the payment is accepted, else, a dispute needs to happen.
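The reception check described above can be sketched roughly as follows. The commitment function here is a stand-in (SHA-256 over fixed-width fields), not the commitment scheme defined in the paper, and the note layout and function names are assumptions for the example.

```python
import hashlib
from typing import Dict, Iterable, List


def commitment(note: Dict) -> bytes:
    """Stand-in commitment over the note fields (NOT zeth's actual commitment)."""
    h = hashlib.sha256()
    h.update(note["aPK"])
    h.update(note["value"].to_bytes(8, "big"))
    h.update(note["rho"])
    h.update(note["trapR"])
    return h.digest()


def accept_payment(
    notes: List[Dict], onchain_commitments: Iterable[bytes], agreed_value: int
) -> bool:
    """Accept only if every recovered note matches an on-chain commitment
    and the note values sum to the amount agreed off-chain."""
    onchain = set(onchain_commitments)
    if any(commitment(n) not in onchain for n in notes):
        return False  # decrypted data does not open any published commitment
    return sum(n["value"] for n in notes) == agreed_value
```

If `accept_payment` returns False the recipient is in the dispute case: the ciphertext may well have decrypted, but it either does not correspond to a published commitment or does not add up to the agreed amount.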

You're right, we need to have a way to solve these disputes. This is a bit tricky as we don't want to have anything related to Alice and Bob's agreement on chain (it would leak all the data we want to hide). Being aware of that, we briefly mentioned the need for a dispute mechanism in section 4 of the paper, in the paragraph "If decryption of zeth notes fails". Here's an extract of this paragraph:

In case Bob receives a payment that is not correct or just fails to receive any payment from Alice, we propose to use a dispute mechanism coupled with a reputation system (again, out of scope of this paper) which could be a way to stop malicious senders. In such scenario, Bob could make a zero-knowledge proof that no ciphertext broadcast by Mxr matched the contract he had with Alice. In that case, Alice’s reputation on the system could be severely damaged and may result in fewer users willing to transact with her

We could probably extend the statement to fit our needs. I'd need to think about that (but that's another issue which is not strictly related to IK-CCA encryption. I'm happy to discuss this in another issue dedicated to the dispute mechanism)

AntoineRondelet commented 5 years ago

Closing as the corresponding PR has been merged

AntoineRondelet commented 4 years ago

Reopening this issue as we need to have both the shared secret as well as the epk as input for the KDF (cc: @riemann89)

dtebbs commented 4 years ago

Reopening this issue as we need to have both the shared secret as well as the epk as input for the KDF (cc: @riemann89)

Not sure if I've understood this correctly, but I recently happened to be looking into exactly how the key was derived. Just giving these links in case it's relevant: Client uses Box, defined here: https://github.com/pyca/pynacl/blob/master/src/nacl/public.py#L179 which appears to end up here: https://github.com/jedisct1/libsodium/blob/master/src/libsodium/crypto_box/curve25519xsalsa20poly1305/box_curve25519xsalsa20poly1305.c#L35

riemann89 commented 4 years ago

When the shared key is generated by the KDF (hsalsa20 in this case), it makes use only of the secret given by sk*pk. More specifically, it runs hsalsa20 with the 32-byte secret s = sk*pk as the key and a 16-byte zero block as input.

See:

https://github.com/jedisct1/libsodium/blob/master/src/libsodium/crypto_box/curve25519xsalsa20poly1305/box_curve25519xsalsa20poly1305.c#L45

and

https://github.com/jedisct1/libsodium/blob/master/src/libsodium/crypto_core/hsalsa20/ref2/core_hsalsa20_ref2.c#L17

To avoid the problem shown in Section 3.1 of the DHAES paper, we also need to pass pk as a parameter of the KDF.
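A minimal sketch of the difference, assuming 32-byte X25519-style values: the first function derives the key from the shared secret alone, the second also feeds in the ephemeral public key. The hash (BLAKE2b), the personalization tag and the function names are assumptions for illustration; this is neither what libsodium does nor the construction that zeth ended up with.

```python
import hashlib


def kdf_shared_only(shared_secret: bytes) -> bytes:
    """Key derived from the DH shared secret alone (the situation described above)."""
    return hashlib.blake2b(shared_secret, digest_size=32).digest()


def kdf_with_epk(epk: bytes, shared_secret: bytes) -> bytes:
    """Key derived from both the ephemeral public key and the shared secret.

    Both inputs are assumed to be exactly 32 bytes, so the concatenation
    is unambiguous. The personalization tag is a hypothetical domain separator.
    """
    if len(epk) != 32 or len(shared_secret) != 32:
        raise ValueError("this sketch assumes 32-byte inputs")
    return hashlib.blake2b(
        epk + shared_secret, digest_size=32, person=b"zeth-kdf-sketch"
    ).digest()
```

The point of the second form is that two ciphertexts carrying different ephemeral keys can never end up under the same symmetric key, even if they happen to yield the same shared secret.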

dtebbs commented 4 years ago

@riemann89 Thanks for the clarification

clearmatics / zeth

Use an IK-CCA encryption scheme #2