zcash / zips

Zcash Improvement Proposals
https://zips.z.cash
MIT License
274 stars 156 forks source link

[protocol spec] Resolve TODOs in GroupHash/Unlinkability Analysis #202

Open defuse opened 5 years ago

defuse commented 5 years ago

(Copying this comment from the other ticket where I was doing my audit).

Checking the "TODO: check whether this is justified" in the unlinkability proof for DiversifyHash:

Let (P, pk) be a diversified address with secret key x, i.e. pk = [x]P. Generating a new diversified address for the same secret key but a random distinct diversifier gives (Q, qk) where qk = [x]Q and, making the random oracle assumption about GroupHash, Q=[y]P for some random y.

Looking at ElGamal, we can suppose Alice chose (P, pk) as her public key. Now, when Bob encrypts a message corresponding to OJ, he chooses a random y and outputs [y]P and [y][x]P = [yx]P = [x][y]P. The distribution of Bob's output, [x][y]P and [y]P for a random y, is the same as the distribution of generating a new diversified address, qk = [x][y]P and Q=[y]P for a random y. So that part is good.

Let's see if I can make the reduction to ElGamal key privacy explicit. I'm not confident this is correct, and see the caveat below.

Consider the following experiments, for adversaries A and bit b.

Experiment ExpAun-b:

  1. (P1, pk1, psk1) <- NewDiversifiedAddress()
  2. (P2, pk2, psk2) <- NewDiversifiedAddress()
  3. (Q, qk) <- Diversify(Pb, pskb).
  4. Return A(P1, pk1, P1, pk2, Q, qk).

Experiment ExpAik-tpa-b:

  1. (P1, pk1, psk1) <- GenerateElGamal()
  2. (P2, pk2, psk2) <- GenerateElGamal()
  3. (Q, qk) <- ElGamalEncryptTrivialMessage(pkb).
  4. Return A(P1, pk1, P1, pk2, Q, qk).

...where....

Define IK-TPA ("T" stands for "trivial") and AdvAik-tpa similarly to IK-CPA . Define Unlinkability and AdvAun similarly to IK-CPA as well.

Obviously IK-CPA (instantiated with ElGamal) implies IK-TPA so AdvAik-tpa is negligible for all relevant adversaries A. We want to argue that AdvAun is negligible for the same set of adversaries. Let A be an adversary from that set and let b be any bit. It's true that Pr[ExpAun-b = 1] = Pr[ExpAik-tpa-b = 1], since...

So AdvAun = AdvAik-tpa. So IK-CPA for ElGamal implies Unlinkability.

However, as noted in the spec, Unlinkability as defined here might not be the right property. For example, if you're given ~244 diversified addresses and are promised that they are all either generated independently or generated by randomly diversifying one single address, you can tell which is the case since you expect collisions in the latter case but not the former. This problem shows up in the definitions above where we required Diversify to make sure it doesn't select the same diversifier by accident. If Diversify just selected randomly, then the distributions over (Q, qk) would be different since Pr[qk = pkb] = 2-88 in ExpAun-b and much smaller in ExpAik-tpa, so the argument's success would depend on whether 2-88 still counts as negligible.

The spec recommends generating diversifiers randomly, and in practice people will just generate them randomly and publish them expecting them all not to be linkable, so it might be better to recommend generating no more than ~220 diversified addresses from the same secret key (to keep the probability of collisions low), and define security by giving the adversary two sets of 220 diversifications... or something. I'll have to think about this more.

(*) The Wikipedia page for ElGamal doesn't say what distribution Alice samples her generator from, so I'm assuming it doesn't affect the security properties of ElGamal if Alice chooses her generator the same way a diversified base is chosen when generating a new address.

daira commented 5 years ago

@defuse wrote:

For example, if you're given ~244 diversified addresses and are promised that they are all either generated independently or generated by randomly diversifying one single address, you can tell which is the case since you expect collisions in the latter case but not the former.

No, you expect diversifier collisions in the former case but not the latter (if you're generating the latter via a permutation as specified in ZIP 32). You don't expect pkd collisions in either case.