
Threat model and protocol for distributed setup #18

Closed: cjpatton closed this issue 1 year ago

cjpatton commented 2 years ago

The "setup" algorithm defined in the spec specifies the generation of the public parameter used by the clients and verification parameter of each of the aggregators. This algorithm is said to run "out-of-band", effectively punting this consideration to applications. In particular, this is a problem that PPM will have to solve: see https://github.com/abetterinternet/ppm-specification/issues/161.

We know that running the setup algorithm securely is crucial for correctness; however, it's not clear how important it is for privacy. As a strawman, suppose that we have one aggregator run the setup algorithm and distribute the output among the parties such that the clients never learn or control the value of the verification parameters. This ought to be good enough for correctness. What's less clear (to me, at least) is whether the aggregator, by choosing the verification parameters maliciously, can force an honest aggregator into revealing something that leads to a privacy violation.
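To make the strawman concrete, here is a rough sketch in Python. The `setup` call is the spec's abstraction; `publish_to_clients` and `send_to_aggregator` are hypothetical stand-ins for whatever transport the application provides:

```python
# Strawman sketch: one aggregator (the "leader") runs setup() and
# distributes the outputs. The transport helpers are hypothetical.
def distribute_setup(vdaf, num_aggregators, publish_to_clients, send_to_aggregator):
    public_param, verify_params = vdaf.setup()

    # Clients only ever see the public parameter.
    publish_to_clients(public_param)

    # Each other aggregator receives its verification parameter over a
    # secure channel; clients must never learn or control these values.
    for j in range(1, num_aggregators):
        send_to_aggregator(j, verify_params[j])

    return verify_params[0]  # the leader keeps its own
```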

If this strawman is not sufficient and we end up needing to do something fancier, then it may be prudent to revisit the "setup algorithm" abstraction. What we would want for prio3 and poplar1 is a protocol for exchanging a shared secret. (The properties we need from this key exchange are TBD, but note that something like a DH key exchange might not suffice. See https://github.com/abetterinternet/ppm-specification/issues/161 for discussion.)

schoppmp commented 2 years ago

One thing to keep in mind is that the randomness used in the verification protocol (at least for Poplar) must be independent of the client keys. If a (semi-honest) helper server colludes with a (malicious) client, the client could use the randomness known to the helper server to generate an invalid share that still passes the verification. So I would propose running the setup phase only after all client shares have been collected and agreed on by the helper servers.
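To illustrate the collusion concern with a toy sketch: if the helper's verification randomness leaks to a client before it submits its shares, the client can search offline for an invalid input that still verifies. All names here are hypothetical:

```python
# Toy illustration only: a malicious client that has learned the
# servers' verification randomness can grind offline for an invalid
# set of shares that nonetheless passes verification.
# `make_invalid_shares` and `passes_verification` are hypothetical.
def forge_accepting_shares(leaked_rand, make_invalid_shares, passes_verification):
    while True:
        shares = make_invalid_shares()  # e.g., tweak the proof share
        if passes_verification(shares, leaked_rand):
            return shares  # invalid measurement, yet it will be accepted
```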

Regarding maliciously generated randomness, are we considering malicious security for the helpers at all? If so, what exact formalization? For example, Poplar only provides malicious security with respect to privacy (but not correctness). Not sure we want to hard-code this threat model or instead leave it up to implementations.

cjpatton commented 2 years ago

> One thing to keep in mind is that the randomness used in the verification protocol (at least for Poplar) must be independent of the client keys. If a (semi-honest) helper server colludes with a (malicious) client, the client could use the randomness known to the helper server to generate an invalid share that still passes the verification.

Most definitely. In fact, this is likely going to be true for most, if not all, VDAFs.

> So I would propose running the setup phase only after all client shares have been collected and agreed on by the helper servers.

This isn't going to be feasible for all deployments of PPM. In particular, a deployment of prio3 might throw report shares away as soon as they're aggregated, in order to limit storage overhead. Regardless, I think the main requirement is that the randomness is kept secret from the clients, which is definitely feasible.

> Regarding maliciously generated randomness, are we considering malicious security for the helpers at all? If so, what exact formalization? For example, Poplar only provides malicious security with respect to privacy (but not correctness). Not sure we want to hard-code this threat model or instead leave it up to implementations.

Yes, as stated in our security considerations: for privacy we're concerned with malicious aggregators, but for correctness we assume the aggregators are honest. This amounts to the clients needing to trust only that at least one aggregator is honest for privacy; the collector, however, needs to trust that the aggregators compute the protocol correctly.

The main point raised in https://github.com/abetterinternet/ppm-specification/issues/161 is that there is a gap in the threat model not addressed by [BBCGGI19] or [BBCGGI21] (Poplar): the shared randomness used for verification may be controlled by the attacker. [BBCGGI19] and [BBCGGI21] side-step this by assuming an ideal "coin-flipping" functionality in the protocol. In practice, this coin-flipping functionality needs to be realized by some interaction among the aggregators.

Thus the main question I'm asking here is what properties we actually need from the coin-flipping protocol. In particular: if we allow the attacker to control the honest aggregators' long-lived randomness, are poplar1 and prio3 still private? This was discussed a bit in https://github.com/abetterinternet/ppm-specification/issues/161, but I'm not sure we have a definitive answer (at least for prio3).
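For reference, one standard way to realize the coin-flipping functionality is commit-then-reveal among the aggregators. A minimal sketch, assuming hash-based commitments (it omits the domain separation, authenticated channels, and abort handling a real protocol would need):

```python
import hashlib
import os

# Minimal commit-then-reveal coin flipping. Each aggregator commits to a
# random contribution, and only after seeing all commitments does anyone
# reveal; the shared coins are the XOR of the contributions.

def commit(contribution: bytes) -> tuple[bytes, bytes]:
    opening = os.urandom(32)
    return hashlib.sha256(opening + contribution).digest(), opening

def open_and_combine(commitments, openings, contributions) -> bytes:
    coins = bytes(32)
    for com, opening, contrib in zip(commitments, openings, contributions):
        if hashlib.sha256(opening + contrib).digest() != com:
            raise ValueError("bad commitment opening; abort")
        coins = bytes(a ^ b for a, b in zip(coins, contrib))
    return coins  # e.g., use as the seed for the verification parameters
```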

schoppmp commented 2 years ago

I think there are two questions here:

  1. How do we handle collusion between one of the servers and a subset of clients with respect to correctness?
  2. (Your question) Does maliciously chosen randomness affect the privacy of the protocols?

Regarding (1): Even though we don't care about correctness for malicious server behavior, a semi-honest server colluding with malicious clients might still be a problem. For example, in practice we could enforce semi-honest behavior of the helper servers using some form of binary attestation. But that wouldn't stop a malicious admin from reading out the verification randomness and then running a client with an invalid input that still passes verification with that randomness.

> Regardless, I think the main requirement is that the randomness is kept secret from the clients, which is definitely feasible.

Given the example above, I'm not sure it is always realistic to assume that the randomness is kept secret from all clients. So I would like to at least keep the option to only generate the randomness once the client shares are agreed on. If we want to allow Fiat-Shamir as discussed in abetterinternet/ppm-specification#161, this would be required anyway.

Regarding (2): I had a brief look at Poplar, and it doesn't seem like adversary-chosen randomness can break privacy. In particular, Proposition 2 in Appendix C.4.1 still holds (the simulator does not require r to be random). Maybe @henrycg has some insights for Prio?

cjpatton commented 2 years ago

> But that wouldn't stop a malicious admin from reading out the verification randomness and then running a client with an invalid input that still passes verification with that randomness.

Since this involves sharing information with the client -- or, equivalently, acting as a client itself -- I would consider this malicious (i.e., active) behavior on the part of the server and, therefore, out of scope.

> So I would like to at least keep the option to only generate the randomness once the client shares are agreed on.

I think keeping this option is reasonable. One question though: Is this a matter to be considered by the VDAF spec or by applications? We could discuss the attack you described above in security considerations and recommend setting up the verification parameters only after the input shares have been ingested.

> If we want to allow Fiat-Shamir as discussed in abetterinternet/ppm-specification#161, this would be required anyway.

Can you provide more detail here? We use Fiat-Shamir in prio3 for the "joint randomness", but not the "query randomness". I agree that there's an attack if the client knows the shared secret used to derive query randomness, but I don't see an attack if the shared secret is generated before the client generates its input shares (as long as it doesn't see it). Am I missing something?
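To spell out the distinction, a simplified sketch of the two derivations (the helper functions are hypothetical; the draft's actual derivation uses a PRG and differs in detail):

```python
import hashlib

def joint_randomness(input_shares: list[bytes]) -> bytes:
    # Fiat-Shamir style: derived from the client's own input shares, so
    # the client can compute it itself while generating its proof.
    h = hashlib.sha256()
    for share in input_shares:
        h.update(share)
    return h.digest()

def query_randomness(verify_secret: bytes, nonce: bytes) -> bytes:
    # Derived from the aggregators' shared secret, which is fixed before
    # the client generates its shares and is never revealed to clients.
    return hashlib.sha256(verify_secret + nonce).digest()
```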

henrycg commented 2 years ago

> Regarding (2): I had a brief look at Poplar, and it doesn't seem like adversary-chosen randomness can break privacy. In particular, Proposition 2 in Appendix C.4.1 still holds (the simulator does not require r to be random). Maybe @henrycg has some insights for Prio?

Yes, the same should be true for Prio, provided that the servers' random challenge is chosen, as in the Crypto'19 paper, to avoid a certain set of "bad" challenge points.
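Concretely, that choice can be made by rejection sampling. A sketch, where `FIELD_SIZE` and `BAD_POINTS` are placeholders for the concrete field and the circuit's forbidden set (e.g., the points at which the proof polynomials are interpolated):

```python
import secrets

FIELD_SIZE = 2**64 - 2**32 + 1   # example: a 64-bit prime field modulus
BAD_POINTS = set(range(16))      # placeholder for the forbidden set

def sample_challenge() -> int:
    # Resample until the challenge lands outside the bad set, so the
    # soundness argument from the Crypto'19 paper applies.
    while True:
        r = secrets.randbelow(FIELD_SIZE)
        if r not in BAD_POINTS:
            return r
```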

schoppmp commented 2 years ago

> Can you provide more detail here? We use Fiat-Shamir in prio3 for the "joint randomness", but not the "query randomness". I agree that there's an attack if the client knows the shared secret used to derive query randomness, but I don't see an attack if the shared secret is generated before the client generates its input shares (as long as it doesn't see it). Am I missing something?

Okay, so this means that you use Fiat-Shamir to derive the randomness used in each client's proof share, but not the randomness that is sampled independently by the servers using coin flipping? In that case you are right, we don't need to wait until we know all client shares.

Still, the issue regarding a server colluding with a malicious client stands. I'm not sure we can simply discount it for being "out of scope", given the example attack scenario I gave above. Or, to put it in more formal terms, I don't see why we should require the adversary to corrupt all parties in the same way, given that some (clients) are easier to corrupt maliciously than others (helper servers).

schoppmp commented 2 years ago

> Is this a matter to be considered by the VDAF spec or by applications? We could discuss the attack you described above in security considerations and recommend setting up the verification parameters only after the input shares have been ingested.

Do you think we should discuss this in the PPM spec as well (e.g. in abetterinternet/ppm-specification#161)?

cjpatton commented 2 years ago

That's an interesting point, and I agree that there's no reason to think of all parties as being corrupted in the same way. Definitely worth thinking about.

I'll just add this thought: If a malicious aggregator wants to break correctness, there's a much simpler attack than corrupting or acting as a client: all it has to do is make up a bogus aggregate share. The collector would have no way to detect this, at least for prio3 or poplar1.
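To see why the collector can't detect this: for prio3 and poplar1, unsharding is just a componentwise sum of the aggregate shares, with nothing that authenticates any individual share. A sketch (the field modulus is an example value):

```python
MODULUS = 2**64 - 2**32 + 1  # example field modulus

def unshard(agg_shares: list[list[int]]) -> list[int]:
    # The collector sums the aggregate shares componentwise. A single
    # bogus share silently shifts the result, and nothing here lets the
    # collector tell which (if any) share was honest.
    return [sum(column) % MODULUS for column in zip(*agg_shares)]
```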

schoppmp commented 2 years ago

> I'll just add this thought: If a malicious aggregator wants to break correctness, there's a much simpler attack than corrupting or acting as a client: all it has to do is make up a bogus aggregate share. The collector would have no way to detect this, at least for prio3 or poplar1.

I assume you mean a malicious admin or similar insider would run this simpler attack at the aggregator service? Then it wouldn't work anymore if there are additional measures (e.g., binary attestation) to enforce semi-honest behavior of the aggregator process, whereas reading out the verification randomness and then using it with a maliciously crafted client message would still work.

Overall, even if we only consider outside attackers, I believe leaking some secret value from one of the aggregator servers is easier (it requires less severe vulnerabilities or exploits) than achieving remote code execution in order to cheat during the aggregation protocol.

cjpatton commented 2 years ago

> I assume you mean a malicious admin or similar insider would run this simpler attack at the aggregator service? Then it wouldn't work anymore if there are additional measures (e.g., binary attestation) to enforce semi-honest behavior of the aggregator process, whereas reading out the verification randomness and then using it with a maliciously crafted client message would still work.

Binary attestation would be super useful, but I don't think it will be used in all applications of VDAFs.

Regardless, I think the threat you're describing is worth discussing in security considerations, and we could even RECOMMEND that, where feasible, the setup algorithm be run only after the batch of input shares has arrived. However, I would not go so far as to say deployments MUST or even SHOULD do this. Do you think this would be a reasonable outcome?

schoppmp commented 2 years ago

That sounds good to me. I opened #21 for that, PTAL.

cjpatton commented 1 year ago

The recommendation from this analysis is that we can leave it up to the application to decide how to pick the verification key, as long as it is picked prior to reports being processed. We will need to revise the text in security considerations (see https://github.com/cfrg/draft-irtf-cfrg-vdaf/issues/140), but I think we can close this issue.