cfrg / draft-irtf-cfrg-vdaf

VDAF specification

Bind application context to VDAF computations #418

Closed: cjpatton closed this issue 2 months ago

cjpatton commented 2 months ago

Consider the risk that, for a specific variant of Prio3, an attacker does enough offline computation that it can efficiently find invalid reports that get accepted by the aggregators. We know this attack is feasible for a sufficiently small finite field and/or number of proofs: when the circuit uses joint randomness, the attacker can search for input shares that derive joint randomness that forces the circuit to accept an invalid measurement. (See #311 for details.)
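To make the grinding attack concrete, here is a toy sketch (not the real Prio3 wire format; all names, sizes, and the accepting-set predicate are hypothetical): the attacker searches over input shares until the derived joint randomness lands in the small set of values that makes the verification circuit accept an invalid measurement. With an effective search space of `b` bits and `k` accepting values, roughly `2^b / k` hash queries suffice.

```python
import hashlib
import itertools

NUM_ACCEPTING = 4   # hypothetical: # of joint-rand values that wrongly accept
SPACE_BITS = 16     # hypothetical: effective size of the joint-rand space

def derive_joint_rand(input_shares: bytes) -> int:
    """Stand-in for the Fiat-Shamir-style joint randomness derivation."""
    digest = hashlib.sha256(b"joint_rand" + input_shares).digest()
    return int.from_bytes(digest[:2], "big") % (1 << SPACE_BITS)

def circuit_wrongly_accepts(joint_rand: int) -> bool:
    """Stand-in for 'this joint randomness forces the circuit to accept'."""
    return joint_rand < NUM_ACCEPTING

def grind() -> bytes:
    """Offline search for input shares that derive an accepting joint rand."""
    for counter in itertools.count():
        shares = counter.to_bytes(8, "big")  # attacker-chosen input shares
        if circuit_wrongly_accepts(derive_joint_rand(shares)):
            return shares

bad_shares = grind()
assert circuit_wrongly_accepts(derive_joint_rand(bad_shares))
```

The search cost scales with the field size and number of proofs, which is why sufficiently small parameters make the attack feasible.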

While we believe we've mitigated this risk for standardized variants, like Prio3SumVec, there is still the possibility of non-standardized variants that haven't mitigated it getting deployed. The draft should, at a minimum, acknowledge this risk. We could also modify the spec in order to reduce the blast radius of this attack as much as possible.

One idea that's been suggested (by @junyechen1996 and @kunal-talwar) is to add the DAP task ID to the transcript hash for the joint randomness derivation. This way, all of the offline computation carried out by the attacker is only useful for a specific DAP task. This is similar to salting a password hash in order to ensure that whatever computation the attacker does in advance, like building rainbow tables, is specific to the user the attacker wants to impersonate.
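A minimal sketch of this salting idea (the function name and domain-separation tag are illustrative, not the spec's): once the task ID is mixed into the derivation, a grinding result found for one task does not transfer to another.

```python
import hashlib

def derive_joint_rand(task_id: bytes, input_shares: bytes) -> bytes:
    """Hypothetical derivation that binds the DAP task ID into the hash."""
    return hashlib.sha256(b"joint_rand" + task_id + input_shares).digest()

shares = b"attacker-chosen input shares"
jr_a = derive_joint_rand(b"task-a", shares)
jr_b = derive_joint_rand(b"task-b", shares)
assert jr_a != jr_b  # precomputation against task-a is useless for task-b
```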

More generally, we could modify the VDAF syntax in order to allow binding the computation to its application context. There are a couple ways we can do this:

  1. Add a context parameter to (V)DAF sharding and preparation.
  2. Allow the nonce to encode the context. At the moment, the nonce size is determined by the (V)DAF, and the client is supposed to choose it at random; we would need to modify the syntax to support any nonce size and relax the requirements for choosing it.

Both of these options will require about the same amount of work for the DAP and VDAF specs. The downside of (1.) is that it requires API breaking changes. The downside of (2.) is that it makes the requirements more technical: the nonce now needs to have high min-entropy rather than being completely random. We could probably address this by providing some examples of secure ways to choose the nonce.
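As one example of a secure way to choose the nonce under option (2.) (layout and sizes here are illustrative, not normative): concatenate the application context with a fresh random part, so the context is bound while the random part alone supplies the required min-entropy.

```python
import os

RAND_LEN = 16  # hypothetical: 128 bits of min-entropy from the random part

def make_nonce(task_id: bytes) -> bytes:
    """Hypothetical nonce layout: context prefix || random suffix."""
    return task_id + os.urandom(RAND_LEN)

nonce = make_nonce(b"\x01" * 32)  # 32-byte task ID is a DAP-style size
assert len(nonce) == 32 + RAND_LEN
```

The nonce as a whole is no longer uniformly random, but it still has at least `8 * RAND_LEN` bits of min-entropy, which is the property the relaxed requirement would need to state.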

Do folks support this change?

Some reasons to do this:

  1. Defense-in-depth is always good (and this change certainly can't hurt).
  2. This aligns with a design philosophy of more conventional ZKP applications, which is to add as much context as possible to the transcript whenever you do Fiat-Shamir.

Some reasons not to do this:

  1. It's a fairly big change, and there's a chance we get it wrong. Though we at least have a pretty good idea of what to do for Prio3.
  2. It could create a false sense of security, leading to overly optimistic parameter choices getting deployed.
junyechen1996 commented 2 months ago

Thanks @cjpatton.

I think the current VDAF robustness analysis assumes a particular VDAF cannot be attacked during its lifetime. To attack the joint randomness, an attacker controls all the components that go into the hashing, which include the nonce, the secret shares of the input, and the VDAF ID. In theory, an attacker can spend a fairly long time making enough random oracle queries to find joint randomness that passes the circuit verification. Additionally, once the attacker finds such joint randomness, it can keep using the same combination to attack a new task (a nonce can be reused across different tasks in DAP), or even publish the attack vectors so that others can use them to attack the VDAF in new tasks.
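To illustrate the reuse concern: if the derivation binds only the nonce, input shares, and VDAF ID (no task ID), the same ground attack vector yields the same joint randomness in every task. The names and constants below are illustrative.

```python
import hashlib

VDAF_ID = b"\x00\x00\x00\x07"  # hypothetical 4-byte algorithm ID

def derive_joint_rand(nonce: bytes, input_shares: bytes) -> bytes:
    """Derivation without task binding: nothing task-specific is hashed."""
    return hashlib.sha256(VDAF_ID + nonce + input_shares).digest()

nonce, shares = b"reused-nonce", b"ground-input-shares"
# Submitted to two different DAP tasks, the same (nonce, shares) pair
# derives identical joint randomness, so one offline search breaks both.
assert derive_joint_rand(nonce, shares) == derive_joint_rand(nonce, shares)
```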

However, binding application context such as the task ID has the following advantages:

I agree this is defense-in-depth, but the concern does apply to all the VDAFs that use joint randomness, including the standard ones; it's just a matter of how much time and how many random oracle queries it takes to find the attack vectors.