Deterministic square roots in the circuit

ebfull commented 4 years ago

Our circuit is over a field Fp where p - 1 = 2^k * t with t odd and integer k > 0. (k is often around 32 in practice.) Let a be a nonzero square. The prover will witness b such that b^2 = a. We wish to force the prover to witness b rather than -b w.l.o.g. by distinguishing between them deterministically. The standard trick is to perform a boolean decomposition of b and see if e.g. the parity bit is zero, or perhaps check that b < (p - 1) / 2 or something. Boolean decompositions of full field elements are needlessly expensive in the circuit though; here's a technique that allows us to reduce this cost.

Let's rewrite b as b = c * alpha^d where c is in the multiplicative order t subgroup of Fp*, alpha is an element of multiplicative order 2^k, and d is an integer modulo 2^k. There is a unique solution for (c, d) for a nonzero b that can be obtained by the prover efficiently for smallish values of k. In order to distinguish b from -b observe that the most significant digit of d is set only for one of the possible values of the square root. Thus we can have the prover witness k - 1 booleans representing d (constraining them to the choice of b where the most significant digit of d is not set) and compute alpha^d in the circuit using k - 2 conditional multiplications by precomputed powers of alpha. It then suffices to see that c is in the multiplicative order t subgroup of Fp* to see that the choice of b is the only choice available to the prover for the given square a. This can be done by having the prover witness e = c^{{2^k}^-1 mod t} and computing c = e^{2^k} in the circuit directly with k squarings to "kill" the 2^k order subgroup. Finally, we check that b = c * alpha^d and b * b = a to establish with 3k - 1 multiplication constraints that b is a deterministic square root of a. Since k is usually around 32 this is quite a bit more efficient (less than 100 constraints) than a boolean decomposition of the full field element (around log_2(p) constraints).

It's possible that plookup helps here too, I haven't really thought about it. But if plookup isn't available in your setting, this seems like a decent alternative.

ebfull commented 4 years ago

Note that in settings like Bulletproofs over Ristretto255, where your k is like 2, this is a super efficient technique compared to previous approaches.

daira commented 4 years ago

@ebfull wrote:

It's possible that plookup helps here too, I haven't really thought about it. But if plookup isn't available in your setting, this seems like a decent alternative.

I'm not sure the deterministic square root method pays off when Plookup is available, unless you use a table lookup for α^d. Let's assume that we have a 2¹¹-size table available for some other reason — it can be a table representing any function of {0..2¹¹-1} to some arbitrary codomain. This is sufficient to do an 11-bit range check in one row. But if we want to use lookups for α^d, we would need an extra, specialized table for that purpose.

Even without a table for α^d, you can probably do two bits of d per row for that computation, and you can merge the boolean (actually base-4) checks for d with that. Also you can do the squarings for e^{2^k} more efficiently in PLONK; you can probably do 3 or 4 squarings per row (let's say 4). So that's ~16 rows for α^d and another 8 for e^{2^k}, which is 24. But with an 11-bit table you only need 23 rows to do a 253-bit boolean decomposition.

In other words, if we're able to use table lookups for the boolean decomposition but don't have a specialized table to compute α^d in the deterministic square root method, then the latter only just about breaks even (and is more complex).

Note that @dlubarov also concluded in https://github.com/mir-protocol/plonky/pull/7 that it didn't pay off for them, for similar reasons.

zcash / zcash

Deterministic square roots in the circuit #4421