celo-org / celo-bls-snark-rs

Implements SNARK-friendly BLS signatures
https://celo.org
Apache License 2.0

Batch verification #222

Closed: gtank closed this 2 years ago

gtank commented 2 years ago

Description

This adds batch verification (not just aggregate screening), which guarantees that each individual signature submitted was valid, up to a security bound. Currently the only mode exposed over FFI is a single large batch verification, but the underlying code should be adaptable to streaming if we solve the question of where the state accumulation should live and how to make the cgo calls efficient.

Tested

A minimal functional test is included in this PR. Full testing of the FFI occurs in the Go module.

Other changes

This also exposes the constructor for the PrivateKey type at the crate level. See the relevant commit.

Related issues

#213 #214 #217

psivesely commented 2 years ago

Besides in-line comments, one more suggestion: Batch could contain a field partial_hash and entries could be Vec<(PublicKey, Signature, Exponent)>. new could set partial_hash to

Params::new()
    .personal(b"bvblssig")
    .hash_length(32)
    .to_state()
    .update(message)
    .update(extra_data)

and add could then set Exponent to Fr::from_random_bytes(partial_hash.clone().update(public_key).update(signature).finalize()). This way you don't have to keep re-hashing the message and this may also be more amenable to future stream-ification.
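
Roughly, I'm imagining something like the following sketch, where PublicKey, Signature, and their to_bytes serialization, as well as the Exponent scalar type and its from_random_bytes constructor, are stand-ins for whatever this crate actually exposes:

    use blake2b_simd::{Params, State};

    pub struct Batch {
        partial_hash: State,
        entries: Vec<(PublicKey, Signature, Exponent)>,
    }

    impl Batch {
        pub fn new(message: &[u8], extra_data: &[u8]) -> Self {
            // Absorb the shared message once; per-entry data is absorbed in add.
            let mut partial_hash = Params::new()
                .personal(b"bvblssig")
                .hash_length(32)
                .to_state();
            partial_hash.update(message);
            partial_hash.update(extra_data);
            Batch { partial_hash, entries: Vec::new() }
        }

        pub fn add(&mut self, public_key: PublicKey, signature: Signature) {
            // Clone the shared prefix instead of re-hashing the message each time.
            let mut state = self.partial_hash.clone();
            state.update(&public_key.to_bytes());
            state.update(&signature.to_bytes());
            // from_random_bytes returns an Option; real code would handle None.
            let exponent = Exponent::from_random_bytes(state.finalize().as_bytes()).unwrap();
            self.entries.push((public_key, signature, exponent));
        }
    }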

One last comment is that it would be good to double check we're getting the expected security level. Currently we're computing each r_i individually as r_i = Hash(pk_i, msg, sig_i). The Schwartz-Zippel lemma security bound holds only when we compute all r_i at once as (r_1, ..., r_n) = Hash(pk_1, ..., pk_n, msg, sig_1, ..., sig_n). I think I informally sketched out the same bound for the method of r computation we're doing, but a proof should exist before putting this into production. The proof may have been for computing r_i individually as r_i = Hash(r_{i-1}, pk_i, msg, sig_i) -- a third way of doing it, again not necessarily equivalent security-wise to the other two. I can try to look into this more soon.

gtank commented 2 years ago

> Besides in-line comments, one more suggestion: Batch could contain a field partial_hash and entries could be Vec<(PublicKey, Signature, Exponent)>. new could set partial_hash to
>
>     Params::new()
>         .personal(b"bvblssig")
>         .hash_length(32)
>         .to_state()
>         .update(message)
>         .update(extra_data)
>
> and add could then set Exponent to Fr::from_random_bytes(partial_hash.clone().update(public_key).update(signature).finalize()). This way you don't have to keep re-hashing the message and this may also be more amenable to future stream-ification.

Yeah, I like this!

> One last comment is that it would be good to double check we're getting the expected security level. Currently we're computing each r_i individually as r_i = Hash(pk_i, msg, sig_i). The Schwartz-Zippel lemma security bound holds only when we compute all r_i at once as (r_1, ..., r_n) = Hash(pk_1, ..., pk_n, msg, sig_1, ..., sig_n). I think I informally sketched out the same bound for the method of r computation we're doing, but a proof should exist before putting this into production. The proof may have been for computing r_i individually as r_i = Hash(r_{i-1}, pk_i, msg, sig_i) -- a third way of doing it, again not necessarily equivalent security-wise to the other two. I can try to look into this more soon.

Sounds good. This detail is hidden pretty deep here intentionally, so we can make it behave however it needs to.

psivesely commented 2 years ago

Streaming BLS batch verification checks the following (in the exponent):

s_1 - h x_1 + r_1(s_2 - h x_2) + r_2(s_3 - h x_3) + ... + r_{n-1}(s_n - h x_n) = 0

where s_i = log_{g1}(sig_i), x_i = log_{g2}(pk_i), and h = log_{g1}(Hash_{G1}(msg)).
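
Equivalently, over the groups themselves this is a single pairing equation (just a restatement of the above, writing the group operations additively and taking r_0 = 1):

    e(sig_1 + r_1 sig_2 + ... + r_{n-1} sig_n, g2) = e(Hash_{G1}(msg), pk_1 + r_1 pk_2 + ... + r_{n-1} pk_n)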

If we choose r_i = Hash_F(pk_i, msg, sig_i) then we can no longer view this as an application of the Fiat-Shamir heuristic, which asks you to hash the whole transcript so far to produce each challenge, i.e., r_i = Hash_F(pk_1, msg, sig_1, ..., pk_i, msg, sig_i). Practically speaking, with the former the adversary has more combinatorial ways of making the evaluation of this (n-1)-variate polynomial 0 even when it's not the 0 polynomial (i.e., when there are forgeries present) for the same number of hash computations.

For example, using the current method for n = 100: the adversary picks 2 signatures at random for each public key and computes the corresponding challenge. Note that each of these hashes determines a term r_{i-1}(s_i - h x_i). The adversary now has 2^100 ways to combine them. If the adversary picked the signatures by choosing their discrete logs at random, then they can compute e(msig, g2) for each multisignature using a single target group exponentiation. Similarly, pk_i^{r_{i,j}} for i in {1, ..., n}, j in {0, 1} can be precomputed, so that computing e(Hash_{G1}(msg), apk) is mostly just the cost of the pairing. (This can't also be computed as a target group exponentiation because h is unknown; further, the apk will be sparse in G2, implying negligible collision probability.) Then the adversary can test each combination for little more than the cost of one GT exponentiation and one pairing.

Using the Fiat-Shamir version, we focus on the point where the evaluation goes to 0 from any other value. This means the adversary has gotten lucky, and it's best for them if this happens when they're submitting the last forgery they need, because otherwise the next forgery they send moves the evaluation away from 0 with certainty (since there are no zero divisors in a field). Then, without loss of generality, an optimal strategy in the ROM is just to try getting lucky on the last message over and over again. Each try then costs little more than one hash, one G2 exponentiation, one pairing, and one GT exponentiation to check.

Since the hash and G2 exponentiation take much less time than the pairing and GT exponentiation, I don't think the current version will be concretely much easier to attack. But using the Fiat-Shamir version, security can at least be argued for the interactive protocol using the Schwartz-Zippel lemma and then we can invoke Fiat-Shamir, which makes things nicer from a theory perspective. I confirmed that repeated per-round application of Schwartz-Zippel to a univariate polynomial has the same soundness, as a function of the challenge size, as one application to an (n-1)-variate polynomial, and thus ceil(128 + log2(99)) = 135-bit challenges still give us 128-bit soundness for the interactive streaming protocol.
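
Spelling out that arithmetic, as I understand it: for n = 100 there are 99 challenges r_1, ..., r_99, and each round's polynomial is degree 1 in the fresh challenge, so a single round fails with probability at most 2^(-b) for b-bit challenges. A union bound over the rounds then gives

    99 * 2^(-b) <= 2^(-128)  =>  b >= 128 + log2(99) ~= 134.63  =>  b = 135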

psivesely commented 2 years ago

So I shouldn't spend more time on attacking these, because fundamentally the Fiat-Shamir version (in the ROM) is going to guarantee we have to compute one hash per "try" as a lower bound, and is better for the above-mentioned theory reasons. Just for fun, though, I noticed you can reduce the cost to approximately one hash and one GT exponentiation as follows. Precompute

D = e(g1, g2)^{s_1 + r_1 s_2 + ... + r_{n-2} s_{n-1}} e(Hash_{G1}(msg), apk')^{-1}

where apk' is computed from the first n-1 public keys and challenges. Then, to test a new s_n, we compute r_n at the cost of one G1 exponentiation (actually just one addition, as we'll see) and one hash, and then check whether

(e(g1, g2)^{s_n} e(Hash_{G1}(msg), pk_n)^{-1})^{r_n} = D

which can be done in essentially just one target group exponentiation by precomputing the pairings. And if we just increase s_n by one each time, which is no more or less likely to succeed in the ROM than picking s_n randomly, the G1 exponentiation is likewise replaced with an addition.
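
To make the per-try cost explicit (my reading of the above): with A = e(g1, g2) and B = e(Hash_{G1}(msg), pk_n)^{-1} precomputed and T = A^{s_n} cached, stepping s_n to s_n + 1 is just T <- T A, and each try checks

    (T B)^{r_n} = D

so the marginal cost per try is one GT multiplication, one hash to get the fresh r_n, and one GT exponentiation.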

psivesely commented 2 years ago

In discussions elsewhere among @kobigurk-clabs, @gtank, and me, we noted that since the apk and msig generated by this process are not published and are for local verification purposes only, it's okay to just sample randomness locally for the exponents. The apk and msig that get placed on chain are just the simple sums of the pks and sigs (without taking the linear combination).
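
A minimal sketch of local sampling, assuming the zexe algebra crate this repo builds on (the exact module paths here are illustrative):

    use algebra::{bls12_377::Fr, UniformRand};
    use rand::thread_rng;

    // Sample one random exponent per batch entry; these never leave the
    // verifier, so there's no need to derive them by hashing the transcript.
    fn sample_exponents(n: usize) -> Vec<Fr> {
        let mut rng = thread_rng();
        (0..n).map(|_| Fr::rand(&mut rng)).collect()
    }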

gtank commented 2 years ago

Sigh. Forgot about the clippy check again.

gtank commented 2 years ago

Rebased on top of the build fixes in #225.