filecoin-project / research

Home for Filecoin Research

Signature Aggregation #1

Closed bvohaska closed 5 years ago

bvohaska commented 6 years ago

MASTERPLAN

Storage Limits & Signature Aggregation

As @whyrusleeping has shown in his analysis ([1], [2], [3], [4]), we currently have a limitation on the total storage that FIL can support. This limitation is due in part to the number of signatures and other associated data posted to the blockchain [2]. Signature aggregation is one proposal for increasing this storage limit. For the purposes of this issue, we will only consider aggregation as a mitigation of the storage limit; we discuss other methods here

What is signature aggregation?

In short, signature aggregation is a method by which we take a collection of signatures S and combine them so that they form a kind of bulk signature over the underlying items.

For example, let's consider a scenario where a group wants all of its members to sign a message M. In a classical signature scheme, each member would compute,

s_i := Sign[SK_i,M] (1)

where i represents a particular member and SK_i is the secret key of the i-th member. Collecting these signatures together we have,

S := {s_0,s_1,...,s_n} (2)

where n is the number of members. S will have size,

Sz := n*size_of(s_i) (3)

Thus, the total storage needed for anyone to verify S is Sz + PK_all, where PK_all is the collection of all public keys used in S. If signatures are 64 bytes and each PK_i is 64 bytes, then total storage is n(64+64) or n*128 bytes. In the case of 1024 participants, the size of S and associated data becomes 128KB. In the context of FIL, with a block size of ~400KB, this can become problematic quickly. Note: the actual messages m are left out of this calculation. Now let us consider what this scenario would look like in an aggregation scheme.

Suppose we designate an aggregator A. A then asks for all signatures s_i, where the signing operation performs a special function to map m_i into a group element that we can operate on and adds a secret alpha_i [6]. This function is called hashing into a group and, in our particular case, hashing into an elliptic curve. For an explanation of how this works, see the hashing example in golang. The aggregator then collects these signatures and performs a special aggregation function; in BLS, this function uses the pairing operation over all public keys and associated messages. The result is a signature S_agg whose total size is on the order of a single classical signature. Using the size analysis from above, the signature would be 64 bytes. The total size of the signature and other components is S_agg + PK_all. This is about n*64 + 64, or (n+1)*64, or about (n+1)/2 * 128 w.r.t. classical signatures.

Thus we can see that we have effectively halved the total signature storage cost (recall that we still need to store the public keys). W.r.t. just the signature bytes, we have reduced the storage cost from n*s_i to just s_i.
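The arithmetic above can be checked with a short script (using the illustrative 64-byte signature and public-key sizes from the example, and n = 1024 participants):

```python
# Storage comparison: n classical signatures vs. one aggregate signature.
# Sizes are the illustrative 64-byte figures from the example above.
n = 1024                 # number of participants
sig_bytes = 64           # size of one signature s_i
pk_bytes = 64            # size of one public key PK_i

classical = n * (sig_bytes + pk_bytes)   # Sz + PK_all = n*128 bytes
aggregated = sig_bytes + n * pk_bytes    # S_agg + PK_all = (n+1)*64 bytes

print(classical)    # 131072 bytes (128 KiB)
print(aggregated)   # 65600 bytes, roughly half
```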

Some notes on computation, and these are the gotchas. Notice that while we saved a great deal of space, we did so at the cost of computation. In particular, h0 = hashing into an elliptic curve and pr = pairing. Note that there are efficient methods for h0. An individual pr may not be computationally expensive (further analysis will be needed to confirm this), but n computations of pr can become burdensome; again, this will require prototyping and performance analysis.

A note on development. Turning BLS signatures into production code can be...tricky. Elliptic curves are finicky, complicated, and prone to subtle but catastrophic security errors. Extra-careful attention and testing will be needed to ensure that any aggregation scheme is correct, where correct means that in all cases the actual computation is identical to the math that defines it.

Methods:

Information theory limit/ MPC

Wouldn't it be great if we could compress all those pesky public keys? It would, and we can, but this requires multi-party computation (MPC). A quick analysis of signatures and aggregation will convince you that without public keys we would break the information-theoretic limit for storing data, which is pretty cool to think about in the cryptographic context. We can, however, perform multi-party computation to generate a single signature with one public key, but MPC has its own considerations.


References:

1 - observable/visual analysis of storage limitations
2 - specs/storage market draft
3 - specs/thoughts on aggregation
4 - aq/issue on size constraints

Signature aggregation:

5 - specs/BLS signature discussion
6 - stanford/new BLS scheme explanation
7 - medium/signature aggregation

nicola commented 6 years ago

Thank you for this, Brian. I think there is a set of articles by Blockstream and Dan Boneh (I will try to look for them) which discuss aggregation (which is not multisig) to save block size.

bvohaska commented 5 years ago

Any aggregation scheme may assume a PKI-like system. In other words, looking up public keys, as well as verifying against them, can be assumed.

Links:

- Forum discussion about BLS signatures
- ETH Research BLS Sigs
- Logos - BLS Aggregation
- MuSig - MultiSigs
- Tree Sigs
- MAST Sig
- Ring Sigs

bvohaska commented 5 years ago

Note: there is a namespace collision in the literature. "Signature aggregation" often refers to multi-signatures. As far as we can tell, there is only one signature aggregation (compression) scheme that meets our requirements: BLS signature aggregation.

Viable Methods for Signature Aggregation

BLS Signature aggregation

Given a collection of n BLS signatures, it is possible to aggregate them by computing n-1 elliptic curve additions. The descriptions below follow: link to BLS sig description.

Generate Aggregate:

sig_i = {g_1, H_0(m_i)^sk_i}, where H_0 is a hash into G_0 and g_1 is in G_1

S = {Set of n BLS signatures: sig}

for each sig <-- S:

    SIG = SIG + sig

so that SIG = sig_0 + sig_1 + ... + sig_{n-1} (n-1 elliptic curve additions)

SIG is about 96 bytes

Verify Aggregate:

M = {set of all messages m_i}

Check,

    e(g_1,SIG) == e(pk_0,H_0(m_0))*e(pk_1,H_0(m_1))*...*e(pk_{n-1},H_0(m_{n-1}))

    This represents n+1 pairings (n on the right-hand side plus one on the left) and n hashes into G_0

Notes: all sigs are points on an elliptic curve. Because the pairing is bilinear, we have e(P+Q,Z) = e(P,Z)*e(Q,Z). We also have e(c*P,d*Q) = e(P,Q)^(c*d).
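To make the bilinearity argument concrete, here is a toy model of aggregation and verification, using integers-as-exponents in place of curve points. It is purely illustrative and NOT cryptographically secure; all names are ours, not from any BLS library.

```python
# Toy model of BLS aggregation: group elements are modeled as integer
# exponents, and the "pairing" is exponentiation in Z_P*. Illustrative only.
import hashlib
import random

P = 2**127 - 1          # a Mersenne prime; the "target group" lives mod P
ORDER = P - 1           # modulus for the exponent arithmetic
G1 = 1                  # stand-in for the G_1 generator g_1

def e(a, b):
    """Toy bilinear map: e(a, b) = 3^(a*b) mod P, so that
    e(x + y, z) == e(x, z) * e(y, z), mirroring the note above."""
    return pow(3, (a * b) % ORDER, P)

def h0(msg: bytes) -> int:
    """Stand-in for hashing into G_0 (hash-to-curve)."""
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % ORDER

def keygen(rng):
    sk = rng.randrange(1, ORDER)
    return sk, (sk * G1) % ORDER      # pk corresponds to g_1^sk

def sign(sk, msg):
    return (sk * h0(msg)) % ORDER     # sig corresponds to H_0(m)^sk

def aggregate(sigs):
    return sum(sigs) % ORDER          # n-1 "group additions"

def verify_aggregate(pks, msgs, agg):
    lhs = e(G1, agg)
    rhs = 1
    for pk, m in zip(pks, msgs):      # n "pairings" on the right-hand side
        rhs = (rhs * e(pk, h0(m))) % P
    return lhs == rhs

rng = random.Random(7)
msgs = [f"msg-{i}".encode() for i in range(5)]
keys = [keygen(rng) for _ in msgs]
agg = aggregate([sign(sk, m) for (sk, _), m in zip(keys, msgs)])
print(verify_aggregate([pk for _, pk in keys], msgs, agg))  # True
```

The check works for exactly the reason stated in the note: aggregation is a sum of "points", and bilinearity turns the single left-hand pairing of that sum into the product of per-message pairings on the right.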

Pros

Cons

Performance of each aggregation scheme

The following results were generated using code found here. The machine used was a 2018 MacBook Pro with an 8th-gen Intel i7 (4 cores; only 1 core used).

BLS signatures were computed using Chia's Python library, which binds to their C++ library. ECDSA was computed using python-cryptography, which binds to OpenSSL. ECPY was used for EC key generation, ECPY - ECDSA, and ECPY - Schnorr.

In an apples-to-apples comparison,

======================== Set up ========================
Num of msgs to be signed: 10000
Boilerplate BLS time: 14285.76024 ms
Boilerplate ECDSA time: 10639.419682 ms
EC Key generation time: 34169.680336 ms
ECPY - Schnorr sig gen time: 34220.123154999994 ms
ECPY - ECDSA sig gen time: 34719.815919000015 ms
====================== Aggregation =====================
BLS-Agg sig gen time: 2442.0399390000007 ms
===================== Verification =====================
ECDSA sig ver time: 4299.750928999998 ms
BLS-Agg sig ver time: 11244.094109000001 ms
================= Compare Apples-Apples ================
ECPY - Schnorr sig ver time: 68152.72110299997 ms
ECPY - ECDSA sig ver time: 69753.58654599998 ms
===================== Other BLS Stats ==================
BLS-Agg sig deserialize time: 1825.7945999999947 ms
Avg. per sig gen time: 0.24420399390000005 ms
Avg. per sig ver time: 1.1244094109000002 ms
Length of aggregate signature: 96 bytes
Did the signature verify? T/F: True
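As a sanity check, the per-signature averages reported in the run are just the totals divided by the message count:

```python
# Derive the "Avg. per sig" lines from the totals in the run above.
num_msgs = 10_000
agg_gen_ms = 2442.0399390000007   # "BLS-Agg sig gen time" from the run
agg_ver_ms = 11244.094109000001   # "BLS-Agg sig ver time" from the run

print(agg_gen_ms / num_msgs)  # ~0.2442 ms per signature generated
print(agg_ver_ms / num_msgs)  # ~1.1244 ms per signature verified
```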

What to glean from the results:

These results tell us that for 1 core on an i7-8xxx,

Generation:

  1. Schnorr & ECDSA generation times are about the same. They use the same keys and have the same signature sizes.

  2. BLS aggregation takes about 2.4s (not including serialization). BLS aggregation can happen on-the-fly and thus can be computed as messages come in to the aggregator.

Verification:

  1. Schnorr verification is about the same speed as ECDSA; at best Schnorr is a couple percent faster. This may be due to the modular inversion in ECDSA verification.

  2. ECDSA verification of all messages is about 2-3x faster than verification of one BLS aggregate.

  3. For 10k messages, ECDSA verification takes 4.3s and BLS-Agg verification 11.2s (+1.8s for deserialization)

Things to consider

Per a conversation with @whyrusleeping, we will need to be mindful that for a block to finalize, signature verification needs to be very fast. Each miner in the network will need to verify a block and pass it along; this chain of verifications and the resulting delay can add up to long finality times.

We might be able to speed things up by taking advantage of parallelization.
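The parallelization idea can be sketched as follows: the n hashes and n pairings in aggregate verification are independent per message, so only the final combination is serial. The snippet below is a toy sketch; `pairing_term` is a hypothetical stand-in, not a real pairing.

```python
# Sketch: the per-message terms of aggregate verification (hashing into the
# group and the pairing e(pk_i, H_0(m_i))) are independent, so they can be
# computed in parallel and combined serially at the end.
# NOTE: pairing_term is a toy stand-in. With a real pairing library you would
# use a process pool (or a native binding that releases the GIL) to get an
# actual speedup.
import hashlib
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def pairing_term(msg: bytes) -> int:
    """Toy stand-in for e(pk_i, H_0(m_i))."""
    return int.from_bytes(hashlib.sha256(msg).digest()[:8], "big")

def combine(terms):
    # Stand-in for the product in the target group; cheap and serial.
    return reduce(lambda a, b: a ^ b, terms, 0)

def verify_terms_parallel(msgs, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return combine(pool.map(pairing_term, msgs))

msgs = [f"msg-{i}".encode() for i in range(1000)]
print(verify_terms_parallel(msgs) == combine(map(pairing_term, msgs)))  # True
```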

whyrusleeping commented 5 years ago

to clarify the consideration at the bottom of the last comment:

We can repropagate blocks after validating just the signature on the block and the ticket (meaning that the miner in question did actually win a block). But we have to fully validate it before mining on top of it.

bvohaska commented 5 years ago

As it turns out, the BLS curves used in rust-proofs are the same as those used in most BLS signature schemes. As it also turns out, our implementation may be significantly faster than the BLS implementation used in Chia.

@dignifiedquire is investigating using our curve implementation to speed up BLS signature aggregation and verification.


The reasons we believe rust-proofs' elliptic curves may be faster than Chia's are:

  1. Our curves are optimized to use fixed-size integers specific to BLS12-381; Chia's uses GMP and thus arbitrary-precision integers.
  2. Our implementation uses zcash's heavily optimized EC math operations; Chia's uses Relic's, which may be slower.
dignifiedquire commented 5 years ago

I made an implementation based on the pairing library in Rust, which makes these times a little nicer:

https://github.com/filecoin-project/bls-signatures

> cargo run --example verify --release
dancing with 10000 messages
    signing
      took 4.251403s (0.425ms per message)
    aggregate signatures
      took 0.006371s (0.001ms per message)
    serialize signature
      took 0.000006s
    hashing messages
      took 3.664272s (0.366ms per message)
    extracting public keys
      took 0.356862s (0.036ms per message)
    deserialize signature
      took 0.001370s
    verification
      took 3.105231s (0.311ms per message)
bvohaska commented 5 years ago

Confirming @dignifiedquire's results with the same machine that generated the previous benchmarks.

> cargo run --example verify --release
dancing with 10000 messages
    signing
      took 7.431936s (0.743ms per message)
    aggregate signatures
      took 0.009970s (0.001ms per message)
    serialize signature
      took 0.000024s
    hashing messages
      took 5.893373s (0.589ms per message)
    extracting public keys
      took 0.503084s (0.050ms per message)
    deserialize signature
      took 0.001248s
    verification
      took 3.386371s (0.339ms per message)

Compared to the Chia library, @dignifiedquire's implementation is much faster.

Total time to verify is about 9.3s here (5.9s hashing + 3.4s verification) versus 11.2s in Chia. Hashing represents the bulk of the work here at ~5.9s. Deserialization is very expensive in Chia at ~1.8s vs. ~0.001s here, which brings the totals to roughly 9.3s here and 13.1s in Chia. If we can optimize g2_hash (the hashing step here), we could potentially gain a few more seconds.

bvohaska commented 5 years ago

Updates:

BLS signature verification can be made very fast by hashing the message into G1 (the smaller group); this has to do with G1 having a smaller cofactor than G2. The trade-off is much larger public keys (2x the size). We will need to figure out how to store those public keys and look them up. We will also need to figure out key recovery options: in ECDSA one can extract a public key from a signature; in BLS that does not appear to be immediately possible.
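For concreteness, the trade-off with the standard compressed BLS12-381 encodings (48-byte G1 elements, 96-byte G2 elements; these sizes are standard for the curve, not stated in the comment above):

```python
# BLS12-381 compressed encodings: G1 = 48 bytes, G2 = 96 bytes.
# Signatures live in the group the messages are hashed into; public keys
# live in the other group.
G1_BYTES, G2_BYTES = 48, 96

# Hash into G2 (as Chia does): 48-byte pubkeys, 96-byte aggregate signature.
hash_into_g2 = {"pubkey": G1_BYTES, "signature": G2_BYTES}
# Hash into G1 (faster verification per the note): pubkeys double in size.
hash_into_g1 = {"pubkey": G2_BYTES, "signature": G1_BYTES}

print(hash_into_g2)  # {'pubkey': 48, 'signature': 96}
print(hash_into_g1["pubkey"] // hash_into_g2["pubkey"])  # 2, the "2x" trade-off
```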

bvohaska commented 5 years ago

Presentation of current state to be done at the research retreat Dec. 7-11


bvohaska commented 5 years ago

Requires people resources to continue.

bvohaska commented 5 years ago

Resources identified. BLS Effort kicking off the week of 7 Jan. 2019.

bvohaska commented 5 years ago

Link to workplan notes: https://docs.google.com/document/d/1iATTJY0bd-psc-PaCfXbtMsI0DrYgA1IV097QL4lPWg/edit

bvohaska commented 5 years ago

Go-filecoin issue: https://github.com/filecoin-project/go-filecoin/issues/1566

mhammersley commented 5 years ago

moving to "in progress", per research week discussion

nicola commented 5 years ago

hey @whyrusleeping, we believe there are no more open research issues in here, and I am seeing a set of issues on signatures; if specs is covering this elsewhere, please close this issue (and link).

nicola commented 5 years ago

I am closing this issue unless the stakeholders find it relevant again. This is being tracked in the specs repo.