mimblewimble / grin

Minimal implementation of the Mimblewimble protocol.
https://grin.mw/
Apache License 2.0

Investigate bulletproof verification batch sizes #2704

Open DavidBurkett opened 5 years ago

DavidBurkett commented 5 years ago

This blog post by @apoelstra about bulletproofs suggests that there are minimal (if any) performance gains for batching more than 64 bulletproofs at a time. Currently, during initial sync, we batch verify in groups of 1000, which requires reserving space in memory for 1000 bulletproofs (up to ~660kB). I think it's worth investigating what kind of impact, if any, lowering the batch size would have on performance.
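For scale, here is a rough sketch of how much proof data has to be buffered per batch. It assumes grin's 675-byte maximum serialized bulletproof size and is illustrative only, not taken from the codebase:

```rust
// Rough memory footprint of buffering a batch of bulletproofs before
// verification. Assumes a 675-byte maximum serialized proof; the numbers
// are back-of-the-envelope, not measured.
const MAX_PROOF_SIZE: usize = 675;

fn batch_kib(batch_size: usize) -> usize {
    batch_size * MAX_PROOF_SIZE / 1024
}

fn main() {
    for n in [64usize, 100, 500, 1000] {
        // 1000 proofs comes out around ~659 KiB, matching the ~660kB figure above
        println!("batch of {:>4}: ~{} KiB of proof data", n, batch_kib(n));
    }
}
```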

antiochp commented 5 years ago

I was (maybe mistakenly) under the impression that libsecp256k1 was doing this in batches of 64 internally, and that we were just passing them in from grin code in batches of 1,000 for convenience.

I'm not actually sure where I got this idea from now - would need to go look back at the libsecp256k1 verification code to confirm/deny.

DavidBurkett commented 5 years ago

I haven't seen that anywhere, but I'm no expert on secp256k1-zkp. If that's the case though, there's even more reason to avoid creating the huge 1000 bulletproof vector.

antiochp commented 5 years ago

:+1: definitely worth benchmarking what this looks like with batches of 64.

apoelstra commented 5 years ago

You can run the bench_ecmult program in libsecp256k1-zkp to measure the performance. First bulletproof costs 144 ecmults (plus G). Every additional one in a batch adds an extra 12, IIRC.

There has never been an "internal batches of 64" mechanism in libsecp-zkp.
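For reference, a quick sketch of the amortized cost that model implies. The 144 and 12 ecmult figures are taken from the comment above (quoted there from memory), so treat the exact numbers as approximate:

```rust
// Amortized verification cost per proof, assuming the first proof in a
// batch costs ~144 ecmults and each additional proof adds ~12.
fn ecmults_per_proof(batch_size: u32) -> f64 {
    (144.0 + 12.0 * (batch_size as f64 - 1.0)) / batch_size as f64
}

fn main() {
    for n in [1u32, 16, 64, 256, 1000, 2000] {
        println!("batch {:>4}: ~{:.1} ecmults per proof", n, ecmults_per_proof(n));
    }
    // The curve flattens quickly: ~144 for a single proof, ~14.1 at a batch
    // of 64, ~12.1 at 1000 -- diminishing returns past a few dozen proofs.
}
```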

garyyu commented 5 years ago

A batch size of 1,000 for bulletproof verification was an optimized value from earlier benchmarking. I can't find those results now, but I can run it again:

| batch size | average time consumed (µs) |
| --- | --- |
| 100 | 864 |
| 500 | 734 |
| 1,000 | 729 |
| 2,000 | 697 |
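A hedged reading of those numbers, assuming the table reports average time per proof: most of the improvement is already gone by a batch of a few hundred. The sketch below is just arithmetic on the reported values and does not call into libsecp256k1-zkp:

```rust
// Relative improvement between the measured batch sizes above, assuming
// the table is average time per proof.
fn main() {
    // (batch size, average microseconds) copied from the table above
    let measured = [(100u32, 864.0f64), (500, 734.0), (1000, 729.0), (2000, 697.0)];
    for w in measured.windows(2) {
        let (n0, t0) = w[0];
        let (n1, t1) = w[1];
        println!("{:>4} -> {:>4}: {:.1}% faster per proof", n0, n1, (t0 - t1) / t0 * 100.0);
    }
    // 100 -> 500 gives most of the win (~15%); the later steps are only a
    // few percent each, broadly consistent with the cost model above.
}
```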

And regarding memory cost: we already reserve a large scratch space for verification, and it is fixed at 256 MiB regardless of the bulletproof batch size:

const MAX_WIDTH: usize = 1 << 20;
const SCRATCH_SPACE_SIZE: size_t = 256 * MAX_WIDTH;
const MAX_COMMITS_IN_RANGEPROOF: size_t = 1 << 10;
const RANGEPROOF_NBITS: size_t = 64;
const MAX_GENERATORS: size_t = 2 * RANGEPROOF_NBITS * MAX_COMMITS_IN_RANGEPROOF;

If you want to optimize the memory size, you need to define the maximum possible batch size; then we can modify these constants to fit.
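As a small sketch of what those constants work out to, using only the values shown above (types simplified to usize, whereas the actual bindings use libc::size_t):

```rust
// What the constants above imply, purely illustrative.
const MAX_WIDTH: usize = 1 << 20;
const SCRATCH_SPACE_SIZE: usize = 256 * MAX_WIDTH; // 268_435_456 bytes = 256 MiB
const MAX_COMMITS_IN_RANGEPROOF: usize = 1 << 10; // 1024 commitments per batched call
const RANGEPROOF_NBITS: usize = 64; // 64-bit range proved per commitment
const MAX_GENERATORS: usize = 2 * RANGEPROOF_NBITS * MAX_COMMITS_IN_RANGEPROOF; // 131_072

fn main() {
    println!("scratch space: {} MiB", SCRATCH_SPACE_SIZE / (1 << 20));
    println!("max commits per batched call: {}", MAX_COMMITS_IN_RANGEPROOF);
    println!("generators needed at that maximum: {}", MAX_GENERATORS);
    // A smaller maximum batch size (say 64) would let MAX_COMMITS_IN_RANGEPROOF
    // and MAX_GENERATORS shrink accordingly -- hypothetical tuning only, per
    // the point above about fitting the sizes to a defined maximum batch.
}
```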

yeastplume commented 4 years ago

Good to close?