Note on multithreading
Aggregate verification for Eth2 sync has 2 layers of batching.
Here we batch the verification of multi-signatures over a single message. Taking advantage of elliptic curve properties is enough for this case, and we should see no benefit from multithreading.
But we can also batch the verification of multiple messages, each with its own aggregate signature. This benefits from parallelism and gives a superlinear speedup because we save operations: in particular, we only need 1 final exponentiation per thread, which is about 55% of the pairing work (the Miller Loop being the other 45%), see https://github.com/mratsim/constantine/pull/85
For example, assuming 2ms verification per batch/block, verifying 10 blocks would cost
(1ms Miller Loop + 1ms Final Exponentiation) * 10 = 20ms
but on a dual-core we would pay, per core (not counting overhead, which would be on the order of microseconds, not milliseconds)
1ms Miller Loop * 10/2 + 1ms Final Exponentiation = 6ms
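The arithmetic above can be reproduced with a small cost model. This is only a sketch in Python, not code from blscurve or BLST: the function names are hypothetical, the 1ms Miller Loop and 1ms final exponentiation figures are the rounded assumptions from the example, and threading overhead and the cost of combining per-thread results are ignored.

```python
import math


def sequential_cost_ms(num_batches, miller_ms=1.0, final_exp_ms=1.0):
    """Each batch is verified on its own: one Miller Loop and one
    final exponentiation per batch (assumed costs, in milliseconds)."""
    return num_batches * (miller_ms + final_exp_ms)


def parallel_cost_ms(num_batches, num_threads, miller_ms=1.0, final_exp_ms=1.0):
    """Each thread accumulates the Miller Loops of its share of batches,
    then pays a single final exponentiation. Overhead is not modeled."""
    batches_per_thread = math.ceil(num_batches / num_threads)
    return batches_per_thread * miller_ms + final_exp_ms


if __name__ == "__main__":
    baseline = sequential_cost_ms(10)
    dual_core = parallel_cost_ms(10, 2)
    print(f"10 blocks, verified one by one: {baseline:.1f} ms")
    print(f"10 blocks, batched on 2 cores:  {dual_core:.1f} ms")
    print(f"speedup: {baseline / dual_core:.2f}x")
```

Under these assumptions the model gives 20ms for the one-by-one case and 6ms for the dual-core case, matching the figures above. Calling `parallel_cost_ms(10, 1)` gives 11ms, which suggests that part of the gain comes from sharing the final exponentiation rather than from the extra core.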
This re-adds benchmarks for both BLST and Miracl backends
Oct 26 BLST
For comparison, here is BLST as in current nimbus master (commit 3878b9b, with the https://github.com/supranational/blst/tree/aae0c7d70b799ac269ff5edf29d8191dbd357876 submodule from Oct 26)
Dec 5 BLST
With current blscurve master (commit 282d1f6), using https://github.com/supranational/blst/tree/7cda6fa09bfa9d789bd30b31dc1ae91656ee2f88
Comments
The perf improvements are mentioned in #95
Looking at https://github.com/supranational/blst/compare/aae0c7d70b799ac269ff5edf29d8191dbd357876...7cda6fa09bfa9d789bd30b31dc1ae91656ee2f88#diff-eae0ac250c8fb45d2a6a0ba76423b19a9518cdea138d4802d62c45d2c5ca3afeL514
it seems like I misread in #95: we don't use the cached group checks feature if we directly use blst_core_verify_pk_in_g1