dalek-cryptography / ed25519-dalek

Fast and efficient ed25519 signing and verification in Rust.
BSD 3-Clause "New" or "Revised" License
681 stars 225 forks source link

Signature verification performance gap #87

Closed bagavi closed 5 years ago

bagavi commented 5 years ago

As per the Ed25519 paper, the single signature verification should take 8 micro seconds. However, the code takes 717 micro seconds per verification step.

What is the main reason behind this gap? Am I missing something?

tarcieri commented 5 years ago

I'm not sure where the paper claims "8 micro seconds". Here's what the paper claims:

This paper shows that a $390 mass-market quad-core 2.4GHz Intel Westmere (Xeon E5620) CPU can create 109000 signatures per second and verify 71000 signatures per second

Note "quad-core" CPU (weird way to present these benchmarks as opposed to per-core performance, but I digress). So to ballpark the single-thread performance it's more like 71000 / 4 = 17750, or ~56 microseconds per signature per core.

I happen to maintain Signatory, a multi-provider signature crate which also functions as a multi-backend benchmarking suite which tests across multiple Ed25519 Rust ecosystem crates including ring and sodiumoxide. All three libraries, per my last benchmark on an Intel Xeon E3-1225 v5 @ 3.30GHz, perform roughly the same as the paper. You can see details here:

https://github.com/tendermint/signatory#ed25519-providers

ed25519-dalek (with I believe the avx2 backend) performed the best with ~50 microseconds per signature verification performance. That's slightly slower than the benchmarks given in the README for this crate, which are ~45 microseconds on a Intel Skylake i9-7900X running at 3.30 GHz.

717 microseconds to verify a single signature seems egregiously slow. What is your benchmarking environment? Are you doing a release build?

bagavi commented 5 years ago

My bad. I forgot to take "quad-core" into account.

Release build gave me 51 micro seconds! Thanks for the help.

hdevalence commented 5 years ago

I'm not sure how the paper computes the numbers given in the abstract, but I think they're batch-verification numbers.