kste / sha256_avx

A fast implementation for parallel processing of single-block SHA-256.
MIT License

suggestion: more detailed benchmarks #1

Open thiemonagel opened 7 years ago

thiemonagel commented 7 years ago

2.7 cycles/byte looks very interesting. How does that compare against other implementations at a 64-byte input size? (It seems that benchmarks typically quote cycles/byte for large input sizes.)

kste commented 7 years ago

I have some additional benchmarks available in https://eprint.iacr.org/2017/898.pdf, which also compares some of the most promising candidates for hashing short inputs on different platforms (Figures 5 and 6).

The implementations are also available here (but you might have to extract them from the SPHINCS implementation for x86): https://github.com/kste/sphincs

thiemonagel commented 7 years ago

Thanks! From a user perspective, I'd be most interested in how your implementation compares to other implementations, e.g. how many cycles/byte would OpenSSL take for a 64-byte input?

kste commented 6 years ago

Ah, I see. For Skylake, https://bench.cr.yp.to/results-hash.html reports 19.66 cycles/byte (for the "OpenSSL_1.0.2g__1_Mar_2016" implementation) for processing 64 bytes. However, this implementation also pads the message, so the 64-byte column actually corresponds to processing two message blocks.
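
To illustrate the padding point: standard SHA-256 padding appends one 0x80 byte and an 8-byte length field, then zero-fills up to a multiple of the 64-byte block size. A minimal sketch of the resulting block count follows; the function name `sha256_padded_blocks` is made up for this illustration and is not part of OpenSSL or this repository.

```python
def sha256_padded_blocks(msg_len: int) -> int:
    """Number of 64-byte blocks SHA-256 compresses for a msg_len-byte message.

    Hypothetical helper for illustration only.
    Padding adds 1 byte (0x80) plus an 8-byte length field, so the padded
    length is msg_len + 9 rounded up to a multiple of 64.
    """
    if msg_len < 0:
        raise ValueError("msg_len must be non-negative")
    return (msg_len + 9 + 63) // 64


for n in (8, 55, 56, 64):
    print(n, sha256_padded_blocks(n))
# prints:
# 8 1
# 55 1
# 56 2
# 64 2
```

So a padded hash of an 8-byte input compresses one block, while a padded 64-byte input compresses two.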

For a single block, extrapolating from the 8-byte column, I would estimate it to be ~11 cycles/byte.
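
As a rough cross-check of this kind of conversion, a reported cycles/byte figure for a padded hash can be turned into cycles per byte of compression-function input. The sketch below assumes standard SHA-256 padding and ignores any fixed per-call overhead; the helper name `per_block_cycles_per_byte` is hypothetical. Only the 19.66 cycles/byte figure for 64-byte inputs comes from this thread; the 8-byte column value is not reproduced here.

```python
def per_block_cycles_per_byte(reported_cpb: float, msg_len: int) -> float:
    """Convert a reported cycles/byte figure (padded hash of msg_len bytes)
    into cycles per byte of 64-byte block actually compressed.

    Hypothetical helper; assumes standard SHA-256 padding (1 + 8 extra bytes)
    and no fixed per-call overhead.
    """
    block_size = 64
    blocks = (msg_len + 9 + block_size - 1) // block_size  # blocks after padding
    total_cycles = reported_cpb * msg_len                   # cycles for the whole call
    return total_cycles / blocks / block_size


# 64-byte column quoted above: two blocks are compressed.
print(round(per_block_cycles_per_byte(19.66, 64), 2))  # prints 9.83
```

For an 8-byte input only one block is compressed, so the same function simply divides the reported figure by 8. Because fixed overhead is not modelled, the two columns need not give the same per-block number, which may explain the gap between this value and the ~11 cycles/byte estimate.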

thiemonagel commented 6 years ago

Cool, thanks. Maybe a useful addition to README.md?