cloudflare / circl

CIRCL: Cloudflare Interoperable Reusable Cryptographic Library
http://blog.cloudflare.com/introducing-circl

Remove scalar sha3 amd64 assembly #429

Closed bwesterb closed 1 year ago

bwesterb commented 1 year ago

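Somewhat surprisingly, removing the scalar amd64 assembly leads to a small speedup. On the machine below, the permutation itself is about 6% faster, and most of the hashing benchmarks are either roughly 4 to 5% faster or statistically unchanged. The one regression, Sha3_512_MTU, comes from a noisy run (±23%). Results will obviously vary per platform, but unless assembly gives a dramatic and clear speedup, we shouldn't bother maintaining it.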

Benchmarks on an Intel(R) Core(TM) i5-1038NG7 CPU @ 2.00GHz (old: with the assembly, new: pure Go):

name                   old time/op   new time/op   delta
PermutationFunction-8    378ns ± 1%    355ns ± 3%  -6.12%  (p=0.000 n=10+9)
Sha3_512_MTU-8          7.73µs ± 1%   8.45µs ±23%  +9.30%  (p=0.003 n=9+10)
Sha3_384_MTU-8          5.61µs ± 4%   5.65µs ±12%    ~     (p=0.853 n=10+10)
Sha3_256_MTU-8          4.47µs ± 6%   4.50µs ±13%    ~     (p=0.579 n=10+10)
Sha3_224_MTU-8          4.21µs ± 4%   4.06µs ± 5%  -3.67%  (p=0.001 n=10+10)
Shake128_MTU-8          3.62µs ± 2%   3.43µs ± 2%  -5.30%  (p=0.000 n=9+10)
Shake256_MTU-8          3.93µs ± 2%   3.77µs ± 4%  -4.06%  (p=0.000 n=10+10)
Shake256_16x-8          55.3µs ± 1%   54.8µs ± 5%    ~     (p=0.315 n=9+10)
Shake256_1MiB-8         3.03ms ± 3%   3.08ms ± 8%    ~     (p=0.353 n=10+10)
Sha3_512_1MiB-8         5.61ms ± 3%   5.37ms ± 2%  -4.20%  (p=0.000 n=10+10)

name                   old speed     new speed     delta
PermutationFunction-8  530MB/s ± 1%  564MB/s ± 3%  +6.53%  (p=0.000 n=10+9)
Sha3_512_MTU-8         175MB/s ± 1%  161MB/s ±19%  -7.76%  (p=0.003 n=9+10)
Sha3_384_MTU-8         241MB/s ± 4%  240MB/s ±11%    ~     (p=0.853 n=10+10)
Sha3_256_MTU-8         302MB/s ± 5%  301MB/s ±12%    ~     (p=0.579 n=10+10)
Sha3_224_MTU-8         321MB/s ± 4%  333MB/s ± 5%  +3.83%  (p=0.001 n=10+10)
Shake128_MTU-8         373MB/s ± 2%  394MB/s ± 2%  +5.62%  (p=0.000 n=9+10)
Shake256_MTU-8         343MB/s ± 3%  358MB/s ± 4%  +4.24%  (p=0.000 n=10+10)
Shake256_16x-8         296MB/s ± 1%  299MB/s ± 5%    ~     (p=0.315 n=9+10)
Shake256_1MiB-8        347MB/s ± 3%  341MB/s ± 7%    ~     (p=0.353 n=10+10)
Sha3_512_1MiB-8        187MB/s ± 3%  195MB/s ± 2%  +4.38%  (p=0.000 n=10+10)
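
For readers cross-checking the two tables: the speed column is just bytes per operation divided by time per operation, which is how Go's `testing` package derives MB/s once a benchmark calls `b.SetBytes`. The sketch below redoes that arithmetic for a few rows. The byte counts are assumptions inferred from the reported numbers (200 bytes for the permutation state, 1350 bytes for the MTU-sized input), and the helper names are hypothetical, not part of circl.

```go
package main

import "fmt"

// throughputMBps converts time per operation (in nanoseconds) into MB/s,
// where 1 MB = 1e6 bytes, matching the unit Go benchmarks report.
func throughputMBps(bytesPerOp int, nsPerOp float64) float64 {
	// bytes per nanosecond equals GB/s, so scale by 1000 to get MB/s.
	return float64(bytesPerOp) / nsPerOp * 1000
}

// relativeChange returns the change from oldVal to newVal in percent.
// Negative values mean newVal is smaller (faster, for time/op).
func relativeChange(oldVal, newVal float64) float64 {
	return (newVal - oldVal) / oldVal * 100
}

// report formats one row: old and new throughput plus the time delta.
func report(name string, bytesPerOp int, oldNs, newNs float64) string {
	return fmt.Sprintf("%-20s %6.1f MB/s -> %6.1f MB/s (time %+.1f%%)",
		name,
		throughputMBps(bytesPerOp, oldNs),
		throughputMBps(bytesPerOp, newNs),
		relativeChange(oldNs, newNs))
}

func main() {
	// Assumed sizes: 200-byte Keccak state, 1350-byte MTU-sized message.
	fmt.Println(report("PermutationFunction", 200, 378, 355))
	fmt.Println(report("Shake128_MTU", 1350, 3620, 3430))
	fmt.Println(report("Sha3_512_MTU", 1350, 7730, 8450))
}
```

Because these inputs are the rounded means printed above, the computed deltas can differ from benchstat's in the last digit; benchstat works from the unrounded samples.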