secure-io / siv-go

Go implementation of AES-SIV-CMAC and AES-GCM-SIV
MIT License

add AES-GCM-SIV amd64 assembler code for AES-CTR and POLYVAL #10

Closed by aead 5 years ago

aead commented 5 years ago

This commit adds AMD64 assembler implementations of AES-CTR (as used by AES-GCM-SIV) and POLYVAL. The assembler implementations are still quite generic and do not yet use possible optimizations such as combining decryption and authentication in Open(...). More sophisticated optimizations of this kind will be introduced over time.
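
Assembler fast paths of this kind are usually chosen at runtime, with the pure Go code kept as a fallback. The following is only a minimal sketch of that selection logic, not the code from this commit: the `features` type and `selectImpl` function are hypothetical names, and the feature flags are passed in by hand. Real code would more likely read them from a package such as golang.org/x/sys/cpu (`cpu.X86.HasAES`, `cpu.X86.HasPCLMULQDQ`).

```go
package main

import "fmt"

// features is a hypothetical stand-in for CPU feature detection.
// It is filled in by hand here so the sketch needs only the standard library.
type features struct {
	HasAES       bool // AES-NI, needed for the AES-CTR assembler path
	HasPCLMULQDQ bool // carry-less multiply, needed for the POLYVAL assembler path
}

// selectImpl returns which implementation a dispatcher would pick.
// Assumption: the assembler path requires amd64 and both CPU features;
// everything else falls back to the generic Go code.
func selectImpl(goarch string, f features) string {
	if goarch == "amd64" && f.HasAES && f.HasPCLMULQDQ {
		return "asm"
	}
	return "generic"
}

func main() {
	fmt.Println(selectImpl("amd64", features{HasAES: true, HasPCLMULQDQ: true})) // asm
	fmt.Println(selectImpl("amd64", features{HasAES: true}))                     // generic
	fmt.Println(selectImpl("arm64", features{}))                                 // generic
}
```

Requiring both flags keeps the decision simple: a machine that lacks either instruction set uses the generic code for the whole AEAD rather than mixing the two paths.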

The AMD64 assembler code significantly improves performance on machines that support the AES-NI and PCLMULQDQ instructions:

name               old time/op    new time/op      delta
AES128GCMSeal64-4    5.24µs ± 0%      0.47µs ± 1%    -91.09%  (p=0.029 n=4+4)
AES128GCMSeal1K-4    57.1µs ± 0%       1.3µs ± 0%    -97.71%  (p=0.029 n=4+4)
AES128GCMSeal8K-4     445µs ± 0%         7µs ± 0%    -98.34%  (p=0.029 n=4+4)
AES128GCMOpen64-4    5.27µs ± 0%      0.48µs ± 0%    -90.82%  (p=0.029 n=4+4)
AES128GCMOpen1K-4    57.2µs ± 0%       1.3µs ± 1%    -97.70%  (p=0.029 n=4+4)
AES128GCMOpen8K-4     444µs ± 0%         7µs ± 0%    -98.34%  (p=0.029 n=4+4)
AES256GCMSeal64-4    5.49µs ± 1%      0.57µs ± 0%    -89.66%  (p=0.029 n=4+4)
AES256GCMSeal1K-4    57.9µs ± 0%       1.5µs ± 0%    -97.45%  (p=0.029 n=4+4)
AES256GCMSeal8K-4     449µs ± 0%         8µs ± 0%    -98.18%  (p=0.029 n=4+4)
AES256GCMOpen64-4    5.49µs ± 0%      0.59µs ± 0%    -89.32%  (p=0.029 n=4+4)
AES256GCMOpen1K-4    57.6µs ± 0%       1.5µs ± 0%    -97.40%  (p=0.029 n=4+4)
AES256GCMOpen8K-4     446µs ± 0%         8µs ± 0%    -98.16%  (p=0.029 n=4+4)

name               old speed      new speed        delta
AES128GCMSeal64-4  12.2MB/s ± 0%   137.1MB/s ± 1%  +1021.43%  (p=0.029 n=4+4)
AES128GCMSeal1K-4  17.9MB/s ± 0%   784.8MB/s ± 0%  +4273.17%  (p=0.029 n=4+4)
AES128GCMSeal8K-4  18.4MB/s ± 0%  1106.5MB/s ± 0%  +5911.82%  (p=0.029 n=4+4)
AES128GCMOpen64-4  12.1MB/s ± 0%   132.2MB/s ± 0%   +989.41%  (p=0.029 n=4+4)
AES128GCMOpen1K-4  17.9MB/s ± 0%   776.9MB/s ± 1%  +4241.63%  (p=0.029 n=4+4)
AES128GCMOpen8K-4  18.4MB/s ± 0%  1107.2MB/s ± 0%  +5907.46%  (p=0.029 n=4+4)
AES256GCMSeal64-4  11.7MB/s ± 1%   112.7MB/s ± 0%   +866.88%  (p=0.029 n=4+4)
AES256GCMSeal1K-4  17.7MB/s ± 0%   692.6MB/s ± 0%  +3813.22%  (p=0.029 n=4+4)
AES256GCMSeal8K-4  18.3MB/s ± 0%  1002.0MB/s ± 0%  +5386.18%  (p=0.029 n=4+4)
AES256GCMOpen64-4  11.7MB/s ± 0%   109.1MB/s ± 0%   +835.68%  (p=0.029 n=4+4)
AES256GCMOpen1K-4  17.8MB/s ± 0%   682.5MB/s ± 0%  +3739.66%  (p=0.029 n=4+4)
AES256GCMOpen8K-4  18.4MB/s ± 0%  1000.1MB/s ± 0%  +5347.14%  (p=0.029 n=4+4)