Following #256, I have further improved decryption performance when `dst` and `src` do not overlap. Because the whole ciphertext is already known during decryption, the XOR step can use the SIMD XOR functions from the Go standard library. I also eliminated some bounds checks after inspecting the generated assembly.
`//go:linkname` is used for backward compatibility, since the `XORBytes` function was not exported until Go 1.20.
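To illustrate the idea, here is a minimal sketch rather than the code in this PR. It assumes Go 1.20+ and calls `crypto/subtle.XORBytes` directly instead of going through `//go:linkname`. The names `xorKeystream` and `anyOverlap` are hypothetical helpers made up for this example.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"unsafe"
)

// anyOverlap reports whether x and y share any memory.
// Hypothetical helper, written in the style of the standard library's
// internal alias checks.
func anyOverlap(x, y []byte) bool {
	return len(x) > 0 && len(y) > 0 &&
		uintptr(unsafe.Pointer(&x[0])) <= uintptr(unsafe.Pointer(&y[len(y)-1])) &&
		uintptr(unsafe.Pointer(&y[0])) <= uintptr(unsafe.Pointer(&x[len(x)-1]))
}

// xorKeystream writes src XOR ks into dst and returns the number of bytes
// written. Hypothetical helper: it takes the bulk XOR path only when dst
// and src do not overlap, and otherwise falls back to a byte loop.
func xorKeystream(dst, src, ks []byte) int {
	n := len(src)
	if len(ks) < n {
		n = len(ks)
	}
	if len(dst) < n {
		panic("xorKeystream: dst too short")
	}
	// Reslicing to a common length lets the compiler drop per-iteration
	// bounds checks in the fallback loop below.
	dst, src, ks = dst[:n], src[:n], ks[:n]

	if !anyOverlap(dst, src) {
		// Fast path: one bulk call (available since Go 1.20).
		return subtle.XORBytes(dst, src, ks)
	}
	// Fallback for overlapping buffers (for example in-place use).
	for i := range dst {
		dst[i] = src[i] ^ ks[i]
	}
	return n
}

func main() {
	src := []byte{0x01, 0x02, 0x03, 0x04}
	ks := []byte{0xff, 0xff, 0xff, 0xff}
	dst := make([]byte, len(src))
	xorKeystream(dst, src, ks)
	fmt.Printf("%x\n", dst) // prints "fefdfcfb"
}
```

The overlap check is what keeps the fast path safe: the bulk call is only taken when writing to `dst` cannot clobber bytes of `src` that have not been read yet.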
Sorry for the delay on this PR. 😢
Before (encryption and decryption have the same performance):
After:
Benchmark data was collected on a GitHub Codespaces virtual machine.