libressl / portable

LibreSSL Portable itself. This includes the build scaffold and compatibility layer that builds portable LibreSSL from the OpenBSD source code. Pull requests or patches sent to tech@openbsd.org are welcome.
https://www.libressl.org

Backport ARM optimizations #464

Open fancycode opened 8 years ago

fancycode commented 8 years ago

Are there any plans to backport the ARM optimizations for AES / SHA from OpenSSL, or would you accept pull requests for that?

Here are some numbers from running openssl speed aes sha with stock LibreSSL 2.4.2 ("before") and with a manually patched build ("after").

Before on a PINE64 (AARCH64):

The 'numbers' are in 1000s of bytes per second processed.
type           16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
aes-128 cbc   28521.69k   33748.60k   30074.13k   31178.17k   35079.63k
aes-192 cbc   23150.56k   29702.77k   30688.47k   26827.99k   28019.98k
aes-256 cbc   23123.81k   25178.38k   26551.77k   26595.91k   26138.12k
sha1           7921.47k   26396.15k   66696.15k   85532.61k  110228.95k
sha256         8706.44k   20926.16k   37359.95k   43264.19k   43429.97k
sha512         7948.05k   28142.02k   45253.94k   71734.86k   83039.19k

After on a PINE64 (AARCH64):

The 'numbers' are in 1000s of bytes per second processed.
type           16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
aes-128 cbc  104979.10k  152962.08k  178409.00k  185506.59k  188549.29k
aes-192 cbc  100101.85k  143107.10k  165258.15k  171149.19k  165159.05k
aes-256 cbc   93587.91k  122029.78k  137511.71k  141336.30k  142887.92k
sha1          15134.96k   55214.93k  169086.08k  344973.76k  506773.78k
sha256        35288.84k  110453.54k  263977.77k  409388.60k  484152.75k
sha512        10484.83k   42153.70k   68483.89k   98582.39k  113457.81k

Before on an ODROID C1 (armhf):

The 'numbers' are in 1000s of bytes per second processed.
type           16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
aes-128 cbc   22016.59k   24939.71k   26077.01k   26354.35k   26471.08k
aes-192 cbc   19410.53k   21667.78k   22520.49k   22746.45k   22825.64k
aes-256 cbc   17309.67k   19252.37k   19842.82k   19996.33k   20048.55k
sha1           4940.37k   14596.82k   33386.92k   48808.62k   57212.49k
sha256         5062.47k   11233.24k   19538.01k   23876.61k   25427.97k
sha512         1099.09k    4329.09k    6218.50k    8299.52k    9524.57k

After on an ODROID C1 (armhf):

The 'numbers' are in 1000s of bytes per second processed.
type           16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
aes-128 cbc   26508.01k   32215.70k   34116.44k   34744.77k   34783.23k
aes-192 cbc   23741.46k   28279.55k   29733.89k   30125.06k   30236.67k
aes-256 cbc   21457.53k   25170.47k   26343.51k   26738.73k   26744.15k
sha1           6028.02k   18460.64k   43369.58k   66063.02k   77999.35k
sha256         8334.31k   20233.24k   37002.24k   46851.60k   50612.91k
sha512         2489.89k    9956.48k   14277.97k   19523.81k   21796.18k
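
To make the before/after tables easier to compare, here is a minimal sketch that computes the speedup factor (after divided by before) for the 8192-byte column. The figures are copied from the tables above; the `results` dictionary and the `speedups` helper are illustrative names, not part of LibreSSL or OpenSSL.

```python
# Throughput at the 8192-byte block size, in 1000s of bytes per second,
# copied from the tables above as (before, after) pairs.
results = {
    "PINE64 (AARCH64)": {
        "aes-128 cbc": (35079.63, 188549.29),
        "aes-192 cbc": (28019.98, 165159.05),
        "aes-256 cbc": (26138.12, 142887.92),
        "sha1": (110228.95, 506773.78),
        "sha256": (43429.97, 484152.75),
        "sha512": (83039.19, 113457.81),
    },
    "ODROID C1 (armhf)": {
        "aes-128 cbc": (26471.08, 34783.23),
        "aes-192 cbc": (22825.64, 30236.67),
        "aes-256 cbc": (20048.55, 26744.15),
        "sha1": (57212.49, 77999.35),
        "sha256": (25427.97, 50612.91),
        "sha512": (9524.57, 21796.18),
    },
}


def speedups(table):
    """Return {algorithm: after / before} for one board's results."""
    return {name: after / before for name, (before, after) in table.items()}


if __name__ == "__main__":
    for board, table in results.items():
        print(board)
        for name, factor in speedups(table).items():
            print(f"  {name:<12} {factor:5.2f}x")
```

On these numbers the PINE64 gains roughly 5x for aes-128 cbc and about 11x for sha256, while the ODROID C1 gains roughly 1.3x and 2x for the same two algorithms.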

4a6f656c commented 8 years ago

We would consider pull requests, provided the code is appropriately licensed (OpenSSL or BSD) and sufficiently tested (ideally via regress).

fancycode commented 8 years ago

I would look into backporting the code from OpenSSL, so the license should IMHO be OK.

0181532686cf4a31163be0bf3e6bb6732bf commented 6 years ago

@fancycode Any news regarding this? It looks like other architectures are neglected as well (about 3 years behind the mainline OpenSSL version).

@4a6f656c Do you know of a reason why arch-specific assembly is not being backported? I mean, security implications or something similar?

busterb commented 5 years ago

I enabled arm 32-bit optimizations that were already in the tree to begin with. It should help with the cases above.