Closed by tchaloupka 8 years ago
I've been putting this off for a while now, but it would require some ASM optimizations like the ones here: https://github.com/randombit/botan/blob/master/src/lib/math/mp/mp_x86_64/mp_asmi.h
Perf gives me the following hot spots:
23.11% benchmark benchmark [.] _D5botan4math2mp7mp_core8word_addFNammPmZm
16.26% benchmark benchmark [.] _D5botan4math2mp7mp_core10word_madd3FNammmPmZm
9.91% benchmark benchmark [.] _D5botan4math2mp7mp_core10word_madd2FNammPmZm
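For anyone wondering why these three show up at the top: they are the per-word primitives that every big-number multiply and add goes through. Below is a minimal portable sketch in C of what they compute, assuming the usual Botan semantics (64-bit words, carry passed through a pointer) and matching the argument shapes visible in the mangled names. It relies on the `unsigned __int128` extension from GCC/Clang, and `linmul` is a hypothetical helper added only to show the inner loop; none of this is the actual Botan or D source.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t word;

/* Returns x + y + *carry (mod 2^64) and stores the outgoing carry (0 or 1). */
static inline word word_add(word x, word y, word *carry)
{
    word z = x + y;
    word c1 = (z < x);            /* overflow from x + y */
    z += *carry;
    *carry = c1 | (z < *carry);   /* overflow from adding the incoming carry */
    return z;
}

/* Computes a * b + *c; returns the low word, stores the high word in *c. */
static inline word word_madd2(word a, word b, word *c)
{
    unsigned __int128 t = (unsigned __int128)a * b + *c;
    *c = (word)(t >> 64);
    return (word)t;
}

/* Computes a * b + c + *d; returns the low word, stores the high word in *d. */
static inline word word_madd3(word a, word b, word c, word *d)
{
    unsigned __int128 t = (unsigned __int128)a * b + c + *d;
    *d = (word)(t >> 64);
    return (word)t;
}

/* Hypothetical helper: z[0..n] = x[0..n-1] * y, one row of schoolbook
   multiplication. Every word of the operand costs one word_madd2 call. */
static inline void linmul(word *z, const word *x, size_t n, word y)
{
    word carry = 0;
    for (size_t i = 0; i < n; ++i)
        z[i] = word_madd2(x[i], y, &carry);
    z[n] = carry;
}
```

An RSA operation runs loops like `linmul` a very large number of times, so if the compiler does not inline these helpers and map them to the hardware multiply and add-with-carry instructions, the call overhead dominates. That is the gap the hand-written ASM is meant to close.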
I'll try to merge a fix for this soon.
I think LDC is going to be needed here. I'll invest some time on compiling with that instead.
Yep, I would also not bother much with dmd on this and it would be nice to have some numbers from LDC or GDC.
The plan was to add LDC support as soon as LDC supports 2.067. I can't use asm pure nothrow with the current version.
I'm guessing the optimizations from LLVM will close the gap on this benchmark. dmd is known to be missing a lot of codegen optimizations, so 46x is plausible given the complexity of these algorithms and the optimization opportunities that other compilers can exploit.
I added an OpenSSL engine that pipes all big-number operations through OpenSSL, and also added LDC support. It's still 5-6x slower for RS256, down from 46x. I'm going to improve it to the point where it pipes public key operations directly through the high-level OpenSSL functions like RSA_sign. I think it's going to be hard to beat OpenSSL by manually tweaking the performance for these, because LDC doesn't have manual inlining yet, so there's a lot of overhead that can't be eliminated.
I decided to put this through perf, and apparently the problem was loadKey doing a lot of checks. Putting the private key in a static variable reduces the difference with OpenSSL to about 3.5x on x86_64, which I deem more acceptable. However, I will add encryption/decryption/signing/verification engines for high-level crypto objects in the OpenSSL engine to make up for the performance gap when it is needed.
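To illustrate the static-variable trick in isolation, here is a rough sketch in C. Everything in it is hypothetical: `private_key`, `load_key`, `cached_key` and `load_count` are stand-ins made up for the example, not the Botan or JWTD API. The point is only that the expensive load-and-validate path runs once instead of once per signature.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a parsed and validated private key. */
typedef struct {
    unsigned long modulus_bits;
    bool checked;
} private_key;

/* Counts how many times the expensive path ran (for illustration only). */
static int load_count = 0;

/* Hypothetical expensive loader: a real one would parse the PEM data
   and run consistency checks on the key material. */
static inline private_key load_key(const char *pem)
{
    (void)pem;
    ++load_count;
    private_key k = { 2048, true };
    return k;
}

/* Loads the key on first use and returns the cached copy afterwards.
   Assumptions: single-threaded use, and the same key on every call
   (a different pem passed later is ignored). */
static inline const private_key *cached_key(const char *pem)
{
    static private_key key;
    static bool loaded = false;
    if (!loaded) {
        key = load_key(pem);
        loaded = true;
    }
    return &key;
}
```

In a benchmark loop that signs many tokens with the same key, calling something like `cached_key` instead of the loader keeps the key checks out of the measured hot path, which is the effect described above.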
Out of curiosity I tried a simple benchmark of the JWTD library (https://github.com/chalucha/jwtd/tree/benchmark/benchmark).
With dmd-2.068 it resulted in:
dub -c openssl -b release
dub -c botan -b release
JWT None uses neither OpenSSL nor Botan, so it's the same in both. There is a huge difference (48x) with RS256.
I know that DMD is bad for any benchmarks, but unfortunately it does not build for me with either of:
GDC (Gentoo 4.8.4 p1.6, pie-0.6.1) 4.8.4
LDC - the LLVM D compiler (0.15.1) based on DMD v2.066.1 and LLVM 3.6.0