Closed totaam closed 8 years ago
Raising as this will give us hardware acceleration, see #1029. See also #198
r11607 adds a "python-cryptography" backend and the ability to switch between the backends using an env var, some unit tests, etc. It still defaults to "pycrypto" for now, but the performance improvements are noticeable (running from the unit tests):
pycrypto took 165ms python-cryptography took 94ms
(that's encrypting + decrypting 64KB 20 times)
Still TODO:
- packaging updates
- more unit tests
- more testing
- quantify those performance improvements better, on different hardware (Xeon's AVX?)
- make python-cryptography the default
- profit
r11615 + r11616 makes it easier to compare different payload sizes and encryption vs decryption performance. Summary: python-cryptography always wins, sometimes by a huge margin: it isn't just faster overall, it has much lower latency too. This is such a big win that I will seriously consider backporting support to older branches.
I've measured the cipher initialization cost, and that's not the source of the problem: pycrypto just has a much higher call overhead.
Some results on a cheap and old i3 CPU, shown as pairs of elapsed time per iteration in milliseconds: pycrypto first, then python-cryptography. || Payload size: |||| 16B|||| 64B|||| 64KB|||| 1MB|| ||Encryption || 5.1|| 0.2|| 5.0|| 0.2|| 15.2|| 6.5|| 169|| 101|| ||Decryption || 4.6|| 0.2|| 4.7|| 0.3|| 92.5|| 82.4|| 1482|| 1391|| ||Both || 2.5|| 0.1|| 2.7|| 0.2|| 55.4|| 47.9|| 883|| 811|| The ~5ms encryption overhead on a 16 Byte payload is very significant. With bigger payload sizes, the win narrows to about 10%. I'm not sure why they're both so slow at decryption: about 10 times slower than encryption with 1MB of data! (could be large output buffer re-allocation?) The wins could be even more significant on platforms with hardware acceleration (see #876).
@afarr: you may want to run this unit test on some Xeons to see how they fare. And maybe we'll stick this in the wiki somewhere as a justification for switching to python-cryptography.
Ran the tests again on an Opteron system in order to verify that hardware acceleration is used (#1029) - it is. The gains are HUGE. On 64KB packets, more than 20 times faster!
Raising.
For the build platform and packaging support:
- on OSX: cryptography requires openssl. See work in progress patch, only one change needed for the configure step (smo?):
env ARCHFLAGS="-arch i386" ./config --prefix=${PREFIX}
- on win32: follow these instructions: Building cryptography on Windows, I used these win32 binaries: Win32 OpenSSL - you will also need [/attachment/ticket/876/cryptography-build-win32-openssl1.1.patch] for openssl 1.1 or later
- on Linux: r11630 adds the dependency for RPM packaging (all but centos 6.x and earlier), r11637 adds the DEB packaging
Extra changesets needed for a potential backport: r11639, r11636, r11629 These changes are needed for client side support, but could be omitted if we do just a simple server-only backport: r11640, r11637,
osx-openssl.patch
(1.6 KiB)almost complete patch for building openssl on osx as part of the moduleset
@smo: can you help me with this one? I believe that there will be some changes needed to packaging after this is done, on win32 we are missing lots of components (all of openssl..) giving us the stacktraces recorded in #1029
Not sure if this should be posted here or in #1029 - but when trying to test osx client 0.17.0 r11761 (for opus support) — I'm getting the following Error message client side (despite not using any encryption):
Schadenfreude:MacOS Schadenfreude$ ./xpra attach --opengl=on --speaker-codec=opus 2016-01-25 17:45:51,574 Error: encryption library python-cryptography failed validation! 2016-01-25 17:45:51,574 sha1 is not supported for PBKDF2 by this backend. 2016-01-25 17:45:51,615 Xpra gtk2 client version 0.17.0-[r11761](../commit/f8f71dec8c83490b8315fcfea297651405e82dc4)
[https://github.com/pyca/cryptography/issues/2039] has some packaging workarounds. As per #1029#comment:15, the warnings will remain until this is fixed.
- r11776 fixes the packaging for win32 (via some ugly hacks - not all of which may actually be needed) - fixed beta a build posted
- osx is more difficult, the packaging will need similar changes to the win32 ones, but before that we need to build:
@smo: can you please take a look?
Instructions for win32 worked with no issue. I got my precompiled binaries from this site [https://slproweb.com/products/Win32OpenSSL.html] not sure what it is that you used.
Still working on osx as this is my issue no matter what I try I always end up with
-arch x86_64 and -arch i386
when invoking gcc so I will continue to look for a solution. Just for the record this is the type of error I see.making all in crypto/objects... ar r ../../libcrypto.a o_names.o obj_dat.o obj_lib.o obj_err.o obj_xref.o ar: ../../libcrypto.a is a fat file (use libtool(1) or lipo(1) and ar(1) on it) ar: ../../libcrypto.a: Inappropriate file type or format make[2]: *** [lib] Error 1 make[1]: *** [subdirs] Error 1 make: *** [build_crypto] Error 1
It's worth noting that i'm trying to do this on a 64 bit osx system running el capitan
$ sw_vers ProductName: Mac OS X ProductVersion: 10.11.1 BuildVersion: 15B42
Almost complete as of r11865:
- openssl: this works for me and should force a 32-bit build:
./Configure -shared --prefix=${PREFIX} darwin-i386-cc make && make install
the moduleset does that automatically for me with the changes I made - you may need to tweak it for 64-bit builds.. Two things still need fixing for openssl:
- after the install step, it fails... without saying why! running the make install step from a terminal succeeds and returns no error code..
- the libssl and libcrypto libraries may need to be chmoded 755 (this could be a result of the install step failure)
- we needed enum34 and not enum..
- packaging: similar to win32, we also needed to include the libssl dylib, explicitly include the _ssl.so, and run the same backend injection workaround
With all this in place, I now have osx builds with python-cryptography (bumped to 1.2.2 in r11866, updated my win32 VM with the same version and openssl 1.0.2f)
@afarr: this is ready for testing.
(I have changed the ticket title to better reflect what has been done: we may deprecate and eventually drop pycrypto, but not just yet)
Still waiting on an osx client package... but ran a quick test (reminding myself the syntax from 1029#comment:6) with 0.17.0 r11886 win32 client against 0.17.0 11888 fedora 23 server.
With the acceleration, I got the following numbers (seem a little better than those mentioned in #1029):
[jimador@jimador net]$ python crypto_test.py .Encryption Performance: test_perf: size: 16 Bytes pycrypto took 4.5ms: 3KB/s python-cryptography took 0.2ms: 82KB/s test_perf: size: 1024 Bytes pycrypto took 4.5ms: 223KB/s python-cryptography took 0.2ms: 5055KB/s test_perf: size: 1048576 Bytes pycrypto took 13.9ms: 73698KB/s python-cryptography took 6.5ms: 158749KB/s Decryption Performance: test_perf: size: 16 Bytes pycrypto took 4.0ms: 3KB/s python-cryptography took 0.2ms: 83KB/s test_perf: size: 1024 Bytes pycrypto took 4.2ms: 240KB/s python-cryptography took 0.3ms: 3794KB/s test_perf: size: 1048576 Bytes pycrypto took 83.2ms: 12304KB/s python-cryptography took 76.8ms: 13328KB/s Global Performance: test_perf: size: 16 Bytes pycrypto took 2.2ms: 6KB/s python-cryptography took 0.1ms: 138KB/s test_perf: size: 1024 Bytes pycrypto took 2.3ms: 441KB/s python-cryptography took 0.1ms: 6939KB/s test_perf: size: 1048576 Bytes pycrypto took 49.4ms: 20739KB/s python-cryptography took 44.6ms: 22936KB/s . ---------------------------------------------------------------------- Ran 2 tests in 4.461s OK
With acceleration turned off:
[jimador@jimador net]$ OPENSSL_ia32cap="~0x200000200000000" python crypto_test.py .Encryption Performance: test_perf: size: 16 Bytes pycrypto took 4.5ms: 3KB/s python-cryptography took 0.2ms: 80KB/s test_perf: size: 1024 Bytes pycrypto took 4.4ms: 225KB/s python-cryptography took 0.2ms: 5149KB/s test_perf: size: 1048576 Bytes pycrypto took 13.6ms: 75115KB/s python-cryptography took 6.4ms: 159003KB/s Decryption Performance: test_perf: size: 16 Bytes pycrypto took 4.1ms: 3KB/s python-cryptography took 0.2ms: 86KB/s test_perf: size: 1024 Bytes pycrypto took 4.2ms: 239KB/s python-cryptography took 0.2ms: 4007KB/s test_perf: size: 1048576 Bytes pycrypto took 84.6ms: 12104KB/s python-cryptography took 76.1ms: 13464KB/s Global Performance: test_perf: size: 16 Bytes pycrypto took 2.2ms: 7KB/s python-cryptography took 0.1ms: 150KB/s test_perf: size: 1024 Bytes pycrypto took 2.3ms: 437KB/s python-cryptography took 0.1ms: 6924KB/s test_perf: size: 1048576 Bytes pycrypto took 49.6ms: 20648KB/s python-cryptography took 70.7ms: 14490KB/s . ---------------------------------------------------------------------- Ran 2 tests in 4.985s OK
Except for the last entry though, there doesn't seem to be much difference when using a VM as a server.
I guess I'll have to check into what the KVM is doing with the filtering after all.
@afarr: your (identical) numbers show that you are still NOT using any hardware acceleration.
Ran the tests on a hardware machine (mid range i5 4460) in Fedora 23 with trunk r11908:
[root@vorfuehreffekt-spikes-eng net]# python crypto_test.py .Encryption Performance: test_perf: size: 16 Bytes pycrypto took 2.3ms: 6KB/s python-cryptography took 0.1ms: 123KB/s test_perf: size: 1024 Bytes pycrypto took 2.3ms: 433KB/s python-cryptography took 0.1ms: 7936KB/s test_perf: size: 1048576 Bytes pycrypto took 8.9ms: 115468KB/s python-cryptography took 2.3ms: 440013KB/s Decryption Performance: test_perf: size: 16 Bytes pycrypto took 2.1ms: 7KB/s python-cryptography took 0.1ms: 127KB/s test_perf: size: 1024 Bytes pycrypto took 2.2ms: 461KB/s python-cryptography took 0.2ms: 6235KB/s test_perf: size: 1048576 Bytes pycrypto took 57.7ms: 17757KB/s python-cryptography took 49.9ms: 20505KB/s Global Performance: test_perf: size: 16 Bytes pycrypto took 1.2ms: 13KB/s python-cryptography took 0.1ms: 232KB/s test_perf: size: 1024 Bytes pycrypto took 1.2ms: 833KB/s python-cryptography took 0.1ms: 10438KB/s test_perf: size: 1048576 Bytes pycrypto took 34.7ms: 29470KB/s python-cryptography took 28.3ms: 36156KB/s . ---------------------------------------------------------------------- Ran 2 tests in 2.891s OK
and without acceleration:
[root@vorfuehreffekt-spikes-eng net]# OPENSSL_ia32cap="~0x200000200000000" python crypto_test.py .Encryption Performance: test_perf: size: 16 Bytes pycrypto took 2.3ms: 6KB/s python-cryptography took 0.1ms: 121KB/s test_perf: size: 1024 Bytes pycrypto took 2.3ms: 429KB/s python-cryptography took 0.1ms: 7621KB/s test_perf: size: 1048576 Bytes pycrypto took 8.8ms: 115707KB/s python-cryptography took 4.5ms: 225109KB/s Decryption Performance: test_perf: size: 16 Bytes pycrypto took 2.1ms: 7KB/s python-cryptography took 0.1ms: 125KB/s test_perf: size: 1024 Bytes pycrypto took 2.2ms: 459KB/s python-cryptography took 0.2ms: 5684KB/s test_perf: size: 1048576 Bytes pycrypto took 57.5ms: 17813KB/s python-cryptography took 52.2ms: 19616KB/s Global Performance: test_perf: size: 16 Bytes pycrypto took 1.1ms: 13KB/s python-cryptography took 0.1ms: 197KB/s test_perf: size: 1024 Bytes pycrypto took 1.2ms: 829KB/s python-cryptography took 0.1ms: 10266KB/s test_perf: size: 1048576 Bytes pycrypto took 34.5ms: 29716KB/s python-cryptography took 30.8ms: 33215KB/s . ---------------------------------------------------------------------- Ran 2 tests in 2.981s OK
I'm also getting similar results from another Fedora 23 machine with and without the acceleration. I have a feeling that either the acceleration isn't enabled, or we're not disabling it properly since I'm getting such similar numbers across multiple machines.
I also did some research and the CPU on both computers (identical CPUs) and that specific model supports AES encryption and decryption hardware acceleration.
I have a feeling that either the acceleration isn't enabled, or we're not disabling it properly since I'm getting such similar numbers across multiple machines. [[BR]] You need to bear in mind that:
- decryption is not accelerated (yet?) - something I guess we should ask python-cryptography and/or openssl developers about
- since the latency is so much lower already with python-cryptography, you need bigger packet sizes to be able to see the difference (1MB or more) - here we see that the performance is almost doubled
Bar issues with KVM and cpu flags, I think we can close this ticket.
Well, everything looks good except our getting the KVM flags sorted out.
I'll close this and just open another if, once that gets sorted out, I see an issue.
cryptography-build-win32-openssl1.1.patch
(0.5 KiB)fix win32 build against openssl 1.1
Note: in version 2.0, we are dropping support for pycrypto - that project looks dead: no releases in almost 3 years.
Issue migrated from trac ticket # 876
component: core | priority: blocker | resolution: fixed
2015-05-30 08:10:05: antoine created the issue