Closed edevil closed 8 years ago
Can you test this with PyPy and see how cryptography on CPython compares to cryptography on PyPy?
PyPy 2.6.0 crashes and burns when I try to use cryptography:
Traceback (most recent call last):
File "<builtin>/app_main.py", line 75, in run_toplevel
File "stream_bench.py", line 25, in <module>
do_cryptography()
File "stream_bench.py", line 12, in do_cryptography
CRYPTO_BACKEND = default_backend()
File "/Users/andre/Downloads/pypy-2.6.0-osx64/site-packages/cryptography-0.9.3-py2.7-macosx-10.10-x86_64.egg/cryptography/hazmat/backends/__init__.py", line 40, in default_backend
_default_backend = MultiBackend(_available_backends())
File "/Users/andre/Downloads/pypy-2.6.0-osx64/site-packages/cryptography-0.9.3-py2.7-macosx-10.10-x86_64.egg/cryptography/hazmat/backends/__init__.py", line 27, in _available_backends
"cryptography.backends"
File "build/bdist.macosx-10.10-x86_64/egg/pkg_resources/__init__.py", line 2361, in resolve
module = __import__(self.module_name, fromlist=['__name__'], level=0)
File "/Users/andre/Downloads/pypy-2.6.0-osx64/site-packages/cryptography-0.9.3-py2.7-macosx-10.10-x86_64.egg/cryptography/hazmat/backends/commoncrypto/__init__.py", line 7, in <module>
from cryptography.hazmat.backends.commoncrypto.backend import backend
File "/Users/andre/Downloads/pypy-2.6.0-osx64/site-packages/cryptography-0.9.3-py2.7-macosx-10.10-x86_64.egg/cryptography/hazmat/backends/commoncrypto/backend.py", line 244, in <module>
backend = Backend()
File "/Users/andre/Downloads/pypy-2.6.0-osx64/site-packages/cryptography-0.9.3-py2.7-macosx-10.10-x86_64.egg/cryptography/hazmat/backends/commoncrypto/backend.py", line 44, in __init__
self._binding = Binding()
File "/Users/andre/Downloads/pypy-2.6.0-osx64/site-packages/cryptography-0.9.3-py2.7-macosx-10.10-x86_64.egg/cryptography/hazmat/bindings/commoncrypto/binding.py", line 44, in __init__
self._ensure_ffi_initialized()
File "/Users/andre/Downloads/pypy-2.6.0-osx64/site-packages/cryptography-0.9.3-py2.7-macosx-10.10-x86_64.egg/cryptography/hazmat/bindings/commoncrypto/binding.py", line 56, in _ensure_ffi_initialized
modules=cls._modules,
File "/Users/andre/Downloads/pypy-2.6.0-osx64/site-packages/cryptography-0.9.3-py2.7-macosx-10.10-x86_64.egg/cryptography/hazmat/bindings/utils.py", line 31, in load_library_for_binding
lib = ffi.verifier.load_library()
File "/Users/andre/Downloads/pypy-2.6.0-osx64/lib_pypy/cffi/verifier.py", line 97, in load_library
return self._load_library()
File "/Users/andre/Downloads/pypy-2.6.0-osx64/lib_pypy/cffi/verifier.py", line 207, in _load_library
return self._vengine.load_library()
File "/Users/andre/Downloads/pypy-2.6.0-osx64/lib_pypy/cffi/vengine_gen.py", line 86, in load_library
self._load(module, 'loaded', library=library)
File "/Users/andre/Downloads/pypy-2.6.0-osx64/lib_pypy/cffi/vengine_gen.py", line 112, in _load
method(tp, realname, module, **kwds)
File "/Users/andre/Downloads/pypy-2.6.0-osx64/lib_pypy/cffi/vengine_gen.py", line 446, in _loaded_gen_constant
value = self._load_constant(is_int, tp, name, module)
File "/Users/andre/Downloads/pypy-2.6.0-osx64/lib_pypy/cffi/vengine_gen.py", line 441, in _load_constant
value = function()
NotImplementedError: constant kCFTypeDictionaryKeyCallBacks: ctype 'CFDictionaryKeyCallBacks' not supported as return value (it is a struct declared with "...;", but the C calling convention may depend on the missing fields)
You'll need to use master rather than the PyPI release.
Using the latest PyPy:
Test 1: PyPy 0.35, CPython 2.98
Test 2: PyPy 630MB/s, CPython 760MB/s
So in the first test, PyPy with cryptography approaches M2Crypto's performance with CPython. However, in the second test it is even worse.
m2crypto is unbelievably annoying to compile (especially on OS X) so I haven't been able to look at this closely, but the small payload encryption overhead is likely due to the fact that many more Python objects are being generated by cryptography.
Streaming is mildly more puzzling. What happens if you raise the size of the read chunk?
I altered the memory allocation to reuse an update buffer (that was micro-optimizing for this particular benchmark and emphatically not a general solution) and was able to get ~10% more performance (on my laptop it went from 1.19GB/sec to 1.31GB/sec).
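A minimal sketch of how the chunk-size question could be explored, assuming Python 3 and using only the standard library. It is not the benchmark script from this issue: `stream_throughput` is a hypothetical helper, and `hashlib.sha256().update` is only a stand-in for an encryptor's `update` call. The buffer reuse shown here is on the read side (`readinto` into a preallocated `bytearray`), which is similar in spirit to, but not the same as, the update-buffer change described above.

```python
import hashlib
import io
import time


def stream_throughput(stream, chunk_size, consume):
    """Feed `stream` to `consume` in chunks, reusing one read buffer.

    Returns (bytes_processed, elapsed_seconds).
    """
    buf = bytearray(chunk_size)       # allocated once, reused for every read
    view = memoryview(buf)
    total = 0
    start = time.perf_counter()
    while True:
        n = stream.readinto(buf)      # fills the existing buffer in place
        if not n:
            break
        consume(view[:n])             # slicing a memoryview does not copy
        total += n
    return total, time.perf_counter() - start


if __name__ == "__main__":
    payload = bytes(16 * 1024 * 1024)  # 16 MiB of zeros as dummy input
    for chunk_size in (16 * 1024, 64 * 1024, 256 * 1024):
        digest = hashlib.sha256()      # stand-in for an encryptor (assumption)
        total, elapsed = stream_throughput(
            io.BytesIO(payload), chunk_size, digest.update
        )
        print("%7d-byte chunks: %.0f MB/s" % (chunk_size, total / elapsed / 1e6))
```

Swapping `digest.update` for the real cipher call would show whether larger chunks amortize the per-call overhead in a given setup.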
@reaperhulk - experimentally it seems M2Crypto is subtly incompatible with swig 3 (the latest). I was able to get something that appears to work by simply doing brew install swig2; pip wheel M2Crypto.
If I increase the chunk size to 4x:
Regarding the increase in object creation by cryptography, is it possible for me to reuse the objects created so as to lower this overhead? For example, could I change the key and reset the encryptor?
I'm not using exactly the same parts of the library, but I'm also interested in the answer to the question @edevil asked above (and whether there are actually any worthwhile savings there). In an SO post (http://stackoverflow.com/questions/31376763/how-to-cope-with-the-performance-of-generating-signed-urls-for-accessing-private), cryptography comes up as an improvement on signing CloudFront URLs with the glacial rsa package. Since our signing code is hot, any savings would be appreciated, but after profiling, it looks like any gain from object reuse would probably be quite small.
@abathur When signing do you need to load different keys or sign repeatedly with the same key?
The SO question seems to indicate cryptography signs in ~1ms while the rsa package takes 25ms. Is that your finding as well, or is it slower? Do you mind sharing the code/branch you used to test?
(I'm in the UK right now so I'm headed for bed but I can hopefully take a look at this tomorrow)
@reaperhulk Same key repeatedly. I've since posted my own answer on that question pondering the use of a shorter key to cut the signing time down. In my own testing (on a slower system, and for our whole signing routine) these values were more like 1.5ms for cryptography and 37ms for rsa. Using a 512-bit key instead of 2048 cut this time down to around 113µs, though I'm not sure whether this is a sane optimization given the limited use, or a terrible idea (see requisite security SE question... http://security.stackexchange.com/questions/94581/how-bad-an-idea-is-intentionally-using-short-rsa-keys-for-signing-cloudfront-pri).
It's just exploratory code as we're looking to start hosting the assets via CloudFront instead of just S3; here's the code (sans sensitive information, but with puns preserved).
import base64
import time

from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import padding

# Load the key once at import time; every call signs with the same key.
crypkey = serialization.load_pem_private_key(
    ("-----BEGIN RSA PRIVATE KEY-----\n"
     ...
     "-----END RSA PRIVATE KEY-----\n"),
    password=None,
    backend=default_backend()
)
wert = padding.PKCS1v15()
squirt = hashes.SHA1()

# CLOUDFRONT_KEY_ID, URL_SAFE and VIDEO_LINK_DURATION are module-level
# values defined elsewhere (omitted here).
def sign_url_cloudfront(uri):
    expires = int(time.time()) + VIDEO_LINK_DURATION
    policy_statement = '{"Statement":[{"Resource":"%s","Condition":{"DateLessThan":{"AWS:EpochTime":%d}}}]}' % (uri, expires)
    # A fresh signer object is created on every call.
    crikey = crypkey.signer(wert, squirt)
    # update() only feeds data in; the signature comes from finalize().
    crikey.update(policy_statement)
    return "{url:}?{params:}Expires={expires:}&Signature={signature:}&Key-Pair-Id={key_pair_id}".format(
        url=uri,
        params="",  # incl & at end if used
        expires=expires,
        signature=base64.b64encode(crikey.finalize()).translate(URL_SAFE),
        key_pair_id=CLOUDFRONT_KEY_ID
    )
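The snippet above uses `URL_SAFE` without showing its definition. As a hedged illustration only (the actual value used in that code is not given), here is one way such a table could look in Python 3, following CloudFront's documented substitutions for base64 output (`+` to `-`, `=` to `_`, `/` to `~`). The helper names `cloudfront_b64` and `canned_policy` are hypothetical, and the policy builder uses `json.dumps` with compact separators instead of the `%` formatting above.

```python
import base64
import json

# Assumed definition: CloudFront replaces +, = and / in base64 output.
URL_SAFE = bytes.maketrans(b"+=/", b"-_~")


def cloudfront_b64(raw):
    """Base64-encode `raw` and apply the CloudFront character substitutions."""
    return base64.b64encode(raw).translate(URL_SAFE).decode("ascii")


def canned_policy(uri, expires):
    """Build the policy statement as compact JSON with no extra whitespace."""
    policy = {
        "Statement": [
            {
                "Resource": uri,
                "Condition": {"DateLessThan": {"AWS:EpochTime": expires}},
            }
        ]
    }
    return json.dumps(policy, separators=(",", ":"))
```

Building the policy with `json.dumps` avoids quoting mistakes if a URI ever contains a double quote, at the cost of a little extra work per call.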
My answer on SO (http://stackoverflow.com/a/31551075/307542) includes a profile of 100 runs at each key length. When I was just looking at the results while using the 2048-bit key it seemed unlikely there'd be much cause for chasing small optimizations, but the profile at the shorter key length suggests there might be some benefit to an interface geared towards repetitious single-key signing of short messages which minimizes object creation.
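For anyone wanting to repeat that kind of measurement, a rough sketch of profiling 100 runs with the standard library follows. It assumes Python 3; `profile_runs` is a hypothetical helper and `stand_in_sign` is a placeholder (a SHA-1 digest, not an RSA signature) that would be swapped for the real signing routine.

```python
import cProfile
import hashlib
import pstats


def profile_runs(func, arg, runs=100):
    """Call func(arg) `runs` times under cProfile and return the stats."""
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(runs):
        func(arg)
    profiler.disable()
    return pstats.Stats(profiler)


def stand_in_sign(message):
    # Placeholder workload (assumption); replace with the real signing call.
    return hashlib.sha1(message).digest()


if __name__ == "__main__":
    stats = profile_runs(stand_in_sign, b"policy statement")
    # Show the five most expensive entries by cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
```

Comparing the cumulative time of the signing primitive against the surrounding Python calls gives a sense of how much object creation could matter at each key length.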
Let me know if you want me to take this to a new issue :)
> experimentally it seems M2Crypto is subtly incompatible with swig 3 (the latest).
(Just a side note) I am trying to revive M2Crypto on GitLab, and I need as many people as possible to test the patches there, which should, among other things, help fix the problems with the current SWIG. For more, see https://github.com/mcepl/M2Crypto/issues/5
</off-topic-comment>
I'm going to close this as there's not much we can do (besides suggest PyPy) for encrypting lots of small strings, while we have a way forward with #3119 for speeding up large streaming calls.
I'm trying to replace M2Crypto with cryptography, but in the two scenarios I use, cryptography performs significantly slower than M2Crypto. I've extracted two small scripts that demonstrate my use cases.
1- Encrypting several small strings with different keys:
Running time:
So, in this case cryptography is 10x slower.
2- Encrypting a stream:
Performance:
In this case I get roughly 1GB/s with M2Crypto, while cryptography tops out at 760MB/s.
Are these corner cases that cryptography simply does not yet handle correctly, or is this expected due to the difference in architecture between the libraries?