Closed decaz closed 4 years ago
One of the biggest challenges in this project is maintaining bindings which we do not consume either in this project or in pyOpenSSL. As part of 3.1 we removed a significant number of bindings that we don't use, and this was one of them. Could you explain the exact way you use this? I'd like to create a minimal PKCS7 signing interface in cryptography that you can use instead of directly calling the binding. That way you have a higher level API and also don't have the risk of us breaking you when we need to modify the bindings for whatever reason (which include wanting smaller binaries, compatibility with additional OpenSSL versions/forks, reducing RAM consumption when compiling, et cetera).
@reaperhulk I'm using it to sign data with S/MIME (refs #1621). Here is the function I use:
from cryptography.hazmat.backends.openssl.rsa import _RSAPrivateKey
from cryptography.hazmat.backends.openssl.x509 import _Certificate
from cryptography.hazmat.bindings.openssl.binding import Binding as SSLBinding
def sign_pkcs7(data: bytes, cert: _Certificate, pkey: _RSAPrivateKey) -> bytes:
flags = SSLBinding.lib.PKCS7_BINARY | SSLBinding.lib.PKCS7_DETACHED
bio_in = SSLBinding.lib.BIO_new_mem_buf(data, len(data))
try:
pkcs7 = SSLBinding.lib.PKCS7_sign(
cert._x509, pkey._evp_pkey, SSLBinding.ffi.NULL, bio_in, flags
)
finally:
SSLBinding.lib.BIO_free(bio_in)
bio_in = SSLBinding.lib.BIO_new_mem_buf(data, len(data))
try:
bio_out = SSLBinding.lib.BIO_new(SSLBinding.lib.BIO_s_mem())
try:
SSLBinding.lib.SMIME_write_PKCS7(bio_out, pkcs7, bio_in, flags)
result_buffer = SSLBinding.ffi.new('char**')
buffer_length = SSLBinding.lib.BIO_get_mem_data(
bio_out, result_buffer
)
output = SSLBinding.ffi.buffer(result_buffer[0], buffer_length)[:]
finally:
SSLBinding.lib.BIO_free(bio_out)
finally:
SSLBinding.lib.BIO_free(bio_in)
return output
So you're generating a detached binary signature from some bytes, a cert, and a private key? Are there other common flags we need to consider supporting in an API? e.g. what about attached sigs and non-binary (does it PEM encode then?).
Yes, it's detached binary signature from some bytes, certificate and private key which I'm getting by calling load_pem_x509_certificate
and load_pem_private_key
functions respectively.
About flags, I don't know anything about other of them but I think it would be nice to have them as an API parameter. Maybe someone in the community will need different flags.
We ran into the same problem with version 3.1. The snippet which uses PKCS7
related API: https://github.com/ros2/sros2/blob/5d57584af3db0a5542ccfbf3d9ee43740ecbcca4/sros2/sros2/api/_utilities.py#L135-L171
Okay, thanks for the information. I want to think a bit about API here, but I'm leaning towards a minimal initial API with pkcs7_detached_sign
. We can add more methods (like verification!) as consumers express their needs. We'll also need to support the NO_BINARY
flag (for binary data that should not be converted to MIME canonical format) as well as PKCS7_TEXT
to set the MIME headers when signing. This could look like a list of enums passed as an argument. Finally, I want to support setting the hash since the default is SHA1 (this can be done via an obnoxious dance within OpenSSL using a few additional APIs). Many users may require SHA1 and that's fine, but I refuse to build an API that only allows a bad hash function. ๐
The ultimate goal here is to support what we need while leaking as little OpenSSL specific behavior as possible through our abstraction.
One of the biggest challenges in this project is maintaining bindings which we do not consume either in this project or in pyOpenSSL. As part of 3.1 we removed a significant number of bindings that we don't use, and this was one of them. [...] I'd like to create a minimal PKCS7 signing interface in cryptography that you can use instead of directly calling the binding. [...] The ultimate goal here is to support what we need while leaking as little OpenSSL specific behavior as possible through our abstraction.
That makes perfect sense, although depending on the timeline of the new API, would it make sense to add the binding back until the API is ready? Or do you think we'll be able to get this relatively quickly?
As currently scoped this doesnโt seem too difficult. Iโm going to put it in our next milestone.
Glad to hear it, would love to help test!
Question for everyone: Both of these code samples pass the data in both PKCS7_sign
and SMIME_write_PKCS7
. This appears to create an additional pkcs7-data
structure. Is this intended?
@reaperhulk I didn't quite understand the question. Did you mean if we need this intermediate data (pkcs7
variable) at the output in the form of some kind of structure?
You're passing the data you want to sign twice -- in both PKCS7_sign
itself and also in SMIME_write_PKCS7
. From some quick experiments this generates a significantly different final output than just passing it one or the other location. It's unclear to me why you are passing it twice though. Is there a specific reason or did you just write some code and verify that it worked on the other end?
I'm reluctant to write something that generates this specific structure without understanding the cases where you do and don't want to do this.
@reaperhulk the function I use is not written by me, - I found it by reference within #1621. After remembering I see it was https://github.com/ros2/sros2/pull/129 and there is comment which describes why there is a second buffer was used:
# PKCS7_sign consumes the buffer; allocate a new one again to get it into the final document
.
It's been a while since I wrote the function @decaz is referencing, but if I recall it made the difference between writing only the sig to the file versus writing the file + sig. I won't claim much expertise though, that function took me forever to get working properly (there's a reason I want to see this properly exposed via a sane API), so if you see something wrong please do shout.
difference between writing only the sig to the file versus writing the file + sig.
That's my recollection as well. Without this finagling, the content being signed remains absent within the enclosing signature.
Okay, so passing detached on the first call gives you a signature and then the data gets embedded in the second call -- or at least that's the idea. So this isn't a detached SMIME output -- it embeds the data (which makes sense given the pkcs7-data
structure I saw). I will investigate a bit more, see if there are multiple permutations of these APIs that can produce identical output (probably! OpenSSL's APIs here are just terrible), and think a bit more about what this API should look like.
I think this might be the relevant example of this in the openssl smime demos:
I've pushed a branch (https://github.com/reaperhulk/cryptography/tree/pkcs7-bindings) that has a new module smime with a detached_smime_sign
function. There are some missing sanity checks, some unanswered questions about what smime options should be allowed (currently binary and text), but you can see example usage in test_smime.py
. If anyone here can test this and determine whether or not it works for their use case that'd be great.
@reaperhulk I tested it and signing is working. But verification is not on my side and it failed. I guess the reason is that md
argument is not NULL as it was in original snippet. Can you make hash_algorithm
parameter to be optional?
@decaz if you pass hashes.SHA1()
it should be the same as NULL (since SHA1 is chosen by default). If that doesn't work we can try NULL, but I suspect I've got other bugs ๐
@reaperhulk I tested with hash_algorithm=hashes.SHA1
.
Hmm, okay. I wonder if PKCS7_sign_add_signer
can't be made to work exactly like a single PKCS7_sign
call...
I've pushed a version that passes NULL to PKCS7_sign_add_signer
no matter what (which of course makes it SHA1 no matter what you pass). I doubt this will work, but let's eliminate it before I need to figure out how to build a verifier to test more.
@reaperhulk I tried again with the latest commit and unfortunately the result is the same =/
Thanks for the testing, I'll look into this more deeply soon.
Might be worth adding the smime verification function so the python tests could close the loop with a end-to-end sign/verify test.
I've also been using these unsupported functions (sorry!). I rely on them for a signature that I need to authenticate with the Argentinain tax agency.
My signing function is quite self contained though:
PKCS7_NOSIGS = 0x4 # defined in pkcs7.h
def create_embeded_pkcs7_signature(data: bytes, cert: str, key: str):
"""
Creates an embeded ("nodetached") pkcs7 signature.
This is equivalent to the output of::
openssl smime -sign -signer cert -inkey key -outform DER -nodetach < data
"""
assert isinstance(data, bytes)
assert isinstance(cert, str)
try:
pkey = crypto.load_privatekey(crypto.FILETYPE_PEM, key)
signcert = crypto.load_certificate(crypto.FILETYPE_PEM, cert)
except crypto.Error as e:
raise exceptions.CorruptCertificate from e
bio_in = crypto._new_mem_buf(data)
pkcs7 = crypto._lib.PKCS7_sign(
signcert._x509, pkey._pkey, crypto._ffi.NULL, bio_in, PKCS7_NOSIGS
)
bio_out = crypto._new_mem_buf()
crypto._lib.i2d_PKCS7_bio(bio_out, pkcs7)
signed_data = crypto._bio_to_string(bio_out)
return signed_data
The notable difference is that, apparently, no-one else is using PKCS7_NOSIGS
. Aside from that, it would seem that branch pkcs7-bindings would work for me.
BTW: I really appreciate your effort here!
@WhyNotHugo Thanks for providing your use case! I'm intrigued by PKCS7_NOSIGS
. Looking at the source code it doesn't look like PKCS7_sign
does anything with that flag; it only appears relevant within PKCS7_verify
(where it prevents checking signatures on the data). Do you see failures if you don't pass PKCS7_NOSIGS
?
Re replaced PKCS7_NOSIGS
with 0
and all my tests pass (including live integration tests). :man_facepalming:
I wrote this code many years ago, I've no idea how it ended up that way. Thanks for the feedback! :)
@WhyNotHugo ๐ OpenSSL's flags are baffling in the best of times so no surprise there!
I've been working on this some more and I think I may re-architect the branch pretty significantly. Instead of a function I think this is actually a builder (like the X509 builders). So it will look more like:
builder = smime.SMIMEBuilder()
sig = builder.add_signer(
certificate, key, hashes.SHA256()
).add_signer(
certificate2, key2, hashes.SHA1()
).add_data(
my_data
).sign(encoding, options)
Options will be a list of enum elements tentatively defined as:
class SMIMEOptions(Enum):
Text
Binary
DetachedSignature
NoCertificates
This model allows for multiple signers, different hash functions per signer, et cetera. Of course, we still have the problem of getting it right. These APIs allow for entirely too many permutations. For both my own notes and some context for people joining this thread:
PKCS7_sign
can directly sign, but it can also have a PKCS7_PARTIAL
flag passed. If you pass that flag then you don't have to supply anything in this call. If you sign with this call it will only use SHA1.PKCS7_sign_add_signer
allows alternate hash algorithms and multiple signers. It also has a flags
argument.PKCS7_PARTIAL
(which we want to do so we can use alternate hash algorithms) PKCS7_final
can be used to finalize the structure. However, SMIME_write_PKCS7
also finalizes it. Both take flags
. I hope SMIME_write_PKCS7
does not re-sign if the structure is already finalized, but I haven't confirmed that.i2d_PKCS7_bio
does not finalize the structure, but i2d_PKCS7_bio_stream
does (this interplays with PKCS7_STREAM
, which will not be supported at this time).As you might imagine, I have some concerns about potential issues across all these different flags
and finalization paths. ๐
What about having a function that returns a signed object:
signed = pkcs7_sign(data, cert, key, options)
signed_twice = pkcs7_sign(signed, cert2, key2, options)
print(str(signed_twice)) # prints the signed data
def pkcs7_sign(data: Union[str, bytes, SignedValue], cert, key, options) ->SignedValue:
...
class SignedValue:
# contains a reference to data, key and cert.
def to_string(self):
# When called, does the actual signing.
# If `data` is another SignedValue, it signs with all the keys in the chain.
...
def to_bytes(self):
...
def __str__(self):
return self.to_string()
This keeps the syntax super simple for simple use cases (just sign data with one key), but allows it to scale for more complex usecases.
@WhyNotHugo While we can have multiple signers, it doesn't appear that we can have multiple different data sections with OpenSSL's APIs. So a secondary signing API would need to not take data. The options also cause re-signing to occur so the second set of options would actually override (some) of the first set. Additionally, we have no concept of an opaque PKCS7 type right now and I'm reluctant to create one.
I've pushed the new prototype builder API to the branch (https://github.com/reaperhulk/cryptography/tree/pkcs7-bindings) for people to test out. I was able to verify signatures generated via the binary path using PKCS7_verify
. The PEM path is not validating yet -- if anyone using that path can give me a bit of information about flags being passed to PKCS7_verify that'd be great.
Edit: I figured out how to validate a signature generated via SMIME_write_PKCS7
. When loading you need to let it parse the cleartext signed data into the bcont value of SMIME_read_PKCS7(BIO *bio, BIO **bcont);
. In the case of PKCS7_TEXT
signed data it turns out it signs: Content-Type: text/plain\r\n\r\nthe data we want to sign is this amazing sequence of bytes
when the original data is the data we want to sign is this amazing sequence of bytes
.
Usage:
from cryptography.hazmat.primitives import smime
options = [smime.SMIMEOptions.DetachedSignature, smime.SMIMEOptions.Text]
sig = (
smime.SMIMESignatureBuilder()
.add_data(data)
.add_signer(cert, key, hashes.SHA256())
.sign(smime.SMIMEEncoding.PEM, options)
)
Maybe just parameters?
(..).sign(smime.SMIMEEncoding.PEM, signature_detached=True, binary=False)
(..).sign(smime.SMIMEEncoding.PEM, signature_detached=True, binary=True)
I've also considered a similar API while working on this. We may end up using it in the end, but by project policy we try not to use defaults like that so it would require users to make choices for each arg. Additional args being added in the future would also be tricky given the optional backend
argument we append to the end, although that's not an insurmountable difficulty.
@reaperhulk I've checked the latest commit and now signed data successfully passed verification!
@decaz excellent!
@dirk-thomas @kyrofa @WhyNotHugo Does the current branch work for your cases?
If this branch does work I'll need to figure out how to write decent tests here. Probably going to need some ASN.1 parsing to ensure that future changes don't affect the resulting PKCS7 data structures.
I'd just like to add my use case to the chorus. I'm verifying that a detached SMIME signature is valid and then in another microservice, verifying that the signature was created with the certificate we expect. I don't think the second verification is affected by this, as it's not using the PCKS7 bindings, but I haven't been able to validate that because the first verification has to pass before it will reach the second.
# Validate the signature
def validate_signature(self) -> bool:
body = self.body.encode('utf-8')
signature = decode(self.signature.encode('utf-8'), 'base64')
data = load_pkcs7_data(FILETYPE_ASN1, signature)
output_buffer = _new_mem_buf()
return Binding.lib.PKCS7_verify(
data._pkcs7,
Binding.ffi.NULL,
Binding.ffi.NULL,
_new_mem_buf(body),
output_buffer,
Binding.lib.PKCS7_NOVERIFY
) == 1
# Validate that the signature was created using the certificate we have stored.
def validate_certificate(self) -> bool:
database = self._database
udid = self.udid
signature = self.signature
result = # Read the certificate from the database
is_valid = True
if result:
row, *_ = result
pem, ca = row
data = decode(signature.encode('utf-8'), 'base64')
content_info = ContentInfo.load(data)
signed: SignedData = content_info['content']
store = X509Store()
store.add_cert(load_certificate(FILETYPE_PEM, ca.encode('utf-8')))
pem_cert = load_certificate(FILETYPE_PEM, pem.encode('utf-8'))
store.add_cert(pem_cert)
try:
for c in signed['certificates']:
cert = load_certificate(FILETYPE_ASN1, c.chosen.dump())
X509StoreContext(store, cert).verify_certificate()
is_valid = pem_cert.to_cryptography() == cert.to_cryptography()
except X509StoreContextError as e:
is_valid = False
return is_valid
And the actual error I'm getting is this.
AttributeError: module 'lib' has no attribute 'PKCS7_verify'
It seems that the certificates need to be loaded into a different class than before, but I'm not quite sure how the key should be loaded.
This is my [adapted] code:
def create_embeded_pkcs7_signature(data: bytes, cert: str, key: str):
"""
Creates an embedded ("nodetached") PKCS7 signature.
This is equivalent to the output of::
openssl smime -sign -signer cert -inkey key -outform DER -nodetach < data
"""
pkey = crypto.load_privatekey(crypto.FILETYPE_PEM, key)
signcert = crypto.load_certificate(crypto.FILETYPE_PEM, cert)
options = [smime.SMIMEOptions.DetachedSignature, smime.SMIMEOptions.Text]
signed_data = (
smime.SMIMESignatureBuilder()
.add_data(data)
.add_signer(signcert, pkey, hashes.SHA256())
.sign(smime.SMIMEEncoding.PEM, options)
)
return signed_data
It raises:
def add_signer(self, certificate, key, hash_algorithm):
if not isinstance(hash_algorithm, hashes.HashAlgorithm):
raise TypeError("hash_algorithm must be a hash")
if not isinstance(certificate, x509.Certificate):
> raise TypeError("certificate must be a x509.Certificate")
E TypeError: certificate must be a x509.Certificate
I was going to try using load_pem_x509_certificate
, but I'm not sure how to load the key itself.
Oh, figured it out looking at the tests in that branch :facepalm:
This snippet is working for me now:
from cryptography import x509
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives import smime
from OpenSSL import crypto
def create_embeded_pkcs7_signature(data: bytes, cert: str, key: str):
"""Creates an embedded ("nodetached") PKCS7 signature.
This is equivalent to the output of::
openssl smime -sign -signer cert -inkey key -outform DER -nodetach < data
"""
pkey = serialization.load_pem_private_key(key, None)
signcert = x509.load_pem_x509_certificate(cert.encode())
signed_data = (
smime.SMIMESignatureBuilder()
.add_data(data)
.add_signer(signcert, pkey, hashes.SHA256())
.sign(smime.SMIMEEncoding.Binary, [smime.SMIMEOptions.Binary])
)
return signed_data
Some thoughts (from the POV of a consumer of the library, rather than a developer of it):
str
, but the other takes bytes
.@WhyNotHugo load_pem_private_key
is documented as requiring bytes-like
input (https://cryptography.io/en/latest/hazmat/primitives/asymmetric/serialization/#cryptography.hazmat.primitives.serialization.load_pem_private_key). We should probably be enforcing that with our typical bytes-like
checks actually, but if you pass a Python3 str
to it then you'll get TypeError: from_buffer() cannot return the address of a unicode object
. If there's a way to pass non-bytes to any of the APIs that I've missed I'm happy to fix it though!
@alliefitter since you're passing PKCS7_NOVERIFY
you're just doing basic sig sanity checks using the embedded cert? I think we'll re-add the PKCS7_verify
call as part of this even if I don't have a verification API figured out yet... As a straw man it might look something like:
def smime_verify(smime, certs, verify_chain, no_embedded_certs, option3, ...)
You're right, looks like my type annotations above were wrong, but mypy never picked it up. It's all bytes! :+1:
Branch works perfect for me!
@reaperhulk +1 from me, it's lovely. You can replace my entire function with this:
def _sign_bytes(cert, key, byte_string):
return smime.SMIMESignatureBuilder(
byte_string).add_signer(
cert, key, hashes.SHA256()
).sign(smime.SMIMEEncoding.PEM, (smime.SMIMEOptions.Text, smime.SMIMEOptions.DetachedSignature))
After far too much additional work the PR is now available for review at #5465. The API is pretty close to the testing branch, but take a look at the docs and let me know if there's anything I've missed.
Any idea when 3.2 (or 3.1.1?) will be released?
Is there a replacement for crypto._lib.PKCS7_sign now?
I am getting `AttributeError: module 'lib' has no attribute 'PKCS7_sign' when I try to use it. Please advise!!
This is my use case. I am trying to convert some php code for sending safari notifications to python. PHP has some built in methods which are not there for python and I need to run this.
` p12 = load_certificates(config)
with manifest.open("rb") as file: buffer_in = crypto._new_mem_buf(file.read())
PKCS7_DETACHED = 0x40 PKCS7_BINARY = 0x80
pkcs7 = crypto._lib.PKCS7_sign( p12.get_certificate()._x509, p12.get_privatekey()._pkey, crypto._ffi.NULL, buffer_in, PKCS7_BINARY | PKCS7_DETACHED)
buffer_out = crypto._new_mem_buf() crypto._lib.i2d_PKCS7_bio(buffer_out, pkcs7) der = crypto._bio_to_string(buffer_out) with signature.open("wb") as file: file.write(der) `
@gjohary Please read the read you are replying to, the answer to why this is happening, and how to fix it is widely explained with multiple examples. See also #5465, which adds a new API.
@alex @reaperhulk great work on adding sign
. I am testing this and can't seem to find a way to add an additional certificate. Is this missing?
The background is that my x509 certificate is from a commercial CA. But - as it's commonly the case - it is not signed directly by the Root CA (instead it's signed from an intermediate/sub CA). So when I send a message to a new contact I would definitely want to include my certificate AND the certificate of the intermediate CA.