pyca / cryptography

cryptography is a package designed to expose cryptographic primitives and recipes to Python developers.
https://cryptography.io
Other
6.64k stars 1.52k forks source link

PKCS7_sign error since version 3.1 #5433

Closed decaz closed 4 years ago

decaz commented 4 years ago
$ python -V
Python 3.8.5

$ pip list
Package      Version
------------ -------
cffi         1.14.2
cryptography 3.1
pip          20.2.2
pycparser    2.20
setuptools   49.6.0
six          1.15.0

$ python
Python 3.8.5 (default, Jul 22 2020, 17:45:49)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from cryptography.hazmat.bindings.openssl.binding import Binding as SSLBinding
>>> SSLBinding.lib.PKCS7_sign
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'lib' has no attribute 'PKCS7_sign'
>>>

$ pip install cryptography==3.0
...
Successfully installed cryptography-3.0

$ python
Python 3.8.5 (default, Jul 22 2020, 17:45:49)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from cryptography.hazmat.bindings.openssl.binding import Binding as SSLBinding
>>> SSLBinding.lib.PKCS7_sign
<built-in method PKCS7_sign of _cffi_backend.Lib object at 0x7f501012c630>
>>>
reaperhulk commented 4 years ago

One of the biggest challenges in this project is maintaining bindings which we do not consume either in this project or in pyOpenSSL. As part of 3.1 we removed a significant number of bindings that we don't use, and this was one of them. Could you explain the exact way you use this? I'd like to create a minimal PKCS7 signing interface in cryptography that you can use instead of directly calling the binding. That way you have a higher level API and also don't have the risk of us breaking you when we need to modify the bindings for whatever reason (which include wanting smaller binaries, compatibility with additional OpenSSL versions/forks, reducing RAM consumption when compiling, et cetera).

decaz commented 4 years ago

@reaperhulk I'm using it to sign data with S/MIME (refs #1621). Here is the function I use:

from cryptography.hazmat.backends.openssl.rsa import _RSAPrivateKey
from cryptography.hazmat.backends.openssl.x509 import _Certificate
from cryptography.hazmat.bindings.openssl.binding import Binding as SSLBinding

def sign_pkcs7(data: bytes, cert: _Certificate, pkey: _RSAPrivateKey) -> bytes:
    flags = SSLBinding.lib.PKCS7_BINARY | SSLBinding.lib.PKCS7_DETACHED
    bio_in = SSLBinding.lib.BIO_new_mem_buf(data, len(data))
    try:
        pkcs7 = SSLBinding.lib.PKCS7_sign(
            cert._x509, pkey._evp_pkey, SSLBinding.ffi.NULL, bio_in, flags
        )
    finally:
        SSLBinding.lib.BIO_free(bio_in)
    bio_in = SSLBinding.lib.BIO_new_mem_buf(data, len(data))
    try:
        bio_out = SSLBinding.lib.BIO_new(SSLBinding.lib.BIO_s_mem())
        try:
            SSLBinding.lib.SMIME_write_PKCS7(bio_out, pkcs7, bio_in, flags)
            result_buffer = SSLBinding.ffi.new('char**')
            buffer_length = SSLBinding.lib.BIO_get_mem_data(
                bio_out, result_buffer
            )
            output = SSLBinding.ffi.buffer(result_buffer[0], buffer_length)[:]
        finally:
            SSLBinding.lib.BIO_free(bio_out)
    finally:
        SSLBinding.lib.BIO_free(bio_in)
    return output
reaperhulk commented 4 years ago

So you're generating a detached binary signature from some bytes, a cert, and a private key? Are there other common flags we need to consider supporting in an API? e.g. what about attached sigs and non-binary (does it PEM encode then?).

decaz commented 4 years ago

Yes, it's detached binary signature from some bytes, certificate and private key which I'm getting by calling load_pem_x509_certificate and load_pem_private_key functions respectively.

About flags, I don't know anything about other of them but I think it would be nice to have them as an API parameter. Maybe someone in the community will need different flags.

dirk-thomas commented 4 years ago

We ran into the same problem with version 3.1. The snippet which uses PKCS7 related API: https://github.com/ros2/sros2/blob/5d57584af3db0a5542ccfbf3d9ee43740ecbcca4/sros2/sros2/api/_utilities.py#L135-L171

reaperhulk commented 4 years ago

Okay, thanks for the information. I want to think a bit about API here, but I'm leaning towards a minimal initial API with pkcs7_detached_sign. We can add more methods (like verification!) as consumers express their needs. We'll also need to support the NO_BINARY flag (for binary data that should not be converted to MIME canonical format) as well as PKCS7_TEXT to set the MIME headers when signing. This could look like a list of enums passed as an argument. Finally, I want to support setting the hash since the default is SHA1 (this can be done via an obnoxious dance within OpenSSL using a few additional APIs). Many users may require SHA1 and that's fine, but I refuse to build an API that only allows a bad hash function. ๐Ÿ˜„

The ultimate goal here is to support what we need while leaking as little OpenSSL specific behavior as possible through our abstraction.

kyrofa commented 4 years ago

One of the biggest challenges in this project is maintaining bindings which we do not consume either in this project or in pyOpenSSL. As part of 3.1 we removed a significant number of bindings that we don't use, and this was one of them. [...] I'd like to create a minimal PKCS7 signing interface in cryptography that you can use instead of directly calling the binding. [...] The ultimate goal here is to support what we need while leaking as little OpenSSL specific behavior as possible through our abstraction.

That makes perfect sense, although depending on the timeline of the new API, would it make sense to add the binding back until the API is ready? Or do you think we'll be able to get this relatively quickly?

reaperhulk commented 4 years ago

As currently scoped this doesnโ€™t seem too difficult. Iโ€™m going to put it in our next milestone.

kyrofa commented 4 years ago

Glad to hear it, would love to help test!

reaperhulk commented 4 years ago

Question for everyone: Both of these code samples pass the data in both PKCS7_sign and SMIME_write_PKCS7. This appears to create an additional pkcs7-data structure. Is this intended?

decaz commented 4 years ago

@reaperhulk I didn't quite understand the question. Did you mean if we need this intermediate data (pkcs7 variable) at the output in the form of some kind of structure?

reaperhulk commented 4 years ago

You're passing the data you want to sign twice -- in both PKCS7_sign itself and also in SMIME_write_PKCS7. From some quick experiments this generates a significantly different final output than just passing it one or the other location. It's unclear to me why you are passing it twice though. Is there a specific reason or did you just write some code and verify that it worked on the other end?

I'm reluctant to write something that generates this specific structure without understanding the cases where you do and don't want to do this.

decaz commented 4 years ago

@reaperhulk the function I use is not written by me, - I found it by reference within #1621. After remembering I see it was https://github.com/ros2/sros2/pull/129 and there is comment which describes why there is a second buffer was used: # PKCS7_sign consumes the buffer; allocate a new one again to get it into the final document.

kyrofa commented 4 years ago

It's been a while since I wrote the function @decaz is referencing, but if I recall it made the difference between writing only the sig to the file versus writing the file + sig. I won't claim much expertise though, that function took me forever to get working properly (there's a reason I want to see this properly exposed via a sane API), so if you see something wrong please do shout.

ruffsl commented 4 years ago

difference between writing only the sig to the file versus writing the file + sig.

That's my recollection as well. Without this finagling, the content being signed remains absent within the enclosing signature.

reaperhulk commented 4 years ago

Okay, so passing detached on the first call gives you a signature and then the data gets embedded in the second call -- or at least that's the idea. So this isn't a detached SMIME output -- it embeds the data (which makes sense given the pkcs7-data structure I saw). I will investigate a bit more, see if there are multiple permutations of these APIs that can produce identical output (probably! OpenSSL's APIs here are just terrible), and think a bit more about what this API should look like.

ruffsl commented 4 years ago

I think this might be the relevant example of this in the openssl smime demos:

reaperhulk commented 4 years ago

I've pushed a branch (https://github.com/reaperhulk/cryptography/tree/pkcs7-bindings) that has a new module smime with a detached_smime_sign function. There are some missing sanity checks, some unanswered questions about what smime options should be allowed (currently binary and text), but you can see example usage in test_smime.py. If anyone here can test this and determine whether or not it works for their use case that'd be great.

decaz commented 4 years ago

@reaperhulk I tested it and signing is working. But verification is not on my side and it failed. I guess the reason is that md argument is not NULL as it was in original snippet. Can you make hash_algorithm parameter to be optional?

reaperhulk commented 4 years ago

@decaz if you pass hashes.SHA1() it should be the same as NULL (since SHA1 is chosen by default). If that doesn't work we can try NULL, but I suspect I've got other bugs ๐Ÿ˜„

decaz commented 4 years ago

@reaperhulk I tested with hash_algorithm=hashes.SHA1.

reaperhulk commented 4 years ago

Hmm, okay. I wonder if PKCS7_sign_add_signer can't be made to work exactly like a single PKCS7_sign call...

I've pushed a version that passes NULL to PKCS7_sign_add_signer no matter what (which of course makes it SHA1 no matter what you pass). I doubt this will work, but let's eliminate it before I need to figure out how to build a verifier to test more.

decaz commented 4 years ago

@reaperhulk I tried again with the latest commit and unfortunately the result is the same =/

reaperhulk commented 4 years ago

Thanks for the testing, I'll look into this more deeply soon.

ruffsl commented 4 years ago

Might be worth adding the smime verification function so the python tests could close the loop with a end-to-end sign/verify test.

WhyNotHugo commented 4 years ago

I've also been using these unsupported functions (sorry!). I rely on them for a signature that I need to authenticate with the Argentinain tax agency.

My signing function is quite self contained though:

PKCS7_NOSIGS = 0x4  # defined in pkcs7.h

def create_embeded_pkcs7_signature(data: bytes, cert: str, key: str):
    """
    Creates an embeded ("nodetached") pkcs7 signature.
    This is equivalent to the output of::
        openssl smime -sign -signer cert -inkey key -outform DER -nodetach < data
    """

    assert isinstance(data, bytes)
    assert isinstance(cert, str)

    try:
        pkey = crypto.load_privatekey(crypto.FILETYPE_PEM, key)
        signcert = crypto.load_certificate(crypto.FILETYPE_PEM, cert)
    except crypto.Error as e:
        raise exceptions.CorruptCertificate from e

    bio_in = crypto._new_mem_buf(data)
    pkcs7 = crypto._lib.PKCS7_sign(
        signcert._x509, pkey._pkey, crypto._ffi.NULL, bio_in, PKCS7_NOSIGS
    )
    bio_out = crypto._new_mem_buf()
    crypto._lib.i2d_PKCS7_bio(bio_out, pkcs7)
    signed_data = crypto._bio_to_string(bio_out)

    return signed_data

The notable difference is that, apparently, no-one else is using PKCS7_NOSIGS. Aside from that, it would seem that branch pkcs7-bindings would work for me.

BTW: I really appreciate your effort here!

reaperhulk commented 4 years ago

@WhyNotHugo Thanks for providing your use case! I'm intrigued by PKCS7_NOSIGS. Looking at the source code it doesn't look like PKCS7_sign does anything with that flag; it only appears relevant within PKCS7_verify (where it prevents checking signatures on the data). Do you see failures if you don't pass PKCS7_NOSIGS?

WhyNotHugo commented 4 years ago

Re replaced PKCS7_NOSIGS with 0 and all my tests pass (including live integration tests). :man_facepalming:

I wrote this code many years ago, I've no idea how it ended up that way. Thanks for the feedback! :)

reaperhulk commented 4 years ago

@WhyNotHugo ๐Ÿ˜„ OpenSSL's flags are baffling in the best of times so no surprise there!

I've been working on this some more and I think I may re-architect the branch pretty significantly. Instead of a function I think this is actually a builder (like the X509 builders). So it will look more like:

builder = smime.SMIMEBuilder()
sig = builder.add_signer(
    certificate, key, hashes.SHA256()
).add_signer(
    certificate2, key2, hashes.SHA1()
).add_data(
    my_data
).sign(encoding, options)

Options will be a list of enum elements tentatively defined as:

class SMIMEOptions(Enum):
    Text
    Binary
    DetachedSignature
    NoCertificates

This model allows for multiple signers, different hash functions per signer, et cetera. Of course, we still have the problem of getting it right. These APIs allow for entirely too many permutations. For both my own notes and some context for people joining this thread:

As you might imagine, I have some concerns about potential issues across all these different flags and finalization paths. ๐Ÿ˜„

WhyNotHugo commented 4 years ago

What about having a function that returns a signed object:

signed = pkcs7_sign(data, cert, key, options)
signed_twice = pkcs7_sign(signed, cert2, key2, options)
print(str(signed_twice))  # prints the signed data
def pkcs7_sign(data: Union[str, bytes, SignedValue], cert, key, options) ->SignedValue:
    ...

class SignedValue:
    # contains a reference to data, key and cert.
    def to_string(self):
        # When called, does the actual signing.
        # If `data` is another SignedValue, it signs with all the keys in the chain.
        ...
    def to_bytes(self):
        ...

    def __str__(self):
       return self.to_string()

This keeps the syntax super simple for simple use cases (just sign data with one key), but allows it to scale for more complex usecases.

reaperhulk commented 4 years ago

@WhyNotHugo While we can have multiple signers, it doesn't appear that we can have multiple different data sections with OpenSSL's APIs. So a secondary signing API would need to not take data. The options also cause re-signing to occur so the second set of options would actually override (some) of the first set. Additionally, we have no concept of an opaque PKCS7 type right now and I'm reluctant to create one.

reaperhulk commented 4 years ago

I've pushed the new prototype builder API to the branch (https://github.com/reaperhulk/cryptography/tree/pkcs7-bindings) for people to test out. I was able to verify signatures generated via the binary path using PKCS7_verify. The PEM path is not validating yet -- if anyone using that path can give me a bit of information about flags being passed to PKCS7_verify that'd be great.

Edit: I figured out how to validate a signature generated via SMIME_write_PKCS7. When loading you need to let it parse the cleartext signed data into the bcont value of SMIME_read_PKCS7(BIO *bio, BIO **bcont);. In the case of PKCS7_TEXT signed data it turns out it signs: Content-Type: text/plain\r\n\r\nthe data we want to sign is this amazing sequence of bytes when the original data is the data we want to sign is this amazing sequence of bytes.

reaperhulk commented 4 years ago

Usage:

    from cryptography.hazmat.primitives import smime
    options = [smime.SMIMEOptions.DetachedSignature, smime.SMIMEOptions.Text]
    sig = (
        smime.SMIMESignatureBuilder()
        .add_data(data)
        .add_signer(cert, key, hashes.SHA256())
        .sign(smime.SMIMEEncoding.PEM, options)
    )
WhyNotHugo commented 4 years ago

Maybe just parameters?

(..).sign(smime.SMIMEEncoding.PEM, signature_detached=True, binary=False)

(..).sign(smime.SMIMEEncoding.PEM, signature_detached=True, binary=True)
reaperhulk commented 4 years ago

I've also considered a similar API while working on this. We may end up using it in the end, but by project policy we try not to use defaults like that so it would require users to make choices for each arg. Additional args being added in the future would also be tricky given the optional backend argument we append to the end, although that's not an insurmountable difficulty.

decaz commented 4 years ago

@reaperhulk I've checked the latest commit and now signed data successfully passed verification!

reaperhulk commented 4 years ago

@decaz excellent!

@dirk-thomas @kyrofa @WhyNotHugo Does the current branch work for your cases?

If this branch does work I'll need to figure out how to write decent tests here. Probably going to need some ASN.1 parsing to ensure that future changes don't affect the resulting PKCS7 data structures.

alliefitter commented 4 years ago

I'd just like to add my use case to the chorus. I'm verifying that a detached SMIME signature is valid and then in another microservice, verifying that the signature was created with the certificate we expect. I don't think the second verification is affected by this, as it's not using the PCKS7 bindings, but I haven't been able to validate that because the first verification has to pass before it will reach the second.

# Validate the signature
def validate_signature(self) -> bool:
    body = self.body.encode('utf-8')
    signature = decode(self.signature.encode('utf-8'), 'base64')
    data = load_pkcs7_data(FILETYPE_ASN1, signature)
    output_buffer = _new_mem_buf()
    return Binding.lib.PKCS7_verify(
        data._pkcs7,
        Binding.ffi.NULL,
        Binding.ffi.NULL,
        _new_mem_buf(body),
        output_buffer,
        Binding.lib.PKCS7_NOVERIFY
    ) == 1
# Validate that the signature was created using the certificate we have stored.
def validate_certificate(self) -> bool:
    database = self._database
    udid = self.udid
    signature = self.signature
    result = # Read the certificate from the database
    is_valid = True
    if result:        
        row, *_ = result
        pem, ca = row
        data = decode(signature.encode('utf-8'), 'base64')
        content_info = ContentInfo.load(data)
        signed: SignedData = content_info['content']
        store = X509Store()
        store.add_cert(load_certificate(FILETYPE_PEM, ca.encode('utf-8')))
        pem_cert = load_certificate(FILETYPE_PEM, pem.encode('utf-8'))
        store.add_cert(pem_cert)
        try:
            for c in signed['certificates']:
                cert = load_certificate(FILETYPE_ASN1, c.chosen.dump())
                X509StoreContext(store, cert).verify_certificate()
                is_valid = pem_cert.to_cryptography() == cert.to_cryptography()
        except X509StoreContextError as e:
            is_valid = False
    return is_valid

And the actual error I'm getting is this.

AttributeError: module 'lib' has no attribute 'PKCS7_verify'
WhyNotHugo commented 4 years ago

It seems that the certificates need to be loaded into a different class than before, but I'm not quite sure how the key should be loaded.

This is my [adapted] code:

def create_embeded_pkcs7_signature(data: bytes, cert: str, key: str):
    """
    Creates an embedded ("nodetached") PKCS7 signature.

    This is equivalent to the output of::

        openssl smime -sign -signer cert -inkey key -outform DER -nodetach < data
    """

    pkey = crypto.load_privatekey(crypto.FILETYPE_PEM, key)
    signcert = crypto.load_certificate(crypto.FILETYPE_PEM, cert)

    options = [smime.SMIMEOptions.DetachedSignature, smime.SMIMEOptions.Text]
    signed_data = (
        smime.SMIMESignatureBuilder()
        .add_data(data)
        .add_signer(signcert, pkey, hashes.SHA256())
        .sign(smime.SMIMEEncoding.PEM, options)
    )
    return signed_data

It raises:

    def add_signer(self, certificate, key, hash_algorithm):
        if not isinstance(hash_algorithm, hashes.HashAlgorithm):
            raise TypeError("hash_algorithm must be a hash")
        if not isinstance(certificate, x509.Certificate):
>           raise TypeError("certificate must be a x509.Certificate")
E           TypeError: certificate must be a x509.Certificate

I was going to try using load_pem_x509_certificate, but I'm not sure how to load the key itself.

WhyNotHugo commented 4 years ago

Oh, figured it out looking at the tests in that branch :facepalm:

This snippet is working for me now:

from cryptography import x509
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives import smime
from OpenSSL import crypto

def create_embeded_pkcs7_signature(data: bytes, cert: str, key: str):
    """Creates an embedded ("nodetached") PKCS7 signature.

    This is equivalent to the output of::

        openssl smime -sign -signer cert -inkey key -outform DER -nodetach < data
    """

    pkey = serialization.load_pem_private_key(key, None)
    signcert = x509.load_pem_x509_certificate(cert.encode())

    signed_data = (
        smime.SMIMESignatureBuilder()
        .add_data(data)
        .add_signer(signcert, pkey, hashes.SHA256())
        .sign(smime.SMIMEEncoding.Binary, [smime.SMIMEOptions.Binary])

    )
    return signed_data

Some thoughts (from the POV of a consumer of the library, rather than a developer of it):

reaperhulk commented 4 years ago

@WhyNotHugo load_pem_private_key is documented as requiring bytes-like input (https://cryptography.io/en/latest/hazmat/primitives/asymmetric/serialization/#cryptography.hazmat.primitives.serialization.load_pem_private_key). We should probably be enforcing that with our typical bytes-like checks actually, but if you pass a Python3 str to it then you'll get TypeError: from_buffer() cannot return the address of a unicode object. If there's a way to pass non-bytes to any of the APIs that I've missed I'm happy to fix it though!

reaperhulk commented 4 years ago

@alliefitter since you're passing PKCS7_NOVERIFY you're just doing basic sig sanity checks using the embedded cert? I think we'll re-add the PKCS7_verify call as part of this even if I don't have a verification API figured out yet... As a straw man it might look something like:

def smime_verify(smime, certs, verify_chain, no_embedded_certs, option3, ...)
WhyNotHugo commented 4 years ago

You're right, looks like my type annotations above were wrong, but mypy never picked it up. It's all bytes! :+1:

Branch works perfect for me!

kyrofa commented 4 years ago

@reaperhulk +1 from me, it's lovely. You can replace my entire function with this:

def _sign_bytes(cert, key, byte_string):
    return smime.SMIMESignatureBuilder(
        byte_string).add_signer(
            cert, key, hashes.SHA256()
        ).sign(smime.SMIMEEncoding.PEM, (smime.SMIMEOptions.Text, smime.SMIMEOptions.DetachedSignature))
reaperhulk commented 4 years ago

After far too much additional work the PR is now available for review at #5465. The API is pretty close to the testing branch, but take a look at the docs and let me know if there's anything I've missed.

WhyNotHugo commented 4 years ago

Any idea when 3.2 (or 3.1.1?) will be released?

reaperhulk commented 4 years ago

5471 and #5472 mean we'll almost certainly be updating the API a bit before shipping. No timeline as yet, but end of October is possible. For those who encountered issues due to the binding removal the workaround for now is to stay at 3.0, sorry!

gjohary commented 4 years ago

Is there a replacement for crypto._lib.PKCS7_sign now?

I am getting `AttributeError: module 'lib' has no attribute 'PKCS7_sign' when I try to use it. Please advise!!

This is my use case. I am trying to convert some php code for sending safari notifications to python. PHP has some built in methods which are not there for python and I need to run this.

` p12 = load_certificates(config)

Create the memory buffer for the signature

with manifest.open("rb") as file: buffer_in = crypto._new_mem_buf(file.read())

Grab the flags from source

PKCS7_DETACHED = 0x40 PKCS7_BINARY = 0x80

Sign the actual file

pkcs7 = crypto._lib.PKCS7_sign( p12.get_certificate()._x509, p12.get_privatekey()._pkey, crypto._ffi.NULL, buffer_in, PKCS7_BINARY | PKCS7_DETACHED)

Write out the result

buffer_out = crypto._new_mem_buf() crypto._lib.i2d_PKCS7_bio(buffer_out, pkcs7) der = crypto._bio_to_string(buffer_out) with signature.open("wb") as file: file.write(der) `

WhyNotHugo commented 4 years ago

@gjohary Please read the read you are replying to, the answer to why this is happening, and how to fix it is widely explained with multiple examples. See also #5465, which adds a new API.

frennkie commented 4 years ago

@alex @reaperhulk great work on adding sign. I am testing this and can't seem to find a way to add an additional certificate. Is this missing?

The background is that my x509 certificate is from a commercial CA. But - as it's commonly the case - it is not signed directly by the Root CA (instead it's signed from an intermediate/sub CA). So when I send a message to a new contact I would definitely want to include my certificate AND the certificate of the intermediate CA.