wbond / asn1crypto

Python ASN.1 library with a focus on performance and a pythonic API
MIT License

External signing of CSR's and Certificates #6

Open doc-hex opened 8 years ago

doc-hex commented 8 years ago

I would like to keep my private key in an external HSM. However, there is no way to use CSRBuilder or CertificateBuilder.build without providing a private key, which is then used by oscrypto. It would be great if a function could be provided (either to build() or as a property of the object) that takes the data to be signed (i.e. the bytes of the serialized cert/csr) and returns just the bytes of the signature. That custom function would probably know how to hash the data, but perhaps the hash algorithm would be an argument as well.

BTW: Love your library, first class code!

wbond commented 8 years ago

This sounds pretty reasonable to me. It could also be useful if someone wants to use these projects with other crypto libraries (such as https://github.com/pyca/cryptography).

That said, I don't currently have any experience dealing with PKCS#11 or hardware devices. Mind working with me through spec'ing out a callback API for creating and verifying signatures? If you have experience in this area, I would potentially also be interested in your advice related to hardware devices that may be of reasonable cost for me to purchase for development and testing. (Mostly to get more familiar with the tech and common interactions with them.)

It seems this general issue would probably affect:

certbuilder, csrbuilder, crlbuilder and ocspbuilder would need functionality where the signature was generated for a byte string.

certvalidator would need the ability to have a signature verified.

You raise a good point about possibly needing the data to be pre-hashed.

Perhaps the .build() methods of the various builders could allow the user to pass a unicode string of the signature type ("rsa", "dsa", "ecdsa") and a callable that would have a signature such as:

def make_sig(hash_algorithm, data):
    """
    Create a signature

    :param hash_algorithm:
        A unicode string of "sha1", "sha256", "sha384", "sha512" indicating what hash
        algorithm should be used to hash the data

    :param data:
        A byte string of the data to hash and sign

    :return:
        A byte string of the signature
    """

Originally I was thinking the callback should return the signature type, but that is embedded in the data being signed, so it needs to be known before the signature is created.
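As a rough illustration of that callback shape (not an API that exists in any of the builders), the sketch below wraps a device-specific signing routine into a make_sig(hash_algorithm, data) callable. The names make_external_signer and sign_digest are hypothetical, and fake_device_sign is only a stand-in that records what it was asked to sign.

```python
import hashlib

SUPPORTED_HASHES = ("sha1", "sha256", "sha384", "sha512")


def make_external_signer(sign_digest):
    """
    Build a make_sig(hash_algorithm, data) callable.

    sign_digest is a hypothetical device-specific callable that takes
    (hash_algorithm, digest_bytes) and returns the signature bytes.
    """
    def make_sig(hash_algorithm, data):
        if hash_algorithm not in SUPPORTED_HASHES:
            raise ValueError("unsupported hash algorithm: %r" % (hash_algorithm,))
        # Hash locally so that only the digest has to leave this machine
        digest = hashlib.new(hash_algorithm, data).digest()
        return sign_digest(hash_algorithm, digest)
    return make_sig


# Stand-in for an HSM or remote service: it records each request and
# returns placeholder bytes, NOT a real signature.
requests_seen = []


def fake_device_sign(hash_algorithm, digest):
    requests_seen.append((hash_algorithm, digest))
    return b"\x00" * 8


make_sig = make_external_signer(fake_device_sign)
signature = make_sig("sha256", b"bytes of the serialized cert/csr")
```

Whether the hashing happens inside the callback (as here) or in the builder is exactly the open design question; some devices want the raw data instead of a digest, in which case the wrapper would pass data through untouched.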

doc-hex commented 8 years ago

Maybe this new feature belongs more as a property of private keys. So to use this new feature, you'd derive a new subclass of private key, rather than pass a function in. You'd have to refactor the private key object being used now, so that message signing is a member function on that.

It's possible to make a software test jig for ECDSA using the building blocks found in python-ecdsa. Sorry, I don't know enough about PKCS#11 and similar APIs to offer any good answers about prototyping tools. I'm sure a "PGP card" can do it, but they are a bear to work with and rather hard to get.

wbond commented 8 years ago

You originally mentioned having an HSM that you use for your private key. How do you interface with that – is there a Python package for it? Is it like a Yubikey?

I mentioned PKCS#11 since my understanding is that was the standard way to interact with hardware-based crypto engines. Beyond that I don't really know much of anything about this area.

It is possible that we could create an abstract base class that oscrypto.asymmetric.PrivateKey implements for signing and verification. Then it would be possible to create alternative implementations. That said, it would seem weird to have users implement an interface that is part of oscrypto when none of the oscrypto functionality would be used. Perhaps instead it makes sense to provide an "interface" package that oscrypto and other shim packages could implement.

doc-hex commented 8 years ago

It might not actually be an HSM, just a situation where I don't trust the cert machine with private keys. There is no package to talk to it at this point, and it won't be general purpose.

Yes, PKCS#11 is the API standard (gag) for HSMs. It's terrible and that's all I know too.

It does seem weird to put that ABC in oscrypto but that's not the end of the world, since that package remains a dependency for the others. An alternative would be to define a mixin that specifies the calls you need to implement to be a private-key signing type object. That's basically like an interface spec in other languages when the mixin is abstract and has no implementation. But you still have the same packaging problem, because you'd want to reference the mixin from both packages.
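To make the abstract base class / mixin idea a bit more concrete, here is a minimal sketch using the standard library's abc module. SigningKey, FakeRemoteKey and build_signature are hypothetical names made up for illustration; nothing like this exists in oscrypto or the builders as far as this thread shows.

```python
import abc
import hashlib


class SigningKey(abc.ABC):
    """Hypothetical interface: anything that can sign bytes for a private key."""

    @property
    @abc.abstractmethod
    def algorithm(self):
        """Return "rsa", "dsa" or "ec"."""

    @abc.abstractmethod
    def sign(self, data, hash_algorithm):
        """Return the signature over data as a byte string."""


class FakeRemoteKey(SigningKey):
    """Stand-in implementation; returns the digest instead of a real signature."""

    @property
    def algorithm(self):
        return "rsa"

    def sign(self, data, hash_algorithm):
        # A real implementation would hand this off to the device holding the key
        return hashlib.new(hash_algorithm, data).digest()


def build_signature(key, tbs_bytes, hash_algorithm="sha256"):
    """What a builder could do: rely only on the interface, not on oscrypto."""
    if not isinstance(key, SigningKey):
        raise TypeError("key must implement SigningKey")
    return key.sign(tbs_bytes, hash_algorithm)


placeholder = build_signature(FakeRemoteKey(), b"to-be-signed bytes")
```

The packaging question stays the same: the class has to live somewhere both the builders and the key implementations can import it from.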

joernheissler commented 8 years ago

The OpenPGP card is not hard to buy and it's also not hard to implement its interface in pure python. It's available from http://shop.kernelconcepts.de/ or https://shop.nitrokey.com/shop/product/nitrokey-pro-3 and probably other sources. The spec for its protocol is here: https://g10code.com/docs/openpgp-card-2.1.pdf

There's already http://pyscard.sourceforge.net/ so you can directly send the protocol commands from python without having to know anything about the underlying reader hardware etc. Implementing the protocol isn't very hard, really. A fully featured implementation needs no more than 500 LoC.

So it is possible to avoid PKCS#11. But the problem is, of course, that one needs to write an implementation for every kind of HSM. The specs for the OpenPGP smartcard are open. But what about the super expensive commercial HSMs? Are those open too?

As an example, OpenPGP Card 2.1 only supports RSA and only PKCS#1 v1.5 padding. It needs as input the DER encoding of a DigestInfo structure and returns the byte encoding of the signature (same size as the RSA modulus). Other HSMs may support ECDSA or PSS or other signature schemes.
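For reference, the DER-encoded DigestInfo mentioned above can be assembled without an ASN.1 library, because for a fixed hash algorithm the encoding is a constant prefix followed by the digest. The sketch below uses the prefixes listed in the notes to RFC 8017, section 9.2; the helper name digest_info is made up here, and sending the result to a card is left out entirely.

```python
import hashlib

# Constant DER prefixes (AlgorithmIdentifier plus OCTET STRING header)
# as listed in the notes to RFC 8017, section 9.2.
DIGEST_INFO_PREFIXES = {
    "sha1": bytes.fromhex("30 21 30 09 06 05 2b 0e 03 02 1a 05 00 04 14"),
    "sha256": bytes.fromhex(
        "30 31 30 0d 06 09 60 86 48 01 65 03 04 02 01 05 00 04 20"
    ),
    "sha384": bytes.fromhex(
        "30 41 30 0d 06 09 60 86 48 01 65 03 04 02 02 05 00 04 30"
    ),
    "sha512": bytes.fromhex(
        "30 51 30 0d 06 09 60 86 48 01 65 03 04 02 03 05 00 04 40"
    ),
}


def digest_info(hash_algorithm, data):
    """Return the DER encoding of DigestInfo for data (hypothetical helper)."""
    prefix = DIGEST_INFO_PREFIXES[hash_algorithm]
    return prefix + hashlib.new(hash_algorithm, data).digest()


encoded = digest_info("sha256", b"bytes to be signed")
```

A device that only does raw PKCS#1 v1.5 signing would take these bytes as its input; a device that hashes internally would not need them.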

wbond commented 8 years ago

I ended up buying a yubikey and a smart card device back in late 2015. Unfortunately I haven't really had any "free" time to pursue investigating this whole area much further.

I imagine if this were to move forward, someone with experience in Python and various security hardware devices would need to start contributing to the project.

Perhaps eventually I may try to add support for some more "popular" hardware devices that are common among developers (Yubikey, etc), but that is unlikely to happen in the next six months with my current schedule.

elad commented 7 years ago

Azure has Key Vault, which is like HSM as a service. It has a RESTful API, and provides primitives for creating keys and signing blobs with those keys, among others.

Signing data looks like this:

import requests

# access_token was procured via standard API call
headers = {'Authorization': 'Bearer {}'.format(access_token)}

# API payload
data = {'alg': 'RS256', 'value': some_base64_blob}

# API call, 1 is the key version
response = requests.post(
    'https://my-key-vault.vault.azure.net/keys/my-key/1/sign?api-version=2016-10-01',
    json=data,
    headers=headers,
)

So it seems to me that the intuition for the abstraction expressed above in make_sig() is correct. I also see that similar logic exists, albeit internally, in ocspbuilder/__init__.py:

if responder_private_key.algorithm == 'rsa':
    sign_func = asymmetric.rsa_pkcs1v15_sign
elif responder_private_key.algorithm == 'dsa':
    sign_func = asymmetric.dsa_sign
elif responder_private_key.algorithm == 'ec':
    sign_func = asymmetric.ecdsa_sign

if not is_oscrypto:
    responder_private_key = asymmetric.load_private_key(responder_private_key)
signature_bytes = sign_func(responder_private_key, response_data.dump(), self._hash_algo)

It also makes sense to me that this is a property of the key rather than just a hook. When working with an HSM, it's possible that the entire flow changes a bit, so not just sign, but even the initial load key primitive is different/nonexistent.

joernheissler commented 6 years ago

I started writing an abstraction/interface for private keys (rsa, dsa, ec, ed, …) and hash functions. The interfaces won't have artificial limits on what can be done, even if some operations are insecure or silly.

The actual crypto operations are implemented by optional backend libraries; there would be a small wrapper for each backend to make it compatible with my interface. There will be (optional) implementations for various paddings (e.g. PSS) in case a backend only supports a more basic operation.

It's supposed to be kind of like PKCS#11, but all python instead of C. Using "asyncio" where it makes sense and other modern python features like "typing".

Some backends I can think of:

There would only be a few hard dependencies on other python packages, e.g. attrs and asn1crypto (for PKCS#1 v1.5).

Other libraries could use mine to create digital signatures. E.g. build a TBSCertificate with asn1crypto, dump the DER, sign it, build the Certificate.

I'm not sure yet if this will be useful to others, it's currently more a plaything for myself and not yet ready to be made public.

If someone else worked on something similar in the meantime, please tell me so I can take a look before inventing my own wheel.

joernheissler commented 5 years ago

I made some progress on my idea (see previous post): https://github.com/joernheissler/cryptokey It's still in an early phase and isn't too useful yet when compared to existing libraries. But signing with RSA works. More backends need to be added, better documentation, etc.

vicpara commented 3 years ago

Is there any update on this? For the short term, is there any bespoke code as a workaround? Thanks.

joernheissler commented 3 years ago

asn1crypto doesn't do any signing. Perhaps this ticket should be closed?

Changes need to be done in whatever software you're using, e.g. CSRBuilder. Replace the signature operation with your own code to sign the data some other way. It shouldn't be more than a few changed lines.

I wish I had more time to work on https://github.com/joernheissler/cryptokey. It kind of works and I already use it, but it's unstable and will see at least one bigger API change. Also, my lib is using async everywhere because hardware tokens and cloud HSMs are, by nature, async. CSRBuilder & co are sync.
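One common way to bridge that async/sync gap is to wrap the coroutine in a blocking function. This is a minimal sketch, not how cryptokey or CSRBuilder work: sign_on_token is a stand-in that only pretends to talk to a device, and sign_blocking is a hypothetical name.

```python
import asyncio
import hashlib


async def sign_on_token(data):
    """Stand-in for an async call to a token or cloud HSM; not a real signature."""
    await asyncio.sleep(0)  # pretend to wait on I/O
    return hashlib.sha256(data).digest()


def sign_blocking(data):
    """Synchronous wrapper that sync code could call."""
    # asyncio.run() starts a fresh event loop and raises RuntimeError if it
    # is called from code that is already running inside an event loop.
    return asyncio.run(sign_on_token(data))


placeholder_signature = sign_blocking(b"to-be-signed bytes")
```

This only works from plain synchronous code; inside an already running event loop the caller would have to await the coroutine directly instead.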

If you tell me more about the hardware/cloud token you're using, I'll add it to my todo list and might eventually implement a binding to it in pure-python.

vicpara commented 3 years ago

I understand. Thanks for your quick reply. I am using Google Cloud Platform's HSM / KMS hosted key pair for document signing. As it stands now they don't provide any CSR operations for the keys generated inside their HSM/KMS.

I already changed the builder to inject an external signature but I am not very confident there will not be any discrepancies when the CA looks at it or even worse, after they sign it and I need to start using it.

joernheissler commented 3 years ago

> I already changed the builder to inject an external signature but I am not very confident there will not be any discrepancies when the CA looks at it or even worse, after they sign it and I need to start using it.

A CA cannot know how a signature was created. Either the signature is correct or it's not. You can check with openssl req -verify -in foo.csr

> As it stands now they don't provide any CSR operations for the keys generated inside their HSM/KMS.

Of course not! All they supply is a single sign operation: https://cloud.google.com/kms/docs/reference/rest/v1/projects.locations.keyRings.cryptoKeys.cryptoKeyVersions/asymmetricSign

You'll have to pass in the hash of your data and they return a signature. GCP has no way of knowing if this is for a CSR or anything else.
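A small sketch of the "pass in the hash" step, using only the standard library. The JSON field names (digest, sha256) follow my reading of the REST reference linked above and should be checked against it; asymmetric_sign_body is a made-up helper, and the authenticated HTTP call itself is left out.

```python
import base64
import hashlib
import json


def asymmetric_sign_body(data):
    """Build a request body for a SHA-256 based key (hypothetical helper)."""
    digest = hashlib.sha256(data).digest()
    # Assumption: bytes fields in the JSON body are base64-encoded strings
    return {"digest": {"sha256": base64.b64encode(digest).decode("ascii")}}


# For a CSR, the data to hash would be the DER bytes of the
# CertificationRequestInfo; this literal is only a placeholder.
body = asymmetric_sign_body(b"DER bytes of the CertificationRequestInfo")
payload = json.dumps(body)
```

The signature that comes back is what goes into the CSR's signature field, next to the matching signature algorithm identifier.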

vicpara commented 3 years ago

> A CA cannot know how a signature was created. Either the signature is correct or it's not. You can check with openssl req -verify -in foo.csr

My worries lie mostly in the CSR metadata, such as extensions, attributes and version compatibility, which need to be appropriate for document signing rather than TLS, and in how subject attributes are encoded. For example, openssl has some sort of glitch where the emailAddress is concatenated to the CN field if the user specifies it after the CN field in subj, like so: '/C=US/ST=.../CN=.../emailAddress=hr@example.com'. If one specifies it before, as in '/emailAddress=hr@example.com/C=US/ST=...', then it gets correctly set in its own field. This is just one example of how a CSR can be poorly generated. By the way, I noticed csrbuilder suffers from the same issue, or maybe it's intentionally done so?

These were the kinds of issues that I hoped to avoid.

Thanks for your answer.

joernheissler commented 3 years ago

You can use asn1crypto directly to create almost any CSR you desire. I think I never used CSRBuilder.