OpenSC / libp11

PKCS#11 wrapper library
GNU Lesser General Public License v2.1
310 stars 189 forks source link

Fix regression in latest release causing crashes when EC key is duplicated #470

Closed Jakuje closed 2 years ago

Jakuje commented 2 years ago

Right now, when EC key is duplicated (or retrieved several times from the libp11, either through API or from engine), the references to the internal objects stored in the ex_data is not copied and when the first reference is freed, any other reference later crashes on double-free().

This can be demonstrated simply with the attached regression testcase.

This should be fixed by copying the ex_data to new openssl objects when they are copied so they can be freed together with the objects.

As far as I know, this is not an issue for RSA keys, but I did not get to investigate it further yet.

This is a regression from libp11-0.4.11, where this was not an issue (I think the references were cleaned some time during destructors).

Jakuje commented 2 years ago

Still not sure if the copy is the best thing to do or we should rather just use reference count to this structure similarly as we do with the slot and other structures. In any case, this looks like it is still not yet final as I am still seeing some invalid memory access when running libssh testsuite. But it can be also an issue on the libssh side.

mtrojnar commented 2 years ago

@arisada I'd appreciate your recommendations for the best way to deal with it.

arisada commented 2 years ago

@arisada I'd appreciate your recommendations for the best way to deal with it.

Hi Mike, I appreciate that you value my opinion, but I lack way too much context to give an educated recommendation on this issue, especially on the libssh side where I didn't follow up the libp11 integration.

My generalist pov regarding copy vs refcount:

in favor of copy:

Arguments in favor of refcount are the exact opposite :)

mtrojnar commented 2 years ago

@arisada I strongly believe that you have the same or more experience with our PKCS#11 engine than I have with your libssh library.

Jakuje commented 2 years ago

@arisada I strongly believe that you have the same or more experience with our PKCS#11 engine than I have with your libssh library.

I do not think it depends on the use, if it is in libssh or in any other project that would be using libp11 engine through openssl. I am still investigating if the libssh does not do something stupid or if there is some path I missed when putting together this PR.

Jakuje commented 2 years ago

I have a reproducer for the follow-up crashes in the same branch and some basic understanding of what is going on in libp11, but I am not sure if I will be able to put together something that will work reliably enough soon enough.

https://github.com/Jakuje/libp11/commit/e891689920d82d9fe8590d86dbc1e2875f216f41

The problem is, that the libp11 keeps a list of keys in internal structures of slot (slot->prv, slot->pub). These are the same structures that are assigned to the openssl objects and freed with them, which leads to the above problem, that after loading the key and freeing it, we are addressing freed memory inside of libp11 while trying to load the keys again.

Solution might be keeping references to the key structures or also copying them into/from the slot structures. I started with the former, but did not get far and I am not sure how much time I will have for that in the coming days so I will leave it here if somebody would be interested in debugging this further.

dengert commented 2 years ago

Should we be looking at: https://www.openssl.org/docs/manmaster/man3/EC_KEY_set_ex_data.html says: "All functions with a TYPE of DH, DSA, RSA and EC_KEY are deprecated. Applications should instead use EVP_PKEY_set_ex_data(), EVP_PKEY_get_ex_data() and EVP_PKEY_get_ex_new_index().

We also don't provide CRYPTO_EX_new new_func, CRYPTO_EX_dup dup_func, and CRYPTO_EX_free *free_func functions.

./p11_ec.c: 161 #if OPENSSL_VERSION_NUMBER >= 0x10100002L && !defined(LIBRESSL_VERSION_NUMBER) 162 ec_ex_index = EC_KEY_get_ex_new_index(0, "libp11 ec_key", 163 NULL, NULL, NULL); 164 #else 165 ec_ex_index = ECDSA_get_ex_new_index(0, "libp11 ecdsa", 166 NULL, NULL, NULL); 167 #endif

Maybe we should have use these at lease the CRYPTO_EX_dup *dup_func, and do the same for RSA.

Jakuje commented 2 years ago

Should we be looking at: https://www.openssl.org/docs/manmaster/man3/EC_KEY_set_ex_data.html says: "All functions with a TYPE of DH, DSA, RSA and EC_KEY are deprecated. Applications should instead use EVP_PKEY_set_ex_data(), EVP_PKEY_get_ex_data() and EVP_PKEY_get_ex_new_index().

Note, that this is an issue with both OpenSSL 1.1 and 3.0 so deprecation waiver is not going to help here.

dengert commented 2 years ago

What I was suggesting is not use the EC_KEY or RSA_KEY, but to start using EVP_PKEY and hang the ex_data off of EVP_PKEY.

This could be done for 1.1 and 3.0 and it looks like they may drop legacy support of EC_KEY or RSA_KEY before they drop support for EVP_PKEY.

I am concerned the ordering of OpenSSL events:

The current code allows for using engines for selected keys in their apps. all their apps call opt_provider(o) but this is not the same as keyform. for example: apps/pkeyutl.c allows the use of 2 "FORM" options that can be engines:

 {"peerform", OPT_PEERFORM, 'E', "Peer key format (DER/PEM/P12/ENGINE)"}, 
 {"keyform", OPT_KEYFORM, 'E', "Private key format (ENGINE, other values ignored)"},

If OpenSSL writes their own PKCS11 they will still need these same concepts as the current engines to use the engine/PKCS11 to provide the parameters. And it is not clear how they could do the same with a provider. As providers appear to be applied to all operations, and not selectively for specific keys.

Where we see problems is they assume engines can be used like a provider and they are using it as a default, thus the need to defer the loading of the engine in #457, #460 and others.

Another approach might be that when they try and load the engine as a default provider, is to not let libp11 be load at that time. This may not be possible without changes to OpenSSL, but they might be open to some simple change as they will face the same problem with their own PKCS11.

Jakuje commented 2 years ago

What I was suggesting is not use the EC_KEY or RSA_KEY, but to start using EVP_PKEY and hang the ex_data off of EVP_PKEY.

I am not sure how this would work with the applications using the low-level API for RSA/EC operations with (EVP_PKEY_get1_RSA()) and not the high-level one (EVPPKEY*). Given that there are still bugs like openssl/openssl/pull/16624 in OpenSSL 3.0, which prevent using EC keys through the new API. I would rather try to fix existing code than complicating it with new stuff.

dengert commented 2 years ago

I am not sure how this would work with the applications using the low-level API for RSA/EC operations with (EVP_PKEY_get1_RSA()) and not the high-level one (EVPPKEY*).

In our case the application is libp11. So we could convert the libp11 code for OpenSSL 3 to not use the older low-level API. AFAIK, libp11and known cards only support Uncompressed EC points. I don't think that is stopping us.

And I agree the OpenSSL conversion for engines vs providers is still notworking as expected, and when will OpenSSL have their own PKCS11, and will it replace libp11, or not.

https://www.openssl.org/docs/man3.0/man3/EVP_PKEY_get_params.html says to use this. (And it may have bugs.)