parallaxsecond / rust-cryptoki

Rust wrapper for the PKCS #11 API, Cryptoki
https://docs.rs/cryptoki/
Apache License 2.0
77 stars 61 forks source link

RSA OAEP interface is unsound #107

Closed jhagborgftx closed 1 year ago

jhagborgftx commented 2 years ago

The following code compiles without error or warning:

use cryptoki::context::{CInitializeArgs, Pkcs11};
use cryptoki::mechanism::{Mechanism, MechanismType};
use cryptoki::mechanism::rsa::{PkcsMgfType, PkcsOaepParams, PkcsOaepSourceType};
use cryptoki::object::Attribute;
use cryptoki::session::UserType;
use std::error::Error;

const PKCS11_LIB_PATH: &str = "/usr/lib/softhsm/libsofthsm2.so";
const SLOT_ID: u64 = 1359578051; // from softhsm2-util
const PIN: &str = "testpin";

fn main() -> Result<(), Box<dyn Error>> {
    let mut context = Pkcs11::new(PKCS11_LIB_PATH)?;
    context.initialize(CInitializeArgs::OsThreads)?;
    let session = context.open_rw_session(SLOT_ID.try_into()?)?;
    session.login(UserType::User, Some(PIN))?;

    let pub_key_template = [Attribute::ModulusBits(2048.into())];
    let (pubkey, _privkey) = session.generate_key_pair(&Mechanism::RsaPkcsKeyPairGen,
                                                       &pub_key_template, &[])?;

    let encrypt_mechanism = Mechanism::RsaPkcsOaep(PkcsOaepParams {
        hash_alg: MechanismType::SHA1,
        mgf: PkcsMgfType::MGF1_SHA1,
        source: PkcsOaepSourceType::DATA_SPECIFIED,
        source_data: 0xBADC0DE as _, // uh oh!
        source_data_len: 1.into(),
    });

    session.encrypt(&encrypt_mechanism, pubkey, b"Hello, world!")?;

    Ok(())
}

This is almost certainly UB (although in this example, libsofthsm returns CKR_BAD_ARGUMENTS, only because it does not support non-null source_data). Rust code should not expose an interface that may dereference a user-supplied raw pointer, without marking that interface unsafe.

I don't care much about this parameter, but the same problem exists for other, not-yet-supported mechanisms whose parameters contain pointers (notably, AES-GCM and AES-CCM). I do care about safe interfaces for those, and it would make sense to treat them all consistently.

Possible solutions

No matter what, we can't let the user supply arbitrary pointers, without making them call an unsafe function. The internals of PkcsOaepParams should be private, with one or more safe constructors (and possibly an unsafe constructor taking a raw pointer).

Don't support source_data at all

I don't see any tests for this feature. libsofthsm2 doesn't support it. Is anyone using this thing anyway? Maybe we just force it to be null.

On the other hand, we'll need to figure out the same thing for AES-GCM and AES-CCM, where we do care about non-null parameters.

Make PkcsOaepParams own its source data

This is ergonomic, although inefficient if source_data is large and must be shared. The spec allows source_data to be huge (as large as the hash function supports), but again, is anyone doing that?

Add a lifetime parameter to Mechanism and PkcsOaepParams

This allows for more flexibility, although it clutters the interface of anything that uses a Mechanism. Luckily, any variant which does not use the lifetime parameter can be inferred 'static.

This is what I personally would like most, but I understand it's an invasive change.

Provide an unsafe constructor taking a pointer and length

This isn't great if it's the only way, but it does provide a useful backdoor if we go with forcing ownership.

wiktor-k commented 2 years ago

Okay, this is definitely a problem. First things first I'd mark the struct as #[deprecated] linking to this issue to give people using it a heads-up that it's going to change.

Yeah, the interface looks half-low-level half-high level. I'm wondering how much code will need to change for the "lifetime parameter" scenario although I'm OK with the "own source data" too.

@hug-dev, @ionut-arm, @gowthamsk-arm what do you think?

ionut-arm commented 2 years ago

Hey, thanks for reporting this! I agree, it's definitely a problem, don't know how we hadn't noticed it before.

First things first I'd mark the struct as #[deprecated]

Don't know if that's necessary, really - if we break/change the interface we can just bump the version number appropriately.

I also agree about the assessment - ideally I would've liked us to take ownership of that data and generate our own pointer from it when necessary, but that might be inefficient. However, if we do go for a reference, the new lifetime parameter isn't the only problem with that approach - we would then be inconsistent with a lot of other mechanisms which take ownership of buffers. Perhaps we could switch to references everywhere, or at least in places with "unbounded" buffers.

The Mechanism struct is probably a low-impact place to have lifetimes because I wouldn't expect them to be persisted between calls/operations - but perhaps I'm wrong on that. Overall, I'd lean towards using a reference.

Thoughts?

jhagborgftx commented 2 years ago

On Thu, 2022-11-24 at 09:44 -0800, Ionuț Mihalcea wrote:

I also agree about the assessment - ideally I would've liked us to take ownership of that data and generate our own pointer from it when necessary, but that might be inefficient. However, if we do go for a reference, the new lifetime parameter isn't the only problem with that approach - we would then be inconsistent with a lot of other mechanisms which take ownership of buffers. Perhaps we could switch to references everywhere, or at least in places with "unbounded" buffers.

The only data we actaully own already are small, fixed-sized structs:

I think these should stay owned! There's nothing to be gained by keeping references to these structs. However, PkcsOaepParams and Ecdh1DeriveParams both contain raw pointers to variable-length data. The closest safe approximation to this is a reference to a slice.

For the sake of consistency, we may as well discuss how we'd handle some other (not yet supported) mechanisms:

With this in mind, I'd propose the following rule: A Mechanism owns its parameter, whether that's a structure or a fixed-sized buffer. If that structure contains pointers and lengths, replace them with references to slices. If that structure has embedded arrays, keep that as [u8; N]. This is consistent with all currenlty supported mechanisms.

This assumes no mechanisms take variable-length buffers as their parameter directly. A quick scan through < https://docs.oasis-open.org/pkcs11/pkcs11-curr/v3.0/os/pkcs11-curr-v3.0-os.html

seems to confirm that is the case. If one does pop up, I wouldn't feel bad about making an exception for it.

The Mechanism struct is probably a low-impact place to have lifetimes because I wouldn't expect them to be persisted between calls/operations - but perhaps I'm wrong on that. Overall, I'd lean towards using a reference.

That's definately true for the code I'm working on -- mechanisms only live for one or two operations each, always within a single function.

In any case, if someone is persisting Mechanisms today, they're either fine with using a 'static lifetime, or are already doing some reasoning about raw pointers that could be helped by the lifetime.

-- James Hagborg

ionut-arm commented 1 year ago

Hey, thanks for the reply!

AES-GCM takes a struct with pointers to IV and AAD. These can both be arbitrarily long (up to 2^32 - 1 bytes). It would make sense for these to be slices as well.

I don't think that's true about IV in version 2.40 of the spec:

ulIvLen length of initialization vector in bytes. The length of the initialization vector can be any number between 1 and 256. 96-bit (12 byte) IV values can be processed more efficiently, so that length is recommended for situations in which efficiency is critical.

Though looking at v3, I can see where you got the value from (see comment below about versioning). It seems fairly reasonable to treat IVs as fairly limited in size and thus take full ownership of them, whereas for AAD or other large, variable-length structures we could just take a reference and use that.

One thing I noticed is that you seem to be referring to version 3 of the spec:

A quick scan through https://docs.oasis-open.org/pkcs11/pkcs11-curr/v3.0/os/pkcs11-curr-v3.0-os.html

While we use version 2.40. A peek through our READMEs made me realize we don't explicitly advertise this! So we definitely need to make it more clear. Did you want to move to v3.0 of the spec? It seems like a fairly big jump, not sure how to handle it easily - perhaps gating it behind a Cargo feature - but that's a separate discussion.

Anyway, your approach in the open PR seems pretty good to me, happy to go down that route.

jhagborgftx commented 1 year ago

On Mon, 2022-12-05 at 03:54 -0800, Ionuț Mihalcea wrote:

Hey, thanks for the reply!

AES-GCM takes a struct with pointers to IV and AAD. These can both be arbitrarily long (up to 2^32 - 1 bytes). It would make sense for these to be slices as well.

I don't think that's true about IV in version 2.40 of the spec:

ulIvLen length of initialization vector in bytes. The length of the initialization vector can be any number between 1 and 256. 96-bit (12 byte) IV values can be processed more efficiently, so that length is recommended for situations in which efficiency is critical.

Though looking at v3, I can see where you got the value from (see comment below about versioning). It seems fairly reasonable to treat IVs as fairly limited in size and thus take full ownership of them, whereas for AAD or other large, variable-length structures we could just take a reference and use that.

That works, but I still think we should use a slice because

After implementing a few more mechanisms, I'm really liking the "match what the C struct does" approach of either using &[u8] or [u8; N] everywhere.

One thing I noticed is that you seem to be referring to version 3 of the spec:

A quick scan through https://docs.oasis-open.org/pkcs11/pkcs11-curr/v3.0/os/pkcs11-curr-v3.0-os.html

While we use version 2.40. A peek through our READMEs made me realize we don't explicitly advertise this! So we definitely need to make it more clear. Did you want to move to v3.0 of the spec? It seems like a fairly big jump, not sure how to handle it easily - perhaps gating it behind a Cargo feature - but that's a separate discussion.

My bad! I'm not too familiar with what's changed between them, I've just been using the 3.0 spec for reference, assuming it was essentially 2.4 + features - depreciations.

jhagborgftx commented 1 year ago

Ok, after a bit more thinking, I think there is another issue with these two mechanisms. We define a #[repr(C)] struct manually, which means we get the padding wrong on Windows! We would avoid this if we used #[repr(transparent)] over the corresponding structs in cryptoki- sys. That's the approach I'm using for AEAD, and I'll update this PR to matche.