feature: implement the RustCrypto KEM API for ClassicMcEliece

dkales commented 2 years ago

This is the first attempt for the RustCrypto KEM API.

A few growing pains:

KEM still uses generic_array since const generics are not far enough yet, is a bit of a hassle with typenum.
Just storing the large PK in a wrapper struct also has enormous stack usage, nothing to be done for no_std, but there should be an option to box the KeyPair/Ciphertexts.

Any comments @faern @prokls ?

faern commented 2 years ago

I think that this PR is currently not breaking the API, right? Was that so we can get something out faster? The main branch in this repository has already broken the API compared to the last release. So the next release will be breaking anyway. So I think we better clean as much as possible up at once instead. But that's just my opinion.

There will be yet another breaking release once the crate can support all variants at the same time instead of having feature flags also. Unless we want to roll that into v2 as well. Which is a bigger job, but doable!

dkales commented 2 years ago

I'd also be partial to yanking 1.0, since it is hard to use correctly at all and publishing this under 1.0.1. @prokls any input on this?

Maybe a simple option would be to have a second keypair_boxed, that returns Boxed sk,pk, given alloc? I could also see the normal API using these types, and maybe also having boxed variants on alloc?

faern commented 2 years ago

I'd also be partial to yanking 1.0, since it is hard to use correctly at all and publishing this under 1.0.1.

1.0.1 is already published. So 1.0.2 is the lowest we can go. But personally I think that's a bit dirty. To completely break the API and make it look like some kind of patch release. There is no known bug/issue with the 1.0.x branch, except it being hard to use and should not be used :joy:. I don't see any reason against going 2.0.0 with the next release. We can yank 1.0.x anyway, but I don't think we should release the brand new API as a patch release to 1.0.

Maybe a simple option would be to have a second keypair_boxed, that returns Boxed sk,pk, given alloc?

I definitely don't like out parameters. But I'm also not a fan of this suggestion. It's simply less flexible. It only allows stack and heap allocated buffers. Consider this potential way of storing your buffer as a global static:

static BUF: Mutex<[u8; 1024]> = Mutex::new([0; 1024]);

fn main() {
    let mut public_key = BUF.lock().unwrap();
    keypair(&mut public_key);
    // do something with `public_key`
}

The best would IMHO be if we could better encode the difference between storage to be used for key material and a filled in key. To make it harder to accidentally use an uninitialized buffer as key material. This should be a fairly common problem to solve for out parameters, so I'm surprised if there is no almost-ready solution for this. But otherwise I guess it's some boilerplate with traits etc. Where keypair takes in a buffer type and returns the key type (which wrap that borrowed buffer type)

faern commented 2 years ago

I could also see the normal API using these types, and maybe also having boxed variants on alloc?

I don't think the "boxed variant" should be struct PublicKey(Box<[u8; ..]>). But rather they should be Box<PublicKey>. The heap allocation should not be in the type, rather you should put the entire type on the heap.

This library should not define a special heap type. That should be on the user to construct by just combining existing types and heap allocation methods. I don't think this library should get involved in how the allocation happens, at all. It should just take in a mutable reference to where to put the data.

dkales commented 2 years ago

I'd kind of hate if one could create uninitialized objects even with the new API. I think Out-Parameters are just not used if at all possible most of the time and I also don't like them.

I also don't know how well the creation of boxed Arrays or Boxed array-wrapper structs works in practice right now, I think there were still some cases where a copy was created on the stack as an intermediate step when boxing an array, will have to look into that in more detail.

faern commented 2 years ago

I think Out-Parameters are just not used if at all possible most of the time and I also don't like them.

Maybe I misunderstood you. But the standard library std::io::Read trait uses out parameters.

I really don't like them either. But the opposite is worse IMO.

I also don't know how well the creation of boxed Arrays or Boxed array-wrapper structs works in practice right now, I think there were still some cases where a copy was created on the stack as an intermediate step when boxing an array, will have to look into that in more detail.

Only in completely unoptimized builds. When you turn on compiler optimizations these copies are removed is my experience.

faern commented 2 years ago

I think there were still some cases where a copy was created on the stack

How will you avoid that? Even with PublicKey(Box<[array]>) you need to init the array first and then it's moved to the heap. But it's only a problem on non optimized builds. So I don't think it matters a whole lot in this case.

faern commented 2 years ago

What I have in mind is something like the following. We still take a mutable buffer as the argument to the function. Nothing changes there! But the return type is changed from nothing to typed keys. Then we implement the appropriate traits and methods on those types (like you have already done). The difference is that those types don't contain the owned array. They just contain references to the key material. That way they can be constructed from a versatile set of buffer sources. And only this crate can instantiate the key types.

use std::error::Error;
use std::sync::Mutex;

pub const PUBLIC_KEY_SIZE: usize = 1024;
pub const SECRET_KEY_SIZE: usize = 9876;
pub const CIPHERTEXT_SIZE: usize = 1235;

pub struct PublicKey<'a>(&'a [u8; PUBLIC_KEY_SIZE]);
pub struct SecretKey<'a>(&'a [u8; SECRET_KEY_SIZE]);
pub struct Ciphertext<'a>(&'a [u8; CIPHERTEXT_SIZE]);

impl Decapsulator<Ciphertext> for SecretKey {
    /*...*/
}
impl EncappedKey for Ciphertext {
    /*...*/
}

fn keypair<'a>(
    secret_key_buffer: &'a mut [u8; SECRET_KEY_SIZE],
    public_key_buffer: &'a mut [u8; PUBLIC_KEY_SIZE],
) -> Result<(SecretKey<'a>, PublicKey<'a>), Box<dyn Error>> {
    // Compute the keys
    secret_key_buffer[0] = 1;
    public_key_buffer[0] = 1;

    Ok((SecretKey(secret_key_buffer), PublicKey(public_key_buffer)))
}

// Demonstrate how versatile it becomes to use the above function:
fn main() -> Result<(), Box<dyn Error>> {
    let mut stack_secret_buffer = [0u8; SECRET_KEY_SIZE];
    let mut stack_public_buffer = [0u8; PUBLIC_KEY_SIZE];
    let (stack_privkey, stack_pubkey) = keypair(&mut stack_secret_buffer, &mut stack_public_buffer)?;

    let mut heap_secret_buffer = Box::new([0u8; SECRET_KEY_SIZE]);
    let mut heap_public_buffer = Box::new([0u8; PUBLIC_KEY_SIZE]);
    let (heap_privkey, heap_pubkey) = keypair(&mut heap_secret_buffer, &mut heap_public_buffer)?;

    static STATIC_SECRET_BUFFER: Mutex<[u8; SECRET_KEY_SIZE]> = Mutex::new([0; SECRET_KEY_SIZE]);
    static STATIC_PUBLIC_BUFFER: Mutex<[u8; PUBLIC_KEY_SIZE]> = Mutex::new([0; PUBLIC_KEY_SIZE]);
    let mut static_secret_buffer_lock = STATIC_SECRET_BUFFER.lock().unwrap();
    let mut static_public_buffer_lock = STATIC_PUBLIC_BUFFER.lock().unwrap();
    let (static_privkey, static_pubkey) = keypair(&mut static_secret_buffer_lock, &mut static_public_buffer_lock)?;

    Ok(())
}

dkales commented 2 years ago

I think there were still some cases where a copy was created on the stack

How will you avoid that? Even with PublicKey(Box<[array]>) you need to init the array first and then it's moved to the heap. But it's only a problem on non optimized builds. So I don't think it matters a whole lot in this case.

There will be a new box keyword that should fix this in the future, but I agree it does not matter much right now.

Regarding your proposal, I had something similar in mind as well, however the ergonomics of that solution are also not great, since you may need to keep/pass both the buffer and the Key object around. It would be nice to maybe re-use the network buffer directly as the backing storage for the public keys, since there is no additional pk-compression like in other schemes. (The above would allow that, although one has to try_into a slice into a fixed-siced array ref.)

A different variant would be to have the above types, but wrap the inner storage in std::borrow::Cow. I don't know if that works nicely with arrays (in that the cow enum not as large as the array itself due to the Owned variant), so it just may be better (in general) that the methods take slices instead of fixed sized arrays and error on invalid sizes.

faern commented 2 years ago

There will be a new box keyword that should fix this in the future, but I agree it does not matter much right now.

Are there any updates on this? That keyword has been very experimental since Rust 1.0 and last time I read about it it did not sound like it would ever be stabilized, rather be removed.

dkales commented 2 years ago

There will be a new box keyword that should fix this in the future, but I agree it does not matter much right now.

Are there any updates on this? That keyword has been very experimental since Rust 1.0 and last time I read about it it did not sound like it would ever be stabilized, rather be removed.

Seems like you are right based on https://github.com/rust-lang/rust/issues/49733

faern commented 2 years ago

The Cow approach is interesting! However, the Cow type is not in core, it requires at least alloc. So if the end goal is to make the crate no_std then this has to be opt in. The question is if the crate actually aims to be no_std? It does make things more complicated, so unless there are reasonable use cases I don't see a good reason to strive for it. Will there be embedded applications that does not have allocation support but that are willing to handle cryptography with several hundred kilobytes keys?

dkales commented 2 years ago

So which of these options would you (@faern @prokls) prefer:

Current Proposal (large structs directly wrapping arrays)
Structs just heap allocating large objects themselves
faerns proposal above (Structs wrapping references to backing arrays)
Structs wrapping references to slices
Structs wrapping Cows of slices (Cow is in alloc so also not no_std)

Personally I feel structs wrapping arrays or slices is the best compromise to address most of the pain points.

faern commented 2 years ago

I'd like to avoid compromises if possible :) If we can figure out something that:

Does not force huge stack usage on people
Has nice types instead of just arrays
Allows 'static keys on the stack if the library user would like to
Has opt-in heap storage with easy to pass around 'static keys
Has no_std support

That would be golden. What about below. You need to pass in a storage object to the functions. But instead of being just a &mut [LARGE ARRAY] it's more flexible thanks to a trait. The trait allows you to pass in owned or borrowed data, both on the stack and (optionally) the heap!

use std::mem::{size_of, size_of_val};

pub const PUBLIC_KEY_SIZE: usize = 98765;

pub struct PublicKey<S: KeyStorage<PUBLIC_KEY_SIZE>>(S);

impl<S: KeyStorage<PUBLIC_KEY_SIZE>> PublicKey<S> {
    /// Helper to move a key backed in any possible way to the stack
    pub fn to_owned(&self) -> PublicKey<[u8; PUBLIC_KEY_SIZE]> {
        PublicKey(*self.0.as_ref())
    }

    /// Helper to move a key backed in any possible way to the heap
    #[cfg(feature = "alloc")]
    pub fn to_heap(&self) -> PublicKey<Box<[u8; PUBLIC_KEY_SIZE]>> {
        PublicKey(Box::new(*self.0.as_ref()))
    }
}

pub trait KeyStorage<const SIZE: usize> {
    fn as_ref(&self) -> &[u8; SIZE];
    fn as_mut(&mut self) -> &mut [u8; SIZE];
}

impl<const SIZE: usize> KeyStorage<SIZE> for [u8; SIZE] {
    fn as_ref(&self) -> &[u8; SIZE] {
        self
    }

    fn as_mut(&mut self) -> &mut [u8; SIZE] {
        self
    }
}

impl<const SIZE: usize> KeyStorage<SIZE> for &mut [u8; SIZE] {
    fn as_ref(&self) -> &[u8; SIZE] {
        self
    }

    fn as_mut(&mut self) -> &mut [u8; SIZE] {
        self
    }
}

#[cfg(feature = "alloc")]
impl<const SIZE: usize> KeyStorage<SIZE> for Box<[u8; SIZE]> {
    fn as_ref(&self) -> &[u8; SIZE] {
        self
    }

    fn as_mut(&mut self) -> &mut [u8; SIZE] {
        self
    }
}

fn keypair<'a, S: KeyStorage<PUBLIC_KEY_SIZE> + 'a>(mut public_key_storage: S) -> PublicKey<S> {
    // Compute the key and fill in the buffer
    let key_buffer = public_key_storage.as_mut();
    key_buffer[0] = 1;

    PublicKey(public_key_storage)
}

fn main() {
    // If we give the ownership of the array we get it back owned on the stack.
    // Takes large stack space, but the lifetime is 'static, so it can be passed around
    let mut stack_pubkey_buffer = [0u8; PUBLIC_KEY_SIZE];
    let pubkey = keypair(stack_pubkey_buffer);
    assert_eq!(size_of_val(&pubkey), PUBLIC_KEY_SIZE);
    std::thread::spawn(move || drop(pubkey));

    // If we lend the buffer to `keypair` we don't eat up our stack, but the returned
    // type has a lifetime bound to the buffer, so it can't be passed around as smoothly.
    let pubkey = keypair(&mut stack_pubkey_buffer);
    assert_eq!(size_of_val(&pubkey), size_of::<usize>());
    //std::thread::spawn(move || drop(pubkey)); <- Does not compile
    // However, we can always convert it into an owned version of itself
    let stack_pubkey = pubkey.to_owned();
    std::thread::spawn(move || drop(stack_pubkey));
    drop(pubkey);

    #[cfg(feature = "alloc")]
    {
        // If we allocate the buffer on the heap and transfer ownership to `keypair`, it
        // can use the heap buffer and return the ownership of the same data without cloning
        // or moving. The resulting `PublicKey<Box<...>>` is 'static
        let heap_pubkey_buffer = Box::new([0u8; PUBLIC_KEY_SIZE]);
        let pubkey = keypair(heap_pubkey_buffer);
        assert_eq!(size_of_val(&pubkey), size_of::<usize>());
        std::thread::spawn(move || drop(pubkey));
    }

    #[cfg(feature = "alloc")]
    {
        // We can even use the heap stored buffer in a borrowing way
        let mut heap_pubkey_buffer = Box::new([0u8; PUBLIC_KEY_SIZE]);
        let pubkey = keypair(&mut *heap_pubkey_buffer);
        assert_eq!(size_of_val(&pubkey), size_of::<usize>());
        // std::thread::spawn(move || drop(pubkey));  <- Does not compile
        drop(pubkey);
    }
}

dkales commented 2 years ago

This seems quite nice, I guess I'll try to integrate this in this PR later today.

Maybe also exposing a second keypair method without parameters on alloc that just returns Box-backed PK/SK for ease of use is a good idea as well, since most of the time users probably wont care about the memory layout.

Since the ciphertext (<=240 bytes) and shared secret (32 bytes) are small anyway, it is probably ok to not do this for them, but keep them as arrays or generic_arrays.

dkales commented 2 years ago

This sadly not that compatible with the KEM API:

    impl<S: KeyStorage<CRYPTO_PUBLICKEYBYTES>> EncappedKey for Ciphertext {
        type EncappedKeySize = CryptoCiphertextBytesTypenum;

        type SharedSecretSize = typenum::U32;

        type SenderPublicKey = ();

        type RecipientPublicKey = PublicKey<S>; // <-- this is not constrained in the impl

        fn from_bytes(bytes: &GenericArray<u8, Self::EncappedKeySize>) -> Result<Self, kem::Error> {
            let mut data = [0u8; CRYPTO_CIPHERTEXTBYTES];
            data.copy_from_slice(bytes.as_slice());
            Ok(Ciphertext(data))
        }
    }

Since it wants the public key to be a concrete type, we would also need the Ciphertext to be generic over the public key it came from, even though it does not really matter internally... I guess just making Ciphetext into

struct Ciphertext<S: KeyStorage<CRYPTO_PUBLICKEYBYTES>> {
    data: [u8; CRYPTO_CIPHERTEXTBYTES],
    marker: PhantomData<S>
}

would work, but this also does not seem that nice.

faern commented 2 years ago

Would you be able to push your WIP branch somewhere? So I can play with it?

dkales commented 2 years ago

@faern see the latest commit, I tried it with the PhantomData workaround, and from a user perspective it is not noticeable at all if you do not have to name the concrete Ciphertext type.

The example in the Readme.md covers all the use cases of your comment above (and is doc-tested).

(EDIT: I guess the Readme test still stack-overflows on the github-runners, but it works locally, I guess the default stack size of those runners is configured lower).

faern commented 2 years ago

I now question whether or not we actually want to allow passing the huge public keys on the stack at all. I mean the use case enabled by impl<const SIZE: usize> KeyStorage<SIZE> for [u8; SIZE].. Because:

It's rarely what you actually want. If this is possible, people are going to do this by mistake and have huge stack usage when they most likely did not intend that.
If we ever want to implement zeroize for our key types, it does not make sense that they are on the stack. Because that means there will be an unknown number of copies of the key material in the memory when it's dropped. If zeroize should make any sense, the key material must have existed only in a single location, that can be zeroed out. Not that zeroing out the public key is very relevant I have to admit. But anyway.

So in other words: The public keys probably never want to be on the stack, because it costs too much memory. And the secret keys should probably be kept separate as well, to make it possible to zeroize them at some point?

dkales commented 2 years ago

I was thinking this as well, since most users do not actually want to have this on the stack ever. I'd propose the following:

Make the crate use alloc/std by default, and have the default keypair just own boxed arrays. Remove the owned stack array option. Maybe remove the trait all together, and just make the borrowed version a separate type (maybe even feature-gate this, since it is also more of a fringe use-case). Implement zeroize for both secretkey structs.

Regarding fallible RNGs, I don't know if we want this or not, since it is the only thing requiring error handling atm, and I most callers will not use fallible RNGs or can just seed one from the fallible one.

faern commented 2 years ago

I'm currently working on an experiment with this. I'm trying to represent the keys and key storage buffers in a different way. I hope to have something WIP to show today or in a few days.

faern commented 2 years ago

Basically I don't think we need to represent all keys/buffers in the same way. Because they have different properties.

PublicKey - Not secret, no need to zeroize. Never store on stack. Have the library able to use a borrowed buffer bound by lifetime, or heap allocated owned version
SecretKey - Should implement ZeroizeOnDrop and not be on the stack, to avoid it being copied into many places in the memory. Is not large, but to avoid the lifetime issues, we should allow it to be heap allocated anyway.
Ciphertext - Small and not secret. Can be a simple struct Ciphertext([u8; CIPHER_CIPHERTEXTBYTES]);
SharedSecret - Tiny but secret. Same properties as SecretKey

faern commented 2 years ago

I'm fighting with kems SharedSecret and the zeroize trait here.. It looks like the type is implemented so that Rust automatically copies it all over the stack. But they have implemented ZeroizeOnDrop, indicating it should be cleanup up after itself. But from my testing my entire stack is full of secret key material even after dropping them.

My impl of the keys basically work now. The only thing left is this wart with zeroing the keys when using the kem interface. But maybe that interface is simply not compatible with clearing secrets from memory :shrug:

dkales commented 2 years ago

Seems like this is an inherent problem with Rust/Zeroize: https://docs.rs/zeroize/latest/zeroize/#stackheap-zeroing-notes One would have to Pin the stack memory to prevent moves, but this is not possible with the interface handing out sharedsecrets as GernericArrays.

faern commented 2 years ago

Exactly. To implement the KEM traits one has to pass around the values owned on the stack a lot. So yeah, we can ignore that part.

faern commented 2 years ago

See #20 for my attempt at doing this. Or rather this, plus some other ideas plus zeroize. I didn't mean to "compete" with this PR. I just wanted to show my solution so we could compare and discuss it.

I felt like I had to do it all in one go. To prove to myself, and you, that it would all fit together and actually work.

dkales commented 2 years ago

Superseded by #20

Colfenor / classic-mceliece-rust

feature: implement the RustCrypto KEM API for ClassicMcEliece #16