Closed dkales closed 2 years ago
I think that this PR is currently not breaking the API, right? Was that so we can get something out faster? The main branch in this repository has already broken the API compared to the last release. So the next release will be breaking anyway. So I think we better clean as much as possible up at once instead. But that's just my opinion.
There will be yet another breaking release once the crate can support all variants at the same time instead of having feature flags also. Unless we want to roll that into v2
as well. Which is a bigger job, but doable!
I'd also be partial to yanking 1.0, since it is hard to use correctly at all and publishing this under 1.0.1. @prokls any input on this?
Maybe a simple option would be to have a second keypair_boxed
, that returns Box
ed sk,pk, given alloc
?
I could also see the normal API using these types, and maybe also having boxed variants on alloc?
I'd also be partial to yanking 1.0, since it is hard to use correctly at all and publishing this under 1.0.1.
1.0.1 is already published. So 1.0.2 is the lowest we can go. But personally I think that's a bit dirty. To completely break the API and make it look like some kind of patch release. There is no known bug/issue with the 1.0.x branch, except it being hard to use and should not be used :joy:. I don't see any reason against going 2.0.0
with the next release. We can yank 1.0.x
anyway, but I don't think we should release the brand new API as a patch release to 1.0.
Maybe a simple option would be to have a second keypair_boxed, that returns Boxed sk,pk, given alloc?
I definitely don't like out parameters. But I'm also not a fan of this suggestion. It's simply less flexible. It only allows stack and heap allocated buffers. Consider this potential way of storing your buffer as a global static:
static BUF: Mutex<[u8; 1024]> = Mutex::new([0; 1024]);
fn main() {
let mut public_key = BUF.lock().unwrap();
keypair(&mut public_key);
// do something with `public_key`
}
The best would IMHO be if we could better encode the difference between storage to be used for key material and a filled in key. To make it harder to accidentally use an uninitialized buffer as key material. This should be a fairly common problem to solve for out parameters, so I'm surprised if there is no almost-ready solution for this. But otherwise I guess it's some boilerplate with traits etc. Where keypair
takes in a buffer type and returns the key type (which wrap that borrowed buffer type)
I could also see the normal API using these types, and maybe also having boxed variants on alloc?
I don't think the "boxed variant" should be struct PublicKey(Box<[u8; ..]>)
. But rather they should be Box<PublicKey>
. The heap allocation should not be in the type, rather you should put the entire type on the heap.
This library should not define a special heap type. That should be on the user to construct by just combining existing types and heap allocation methods. I don't think this library should get involved in how the allocation happens, at all. It should just take in a mutable reference to where to put the data.
I'd kind of hate if one could create uninitialized objects even with the new API. I think Out-Parameters are just not used if at all possible most of the time and I also don't like them.
I also don't know how well the creation of boxed Arrays or Boxed array-wrapper structs works in practice right now, I think there were still some cases where a copy was created on the stack as an intermediate step when boxing an array, will have to look into that in more detail.
I think Out-Parameters are just not used if at all possible most of the time and I also don't like them.
Maybe I misunderstood you. But the standard library std::io::Read
trait uses out parameters.
I really don't like them either. But the opposite is worse IMO.
I also don't know how well the creation of boxed Arrays or Boxed array-wrapper structs works in practice right now, I think there were still some cases where a copy was created on the stack as an intermediate step when boxing an array, will have to look into that in more detail.
Only in completely unoptimized builds. When you turn on compiler optimizations these copies are removed is my experience.
I think there were still some cases where a copy was created on the stack
How will you avoid that? Even with PublicKey(Box<[array]>)
you need to init the array first and then it's moved to the heap. But it's only a problem on non optimized builds. So I don't think it matters a whole lot in this case.
What I have in mind is something like the following. We still take a mutable buffer as the argument to the function. Nothing changes there! But the return type is changed from nothing to typed keys. Then we implement the appropriate traits and methods on those types (like you have already done). The difference is that those types don't contain the owned array. They just contain references to the key material. That way they can be constructed from a versatile set of buffer sources. And only this crate can instantiate the key types.
use std::error::Error;
use std::sync::Mutex;
pub const PUBLIC_KEY_SIZE: usize = 1024;
pub const SECRET_KEY_SIZE: usize = 9876;
pub const CIPHERTEXT_SIZE: usize = 1235;
pub struct PublicKey<'a>(&'a [u8; PUBLIC_KEY_SIZE]);
pub struct SecretKey<'a>(&'a [u8; SECRET_KEY_SIZE]);
pub struct Ciphertext<'a>(&'a [u8; CIPHERTEXT_SIZE]);
impl Decapsulator<Ciphertext> for SecretKey {
/*...*/
}
impl EncappedKey for Ciphertext {
/*...*/
}
fn keypair<'a>(
secret_key_buffer: &'a mut [u8; SECRET_KEY_SIZE],
public_key_buffer: &'a mut [u8; PUBLIC_KEY_SIZE],
) -> Result<(SecretKey<'a>, PublicKey<'a>), Box<dyn Error>> {
// Compute the keys
secret_key_buffer[0] = 1;
public_key_buffer[0] = 1;
Ok((SecretKey(secret_key_buffer), PublicKey(public_key_buffer)))
}
// Demonstrate how versatile it becomes to use the above function:
fn main() -> Result<(), Box<dyn Error>> {
let mut stack_secret_buffer = [0u8; SECRET_KEY_SIZE];
let mut stack_public_buffer = [0u8; PUBLIC_KEY_SIZE];
let (stack_privkey, stack_pubkey) = keypair(&mut stack_secret_buffer, &mut stack_public_buffer)?;
let mut heap_secret_buffer = Box::new([0u8; SECRET_KEY_SIZE]);
let mut heap_public_buffer = Box::new([0u8; PUBLIC_KEY_SIZE]);
let (heap_privkey, heap_pubkey) = keypair(&mut heap_secret_buffer, &mut heap_public_buffer)?;
static STATIC_SECRET_BUFFER: Mutex<[u8; SECRET_KEY_SIZE]> = Mutex::new([0; SECRET_KEY_SIZE]);
static STATIC_PUBLIC_BUFFER: Mutex<[u8; PUBLIC_KEY_SIZE]> = Mutex::new([0; PUBLIC_KEY_SIZE]);
let mut static_secret_buffer_lock = STATIC_SECRET_BUFFER.lock().unwrap();
let mut static_public_buffer_lock = STATIC_PUBLIC_BUFFER.lock().unwrap();
let (static_privkey, static_pubkey) = keypair(&mut static_secret_buffer_lock, &mut static_public_buffer_lock)?;
Ok(())
}
I think there were still some cases where a copy was created on the stack
How will you avoid that? Even with
PublicKey(Box<[array]>)
you need to init the array first and then it's moved to the heap. But it's only a problem on non optimized builds. So I don't think it matters a whole lot in this case.
There will be a new box
keyword that should fix this in the future, but I agree it does not matter much right now.
Regarding your proposal, I had something similar in mind as well, however the ergonomics of that solution are also not great, since you may need to keep/pass both the buffer and the Key object around.
It would be nice to maybe re-use the network buffer directly as the backing storage for the public keys, since there is no additional pk-compression like in other schemes. (The above would allow that, although one has to try_into
a slice into a fixed-siced array ref.)
A different variant would be to have the above types, but wrap the inner storage in std::borrow::Cow
. I don't know if that works nicely with arrays (in that the cow enum not as large as the array itself due to the Owned
variant), so it just may be better (in general) that the methods take slices instead of fixed sized arrays and error on invalid sizes.
There will be a new box keyword that should fix this in the future, but I agree it does not matter much right now.
Are there any updates on this? That keyword has been very experimental since Rust 1.0 and last time I read about it it did not sound like it would ever be stabilized, rather be removed.
There will be a new box keyword that should fix this in the future, but I agree it does not matter much right now.
Are there any updates on this? That keyword has been very experimental since Rust 1.0 and last time I read about it it did not sound like it would ever be stabilized, rather be removed.
Seems like you are right based on https://github.com/rust-lang/rust/issues/49733
The Cow
approach is interesting! However, the Cow
type is not in core
, it requires at least alloc
. So if the end goal is to make the crate no_std
then this has to be opt in. The question is if the crate actually aims to be no_std
? It does make things more complicated, so unless there are reasonable use cases I don't see a good reason to strive for it. Will there be embedded applications that does not have allocation support but that are willing to handle cryptography with several hundred kilobytes keys?
So which of these options would you (@faern @prokls) prefer:
alloc
so also not no_std
)Personally I feel structs wrapping arrays or slices is the best compromise to address most of the pain points.
I'd like to avoid compromises if possible :) If we can figure out something that:
'static
keys on the stack if the library user would like to'static
keysno_std
supportThat would be golden. What about below. You need to pass in a storage object to the functions. But instead of being just a &mut [LARGE ARRAY]
it's more flexible thanks to a trait. The trait allows you to pass in owned or borrowed data, both on the stack and (optionally) the heap!
use std::mem::{size_of, size_of_val};
pub const PUBLIC_KEY_SIZE: usize = 98765;
pub struct PublicKey<S: KeyStorage<PUBLIC_KEY_SIZE>>(S);
impl<S: KeyStorage<PUBLIC_KEY_SIZE>> PublicKey<S> {
/// Helper to move a key backed in any possible way to the stack
pub fn to_owned(&self) -> PublicKey<[u8; PUBLIC_KEY_SIZE]> {
PublicKey(*self.0.as_ref())
}
/// Helper to move a key backed in any possible way to the heap
#[cfg(feature = "alloc")]
pub fn to_heap(&self) -> PublicKey<Box<[u8; PUBLIC_KEY_SIZE]>> {
PublicKey(Box::new(*self.0.as_ref()))
}
}
pub trait KeyStorage<const SIZE: usize> {
fn as_ref(&self) -> &[u8; SIZE];
fn as_mut(&mut self) -> &mut [u8; SIZE];
}
impl<const SIZE: usize> KeyStorage<SIZE> for [u8; SIZE] {
fn as_ref(&self) -> &[u8; SIZE] {
self
}
fn as_mut(&mut self) -> &mut [u8; SIZE] {
self
}
}
impl<const SIZE: usize> KeyStorage<SIZE> for &mut [u8; SIZE] {
fn as_ref(&self) -> &[u8; SIZE] {
self
}
fn as_mut(&mut self) -> &mut [u8; SIZE] {
self
}
}
#[cfg(feature = "alloc")]
impl<const SIZE: usize> KeyStorage<SIZE> for Box<[u8; SIZE]> {
fn as_ref(&self) -> &[u8; SIZE] {
self
}
fn as_mut(&mut self) -> &mut [u8; SIZE] {
self
}
}
fn keypair<'a, S: KeyStorage<PUBLIC_KEY_SIZE> + 'a>(mut public_key_storage: S) -> PublicKey<S> {
// Compute the key and fill in the buffer
let key_buffer = public_key_storage.as_mut();
key_buffer[0] = 1;
PublicKey(public_key_storage)
}
fn main() {
// If we give the ownership of the array we get it back owned on the stack.
// Takes large stack space, but the lifetime is 'static, so it can be passed around
let mut stack_pubkey_buffer = [0u8; PUBLIC_KEY_SIZE];
let pubkey = keypair(stack_pubkey_buffer);
assert_eq!(size_of_val(&pubkey), PUBLIC_KEY_SIZE);
std::thread::spawn(move || drop(pubkey));
// If we lend the buffer to `keypair` we don't eat up our stack, but the returned
// type has a lifetime bound to the buffer, so it can't be passed around as smoothly.
let pubkey = keypair(&mut stack_pubkey_buffer);
assert_eq!(size_of_val(&pubkey), size_of::<usize>());
//std::thread::spawn(move || drop(pubkey)); <- Does not compile
// However, we can always convert it into an owned version of itself
let stack_pubkey = pubkey.to_owned();
std::thread::spawn(move || drop(stack_pubkey));
drop(pubkey);
#[cfg(feature = "alloc")]
{
// If we allocate the buffer on the heap and transfer ownership to `keypair`, it
// can use the heap buffer and return the ownership of the same data without cloning
// or moving. The resulting `PublicKey<Box<...>>` is 'static
let heap_pubkey_buffer = Box::new([0u8; PUBLIC_KEY_SIZE]);
let pubkey = keypair(heap_pubkey_buffer);
assert_eq!(size_of_val(&pubkey), size_of::<usize>());
std::thread::spawn(move || drop(pubkey));
}
#[cfg(feature = "alloc")]
{
// We can even use the heap stored buffer in a borrowing way
let mut heap_pubkey_buffer = Box::new([0u8; PUBLIC_KEY_SIZE]);
let pubkey = keypair(&mut *heap_pubkey_buffer);
assert_eq!(size_of_val(&pubkey), size_of::<usize>());
// std::thread::spawn(move || drop(pubkey)); <- Does not compile
drop(pubkey);
}
}
This seems quite nice, I guess I'll try to integrate this in this PR later today.
Maybe also exposing a second keypair method without parameters on alloc
that just returns Box
-backed PK/SK for ease of use is a good idea as well, since most of the time users probably wont care about the memory layout.
Since the ciphertext (<=240 bytes) and shared secret (32 bytes) are small anyway, it is probably ok to not do this for them, but keep them as arrays or generic_array
s.
This sadly not that compatible with the KEM API:
impl<S: KeyStorage<CRYPTO_PUBLICKEYBYTES>> EncappedKey for Ciphertext {
type EncappedKeySize = CryptoCiphertextBytesTypenum;
type SharedSecretSize = typenum::U32;
type SenderPublicKey = ();
type RecipientPublicKey = PublicKey<S>; // <-- this is not constrained in the impl
fn from_bytes(bytes: &GenericArray<u8, Self::EncappedKeySize>) -> Result<Self, kem::Error> {
let mut data = [0u8; CRYPTO_CIPHERTEXTBYTES];
data.copy_from_slice(bytes.as_slice());
Ok(Ciphertext(data))
}
}
Since it wants the public key to be a concrete type, we would also need the Ciphertext
to be generic over the public key it came from, even though it does not really matter internally...
I guess just making Ciphetext into
struct Ciphertext<S: KeyStorage<CRYPTO_PUBLICKEYBYTES>> {
data: [u8; CRYPTO_CIPHERTEXTBYTES],
marker: PhantomData<S>
}
would work, but this also does not seem that nice.
Would you be able to push your WIP branch somewhere? So I can play with it?
@faern see the latest commit, I tried it with the PhantomData
workaround, and from a user perspective it is not noticeable at all if you do not have to name the concrete Ciphertext
type.
The example in the Readme.md
covers all the use cases of your comment above (and is doc-tested).
(EDIT: I guess the Readme test still stack-overflows on the github-runners, but it works locally, I guess the default stack size of those runners is configured lower).
I now question whether or not we actually want to allow passing the huge public keys on the stack at all. I mean the use case enabled by impl<const SIZE: usize> KeyStorage<SIZE> for [u8; SIZE]
.. Because:
zeroize
for our key types, it does not make sense that they are on the stack. Because that means there will be an unknown number of copies of the key material in the memory when it's dropped. If zeroize
should make any sense, the key material must have existed only in a single location, that can be zeroed out. Not that zeroing out the public key is very relevant I have to admit. But anyway.So in other words: The public keys probably never want to be on the stack, because it costs too much memory. And the secret keys should probably be kept separate as well, to make it possible to zeroize
them at some point?
I was thinking this as well, since most users do not actually want to have this on the stack ever. I'd propose the following:
Make the crate use alloc
/std
by default, and have the default keypair
just own boxed arrays.
Remove the owned stack array option.
Maybe remove the trait all together, and just make the borrowed version a separate type (maybe even feature-gate this, since it is also more of a fringe use-case).
Implement zeroize
for both secretkey structs.
Regarding fallible RNGs, I don't know if we want this or not, since it is the only thing requiring error handling atm, and I most callers will not use fallible RNGs or can just seed one from the fallible one.
I'm currently working on an experiment with this. I'm trying to represent the keys and key storage buffers in a different way. I hope to have something WIP to show today or in a few days.
Basically I don't think we need to represent all keys/buffers in the same way. Because they have different properties.
struct Ciphertext([u8; CIPHER_CIPHERTEXTBYTES]);
SecretKey
I'm fighting with kem
s SharedSecret and the zeroize
trait here.. It looks like the type is implemented so that Rust automatically copies it all over the stack. But they have implemented ZeroizeOnDrop
, indicating it should be cleanup up after itself. But from my testing my entire stack is full of secret key material even after dropping them.
My impl of the keys basically work now. The only thing left is this wart with zeroing the keys when using the kem interface. But maybe that interface is simply not compatible with clearing secrets from memory :shrug:
Seems like this is an inherent problem with Rust/Zeroize: https://docs.rs/zeroize/latest/zeroize/#stackheap-zeroing-notes
One would have to Pin
the stack memory to prevent moves, but this is not possible with the interface handing out sharedsecrets as GernericArray
s.
Exactly. To implement the KEM traits one has to pass around the values owned on the stack a lot. So yeah, we can ignore that part.
See #20 for my attempt at doing this. Or rather this, plus some other ideas plus zeroize
. I didn't mean to "compete" with this PR. I just wanted to show my solution so we could compare and discuss it.
I felt like I had to do it all in one go. To prove to myself, and you, that it would all fit together and actually work.
Superseded by #20
This is the first attempt for the RustCrypto KEM API.
A few growing pains:
generic_array
since const generics are not far enough yet, is a bit of a hassle with typenum.no_std
, but there should be an option to box the KeyPair/Ciphertexts.Any comments @faern @prokls ?