rust-lang / rfcs

RFCs for changes to Rust
https://rust-lang.github.io/rfcs/
Apache License 2.0
5.8k stars 1.55k forks source link

Figure out in-tree crypto story #766

Open steveklabnik opened 9 years ago

steveklabnik commented 9 years ago

Issue by brson Wednesday Jun 04, 2014 at 22:05 GMT

For earlier discussion, see https://github.com/rust-lang/rust/issues/14655

This issue was labelled with: A-libs in the Rust repository


We've previously made the decision not to distribute any crypto with Rust at all, but this is probably not tenable since crypto is used everywhere. My current opinion is that we should not distribute any crypto written in Rust, but that distributing bindings to well-regarded crypto is fine.

Figure out a strategy here, build consensus, then start implementing a robust crypto library out of tree, with the goal of merging into the main distribution someday. There are some existing efforts along these lines that should be evaluated for this purpose.

dgryski commented 9 years ago

PolarSSL will be relicensed from GPL to Apache, making it an alternative to OpenSSL. http://community.arm.com/groups/internet-of-things/blog/2015/02/09/polarssl-is-dead-long-live-mbed-tls

burdges commented 9 years ago

It's worth mentioning that LLVM cannot create branchless constant time code, so either one should link libraries with assembly imeplementations, or else pursue another compiler option, like https://github.com/klutzy/nadeko perhaps.

See https://moderncrypto.org/mail-archive/curves/2015/000463.html and https://moderncrypto.org/mail-archive/curves/2015/000466.html

burdges commented 9 years ago

There is a simpler question worth addressing too : Is cryptographic key material copyable? Apperently no : https://github.com/rust-lang/rust/pull/20631

We should zero out cryptographic keys when they're dropped, ensuring that LLVM does not optimize it away, as done in sodiumoxide via libsodium : https://github.com/dnaq/sodiumoxide/blob/master/src/newtype_macros.rs

It follows that types containig crpyographic keys must have a drop method that zeros them, and therefore cannot be copyable. It follows that keys will use move semantics, so.. Is it correct that a move will drop what is being moved from?

In practice, we'll commonly use pointers anyways since key material is at least 16 bytes long, but this seemed worth clarifying.

As an aside, we should call mlock and munlock on them too, although that might impact more than I realize. Appears https://github.com/seb-m/tars does eactly this, but implementing full allocators sounds heavy handed. All that's required is a wrapper that calls mlock and munlock like what libsodium does : https://github.com/jedisct1/libsodium-doc/blob/master/helpers/memory_management.md

burdges commented 7 years ago

We should maybe build just a framework for cryptographic crates rather than trying to pack all the cryptographic algorithms into one big crate, so just the basic traits, etc. for handling key material, digests, stream cyphers, block cyphers, boxes, etc. but not the actual algorithms.

Philipp91 commented 7 years ago

@burdges I'm currently exploring that idea, starting with hash functions because they're the simplest. I have implementations for these traits in the same crate, which act as a kind of adapter to the actual crypto libraries, which can be swapped similar to rust-native-tls. So it's not the crypto library that needs a dependency on the "framework" crate, as you call it, but the framework crate has a dependency on a single crypto library--which one depends on your cargo features. The end result is similar in that there is a common API, anyone who codes against it can use any implementation, and low-level implementations don't need to live in the same crate.

burdges commented 7 years ago

Isn't that slightly more complicated than making the algorithms depend on the framework crate?

Anyways, there are many crypto algorithm that would benefit from a common base that provides constant time operations. I donno if those should simply be added to core ala https://github.com/rust-lang/rfcs/issues/1814 or should be the start of a crypto crate, maybe starting with https://github.com/valarauca/consistenttime or something similar, like util* in https://github.com/DaGenix/rust-crypto/tree/master/src or https://github.com/isislovecruft/curve25519-dalek/blob/master/src/util.rs

valarauca commented 7 years ago

I've made a write up of nadeko's issues. It isn't constant time, and really does not solve the core issues with being constant time. There is a very high level of AST manipulation and understanding of intent that goes into making something constant time. I truely believe we cannot just have it opt in.

I like the idea of a crypto frame work.

Exposing a handful of functions:

This will make crypt-library authors jobs a lot easier.

Philipp91 commented 7 years ago

Let's keep two things separate. The first is "under" the implementations of crypto primitives and the second is "on top" of them:

burdges commented 7 years ago

I said "framework" because:

There are a bunch of useful traits in https://github.com/DaGenix/rust-crypto/tree/master/src for things like hash digests, stream ciphers, encryption and decryption with stream cipher, ctr modes, etc. In principle, one could build upon those, except some design decisions there seem questionably disjoint from std and need either improving or explaining.

Why no in-place operation of stream ciphers? I'd imagine that's just an omission. Should stream ciphers act as iterators though? I donno. If so, should appropriate stream ciphers like ChaCha20 use general purpose iterator machinery for seeking and/or translation from an iterator that operates in blocks. Should digests input stuff that's Hash? etc.

Answering these questions publicly, and/or adjusting traits to do these things, seems important, as otherwise every hash function or cipher developer need to think about these question themselves. It's true constant-time operations are a step below this traits "framework" stuff, but they are similar in that anyone writing crypto primitives needs to think about them if they are not being done right elsewhere.

Philipp91 commented 7 years ago

I see your point of saving crypto library implementors from reinventing the wheel all the time. That's a major benefit of such a "framework". Another benefit is that it's one single API for crypto users, similar to JCA in Java. And, hopefully, it's a pretty good API because a lot of work went into designing it by clarifying questions like the one you raised.

While a library implementor could use both the outward-facing traits and the low-level helpers, the crypto user should better not be able to access the low-level stuff directly, so it should at least be in a separate module. And I argue it should be in a separate crate because not all crypto libraries really need it. There are quite many which are actually just wrappers of some C library (OpenSSL, BoringSSL, libsodium), so they wouldn't need the constant-time operations in Rust. Instead, they are based on other (common) low-level things like libc, some FFI generator or something like that. And they work under the same high-level traits as the crypto APIs implemented in Rust itself.

And I agree that it's good to answer these questions publicly. Maybe, the API/traits and the lower-level functions are two separate discussions, though, because there are different stakeholders. And probably neither of the discussions belongs into this thread, but I would have many details to discuss already :-) including iterators in cipher APIs.

You name rust-crypto as a base that could be adapted. Have you also looked at others? This topic has been discussed in the #rust-crypto IRC a few weeks back and it seemed to me that ring's API is more well-regarded.

burdges commented 7 years ago

A few constant time building blocks sound small compared to say pulling in an extra hash function, much less AES and RSA, like every existing crypto library does.

Aren't those Algorithm structs in ring too confining for a general purpose crypto library? Those fit well both with ring's history in BoringSSL of course, and more generally for providing TLS, but a more rust-like API would probably be Algorithm::new(..) and traits, like rust-crypto.

I'd say leave any "stakeholders" who want uber-safety to using ring and sodiumoxide. Those projects make a range of safety decisions, including evaluating if the Rust code is mature enough to replace solid C code. Rust is not lacking in foot gun free crypto libraries.

Rust is weaker on flexible building blocks though:

We want to simplify building a separate crate to provide a crypto primitive, without needing to dig too deeply into rustc or LLVM. Right now, people send pull requests to rust-crypto, but that project should not be responsible for everyone's pet cipher or hash function. I suppose just breaking rust-crypto in a crypto-current and crypto-legacy crates would require exposing enough building blocks that new ciphers and hash functions need not send pull requests.

We want to simplify using such crates in protocols, including unconventional protocols involving stuff like blind signatures, Axolotl, elligator, semi-private keys, mixnets, ECQV, post-quantum, etc. We should not worry about public-key algorithms right now, as stand alone crates do those well enough, but the symmetric crypto should not stand in the way when people when do unusual asymmetric stuff.

Applications are increasingly needing these unconventional usages. If you insist on avoiding all foot guns, then either important new things get built in languages like Go, or else everyone maintains their own private fork of the crypto library. And I doubt you'll do as good a job of being foot gun free as ring anyways.

In this vein, curve25519-dalek is the right approach to doing a public key library. All the mathematical submodules are well documented and exported. Fixed-size arrays are used everywhere with the help of the arrayref crate, so no slices. etc.

tarcieri commented 7 years ago

@Philipp91

the framework crate has a dependency on a single crypto library--which one depends on your cargo features. The end result is similar in that there is a common API, anyone who codes against it can use any implementation, and low-level implementations don't need to live in the same crate.

This approach makes less sense to me than having a single set of traits multiple crypto libraries can target. It would require that libraries which want to buy into a common crypto core be rewritten to use those traits, but to me, that's the "right answer", as opposed to trying to make a one-size-fits-all wrapper to the divergent APIs present in existing crypto libraries.

As @burdges noted, there are already some good traits which could be extracted for this purpose in rust-crypto.

As a former JCA user, it was, like many Java things, an extremely overcomplicated and bad abstraction (the entire Java crypto ecosystem is the stuff of nightmares). It had somewhat noble goals, but from a practical perspective was a giant pain to work with.

I would prefer an approach that leans on Rust's type system more, with simple abstractions built on common traits, and no need to lean on cargo features: just include the crates with the algorithms you want, all built on a common set of traits, and everything just works. I think this approach could potentially still enable things like "ask for a named instance of a cipher at runtime"

That way, crappy algorithms people shouldn't be using can be spun out into separate crates, complete with appropriate "caveat emptor" warnings for their users.

tarcieri commented 7 years ago

@burdges

As an aside, we should call mlock and munlock on them too, although that might impact more than I realize. Appears [tars] does eactly this

There are now a litany of incompatible Rust libraries for memory protected secret storage:

If nothing else, it seems like something else that could benefit from a common trait.

but implementing full allocators sounds heavy handed. All that's required is a wrapper that calls mlock and munlock like what libsodium does : https://github.com/jedisct1/libsodium-doc/blob/master/helpers/memory_management.md

libsodium does more than that: it adds guard pages at the beginning and end of the region using mprotect() to protect against memory scraping or Heartbleed-style attacks. Ideally for something like this you'd have a sort of arena you allocate once for n-keys so you can store many in the same buffer.

Another popular approach is to use a master key-encrypting-key (KEK), storing all secrets encrypted in memory, and using the KEK to decrypt them on demand for a particular operation.

All that said, this really looks to me like another case that can benefit from a common trait, and depending on your paranoia level you can pick the exact strategy/crate you want to safeguard your keys/other secrets.

briansmith commented 7 years ago

Aren't those Algorithm structs in ring too confining for a general purpose crypto library? Those fit well both with ring's history in BoringSSL of course, and more generally for providing TLS, but a more rust-like API would probably be Algorithm::new(..) and traits, like rust-crypto.

FYI, the design of ring's API has pretty much nothing to do with BoringSSL's. ring avoids modeling everything using traits because we do things that Rust's traits don't support (yet). Surprisingly, many of the unusual things we do in ring actually have reasonable explanations that are readily available to anybody who asks the ring developers about them.

Philipp91 commented 7 years ago

@tarcieri

First, I agree that a common set of traits would be the nicer solution. It comes with many advantages. For example, a higher-level library like one of the many JWT libraries (https://github.com/mikkyang/rust-jwt/blob/96fd54b36ad631352e46a33565fd20dd4e26b6e3/src/crypt.rs) could only depend on "something that computes a HMAC for me", without choosing a crypto implementation for its user (possibly messing up their build).

But, like @briansmith, I have found some things you can't do with traits, yet. Especially when I compare it to the cargo-feature-switching solution, there are some downsides. But I see these as a technical challenges, and maybe we can solve them (now already) to build such a set of common traits.

So to be more concrete: Traits can't be the return type of a function, yet. As an example, consider ring's Digest type (https://briansmith.org/rustdoc/ring/digest/struct.Digest.html), which is returned from hash functions. If we had a hashing trait with some function fn hash(...) -> Digest, then Digest would need to be a struct that is also defined in the common library. Similar problems would arise with functions that generate private/public keys, and in some other places. Please correct me if I'm wrong or if you see a good solution here. Maybe wait for abstract return types (#34511)?

Another issue: I would like a global function (again I'm using hashing as the example) that just does the simplest job without even letting the user choose an algorithm, e.g., crypto::library::hash::hash("input data"). This would probably use some SHA algorithm under the hood. Every crypto library would individually have to implement this function and there is no way to impose its signature with a trait. Hence, it is also not possible to write helper functions around such global functions (e.g. one that accepts a file path and hashes the file's contents). This problems seems less important, I can't think of other examples. Plus, the default algorithm could just be aliased Default.

Related issue: It would be great if the implementations were easily exchangeable for basic use cases. (If the common library could reference the implementation, this would be done with a build flag.) I imagine this could be achieved if all crypto libraries followed a similar structure. A user who wants extreme exchangeability (not really necessary, but it shows the point) could use extern crate openssl as crypto and replace it with extern crate ring as crypto, and a few things like crypto::hash::SHA256::hash(&my_vector).to_hex() would still work. This could be achieved by discipline (and quite a lot of restructuring of the existing libraries), but it cannot be enforced with traits, as far as I can tell. So there would need to be some structure guidelines for library developers.

(@briansmith Why does ring call it digest instead of hash?)

tarcieri commented 7 years ago

Traits can't be the return type of a function, yet. As an example, consider ring's Digest type (https://briansmith.org/rustdoc/ring/digest/struct.Digest.html), which is returned from hash functions. If we had a hashing trait with some function fn hash(...) -> Digest, then Digest would need to be a struct that is also defined in the common library.

This is a particularly interesting case: I like the ergonomics of how ring's Digest type works as an end user, but under the hood it's bound to a max length on the digest size. That's fine for most hash functions, but I'm not sure it works for all keyed hash functions that act more like a KDF and have variable-length outputs, as it internally relies on a fixed-size byte array. Perhaps @briansmith has thoughts here and per his comment I would like to know more about ring design decisions.

Having a common Digest struct in the standard library seems like a good thing to me, provided it's a one-size-fits-all solution. But perhaps the rust-crypto "provide a byte slice into which to write the output" fits that bill better.

Why does ring call it digest instead of hash?

IMO "digest" as the output of a hash function is the correct term.

briansmith commented 7 years ago

(@briansmith Why does ring call it digest instead of hash?)

This was done mostly to distinguish it from std::hash, hash map/set stuff, and other things that don't need a cryptographic hash.

I like the ergonomics of how ring's Digest type works as an end user, but under the hood it's bound to a max length on the digest size. That's fine for most hash functions, but I'm not sure it works for all keyed hash functions that act more like a KDF and have variable-length outputs, as it internally relies on a fixed-size byte array.

A KDF's output isn't a digest of any message, so it doesn't make sense to use Digest for its output. In ring we currently use &[u8] as the output type of KDF operations, but that's likely to change soon.

tarcieri commented 7 years ago

@briansmith makes sense. BLAKE2b was probably my specific concern, but its MAX_OUTPUT_LEN is 512-bytes, which is equal to what ring provides. So... carry on then.

Philipp91 commented 7 years ago

It's still a good point, because hashing is only the example here.

provide a byte slice into which to write the output

I also like the Digest ergonomics, and an output: &mut [u8] would be terribly unergonomic, in my opinion. You have to go find out the right size to allocate (first search the documentation, then probably call some method because it's dynamic), allocate a vector, retrieve data and then sometimes even truncate the result to some other length you got from elsewhere. If the result is actually the return value of another one, it can even be used inline or as a function pointer in a functional programming style.

And there are other return values that are definitely not fixed size, like ciphertexts. But if there is no other way, we could have something like a fn result_into(&mut self, out: &mut [u8]) that libraries implement and that performance-sensitive users use, but the regular result function would be fn result(&mut self) -> Vec<u8>, i.e., the one that does the work for you and calls result_into under the hood.

Currently, I still have the solution with the associated Digest type on the trait that every implementation needs to specify. But because I want it to be PartialEq, Index, and all these things, this probably means that every crypto library would have to copy over the implementations of those. I don't see a way to provide it generically in the "API crate". https://github.com/Philipp91/rust-crypto-api/tree/c5662d5e05f8586dcfea4839ff2403601179d715/src/hash

IMO "digest" as the output of a hash function is the correct term.

I agree, it's the right term for the result. But the module is also named ring::digest, yet inside it you won't find digests, and you won't find "digest functions", even though some cryptographers seem to use that term? The average user is looking for a (cryptographic) "hash function" and expects to find it in the hash module.

briansmith commented 7 years ago

I agree, it's the right term for the result. But the module is also named ring::digest, yet inside it you won't find digests, and you won't find "digest functions", even though some cryptographers seem to use that term?

In ring::signature, the result type is Signature, and in ring::digest the result type is Digest.

The average user is looking for a (cryptographic) "hash function" and expects to find it in the hash module.

OpenSSL seems to also prefer the term "digest," e.g. <openssl/digest.h>. See also java.security.MessageDigest. See also PKCS#11, e.g. C_Digest.

I also agree that &[u8] and &,mut [u8] are often not the best way to pass data around. I would prefer to use io::{Read, Write}, but they don't work in #![no_std] mode, so I can't, since that's a requirement for (almost all of) ring.

Regarding whether digests should implement PartialEq, Index, etc., I'm not planning to do that in ring. At least for now, ring::digest::Digest does implement AsRef<[u8]> for people that want those features. I'm hoping we'll be able to get rid of that in the future though I don't have a plan for doing so.

Philipp91 commented 7 years ago

#![no_std]

I'm not familiar with no_std at all. As far as I can tell, in a no_std environment, you don't have std::vec::Vec, correct? And no heap? So there is no way you return a ciphertext from a function other than through a &mut [u8], correct? Maybe there's a way that I don't see right now.

This is all well for ring and other libraries, because they have to work in these environments. Among the goals of common API traits should be usability and misuse resistance, though. And that doeesn't go very far without vectors, std::io, and friends. So unless there is a pretty high number of no_std crypto users, it's probably not worth supporting this use case in the simple API. After all, these common traits are designed to be std-like and maybe even move into std eventually. So a no_std user would just have to use the regular ring API directly, and they know their way around low-level APIs anyway.

The current plan would make ring and other crypto libraries depend on this API crate, and thus transitively depend on std. So either the API crate needs a cargo feature to en/disable no_std, or the crypto libraries need one to turn off the dependency on the API crate. (Or the API crate depends on the crypto libraries instead of the other way round...).

I would prefer to use io::{Read, Write}, but they don't work in #![no_std] mode

If there is a solution that allows ring to compile in #![no_std] mode and allows it to be used through a simpler API (trait), then ring wouldn't even need to use std::io itself, but the higher-level API could take care of that and feed the data into ring's lower-level functions.

Regarding whether digests should implement PartialEq, Index, etc., I'm not planning to do that in ring

sodiumoxide/rust_sodium seem to go the other way round (http://docs.maidsafe.net/rust_sodium/master/rust_sodium/crypto/hash/sha256/struct.Digest.html) and apparently implement everything they possibly can (even core::hash::Hash, so you can sip-hash your cryptographic hash digest?).

What is your motivation not to have at least the more useful trait implementations in ring?

PartialEq could be pretty important: Vulnerabilities to timing attacks have happened in the past because users compared hash values "normally", which is why rust_sodium implements PartialEq with a constant-time comparison: http://docs.maidsafe.net/rust_sodium/master/src/rust_sodium/src/newtype_macros.rs.html#32-38. For misuse resistance, offering a constant-time comparison function somewhere is not enough, it also needs to be the "default" thing that happens, i.e. it needs to be obvious that you have to use it and how you can use it. And PartialEq is just the easiest to use through the == operator.

briansmith commented 7 years ago

What is your motivation not to have at least the more useful trait implementations in ring?

PartialEq could be pretty important: Vulnerabilities to timing attacks have happened in the past because users compared hash values "normally", which is why rust_sodium implements PartialEq with a constant-time comparison

My goal in ring is to make it unnecessary for ring users to compare digests for equality and then make it impossible for them to do so. Thus all timing attacks will be averted.

Philipp91 commented 7 years ago

Wouldn't that imply that your Digest doesn't let the user access the underlying hash value at all? So no way to get a [u8], Vec<u8> or hexadecimal String? Those are currently by far the most common use cases -- though they're probably influenced by what the crypto APIs currently offer.

briansmith commented 7 years ago

Wouldn't that imply that your Digest doesn't let the user access the underlying hash value at all?

You are probably right that the "make it impossible for them to do so" part of my goal is unreasonable because the entire point of the ring::digest API is to compute the final digest in a way that can be extracted. However, I hope that for any case where one would want to compare digest output for equality, we can instead offer a better API that avoids even exposing the digest at all. For example, the ring HMAC API, the ring signature verification API, and the ring AEAD opening API all avoid exposing the value of the computed digest (or authentication tag) at all; i.e. they succeed in making it impossible for the user to screw up because the user can't do the comparison themselves. (Currently, one can abuse the signing/sealing APIs to do the dumb thing, but we're not far from fixing that, I think.)

Philipp91 commented 7 years ago

offer a better API

... and advertise it / make it the default API / make it the easiest to use

the ring HMAC API, the ring signature verification API, and the ring AEAD opening API

I begin to think that hashing might not be the best example to use all the time. It's just the one that I analysed first because it's the simplest kind of API. But there are many cases where people use hashes for non-crypto purposes and hence they use them in ways that would be unsound for crypto purposes. HMAC/AEAD/... are probably only used for actual security-critical purposes, so I'm going to look at those next.

In the HMAC case: doesn't ring::hmac::sign(...).as_ref() expose the digest? Though I see that, if it didn't, everything would still work because you have and advertise the verify functions.

Could there be any cases where the HMAC digest needs to be sent over the wire, saved on disk or stored in a database? My early-December analysis turned up 4 ring::hmac users on GitHub, and 3 of them seem to (think that they) need a serialized HMAC digest: https://github.com/Keats/rust-jwt/blob/master/src/lib.rs#L139 https://github.com/badboy/nobsign/blob/master/src/lib.rs#L120 https://github.com/rusoto/rusoto/blob/master/src/signature.rs#L226

Maybe this discussion should move somewhere where crypto API design is discussed -- and this thread should be about whether to include any crypto (API, implementation, traits, ...) into the standard library or rust-lang-nursery.

briansmith commented 7 years ago

In the HMAC case: doesn't ring::hmac::sign(...).as_ref() expose the digest? Though I see that, if it didn't, everything would still work because you have and advertise the verify functions.

Right. We need a way to output an HMAC value so that it can be serialized, and because HMAC uses the same key for verification and signing, one can (ab)use SigningKey to do the traditional "sign-and-compare" stuff. That's the abuse of the signing API I mentioned in my earlier comment. I hope that we can improve the HMAC signing API to make that kind of misuse impossible or at least more obviously wrong. I'd like to try to do that before I implement any workarounds for the possibility of that misuse.

Maybe this discussion should move somewhere where crypto API design is discussed -- and this thread should be about whether to include any crypto (API, implementation, traits, ...) into the standard library or rust-lang-nursery.

I agree that that this discussion is probably getting ahead of the actual issue that is supposed to be discussed here. Regarding what's actually supposed to be discussed here, I think there is so much great work being done now in a variety of crates that we should just shelve this issue, let people continue to do that great work throughout 2017, and revisit the issue when it's time to build the 2018 roadmap.

burdges commented 7 years ago

A trait's associated type cannot by used to determine the return type of this function @Philipp91 ?

Philipp91 commented 7 years ago

I'm not exactly sure what you mean by "this function", but it certainly can be used like this: https://github.com/Philipp91/rust-crypto-api/blob/c5662d5e05f8586dcfea4839ff2403601179d715/src/hash/mod.rs#L25

burdges commented 7 years ago

Yup. I just miss understood something you wrote up thread then.