briansmith / ring

Safe, fast, small crypto using Rust
Other
3.7k stars 698 forks source link

Add RSA key generation #219

Open briansmith opened 8 years ago

briansmith commented 8 years ago

This should build on #208. We should consider just fixing the public exponent at 65537, since that is safe and fast (for verifiers). We should also consider limiting the public modulus size to 2048-8192 bits.

Instead of making this so that it returns an RSAKeyPair and adding a private_key_bytes() method for serializing it, I suggest that we just make a function that returns the bytes of an already-serialized RSA private key.

There is code implementing the math in the function RSA_generate in crypto/rsa/rsa_impl.c. I think NIST may have test vectors that we can use for verifying the implementation.

briansmith commented 8 years ago

Besides returning the serialized private key in RSAPrivateKey form, we also need it to return the public key serialized RSAPublicKey form, so that we can put the public key in a CSR, certificate, or other certificate-like thing.

briansmith commented 8 years ago

See https://github.com/briansmith/ring/issues/224 regarding the serialization format.

briansmith commented 8 years ago

We should check out https://github.com/zcdziura/pumpkin and see if we can use it.

I would like to use elliptic curve primality proving, or equivalent, as the final step of the prime generation.

djc commented 8 years ago

It requires rustc nightly, which should be a no-no for ring.

briansmith commented 8 years ago

See https://github.com/briansmith/ring/issues/145.

briansmith commented 8 years ago

It requires rustc nightly, which should be a no-no for ring.

I believe it only requires Rust Nightly because it uses Ramp. We can replace the use of Ramp with something else.

briansmith commented 8 years ago

https://www.usenix.org/system/files/conference/usenixsecurity16/sec16_paper_svenda.pdf

briansmith commented 7 years ago

See https://github.com/golang/go/commit/37d078ede386b5d0dff0bb1f3ea77e47122730d0 and the related GitHub discussion.

briansmith commented 7 years ago

More thoughts:

briansmith commented 7 years ago

We should investigate whether we should do stricter small-prime-difference and/or other similar checks during generation than what we do for loading a private key. See http://deweger.xs4all.nl/papers/[33]dW-SmlPrDif-AAECC[2002].pdf and similar, which recommend stronger checks. (Maybe we should do them for loading too.)

briansmith commented 7 years ago

See the notes in the duplicate #145 about ways we may want to improve upon what BoringSSL/OpenSSL do.

jprider63 commented 6 years ago

What's the status of this? I see mention of a C implementation, but I don't see it in the repository. I noticed there is some Rust code in src/rsa.

A while ago, I worked on a Rust implementation of this if it's useful (not production ready yet though).

light4 commented 6 years ago

ping

briansmith commented 6 years ago

In the absence of a better design (which I totally would love to see), it is good to follow David Benjamin's new RSA keygen algorithm in BoringSSL, where the highest-level logic is in Rust. It is OK to use any of the code that is ISC-licensed in https://boringssl.googlesource.com/boringssl/+/bb3a456930f731563c7dfbd08656d1dff767e6d0 (in addition to what's already in ring).

jprider63 commented 6 years ago

Are there any reasons not to use pure Rust for key generation? Speed? I don't think timing channels would be an issue. I'd prefer a design where a Rust multi precision library is used for the integer operations.

briansmith commented 6 years ago

Are there any reasons not to use pure Rust for key generation? Speed? I don't think timing channels would be an issue. I'd prefer a design where a Rust multi precision library is used for the integer operations.

Five main reasons:

  1. All the low-level math in ring is done in assembly language code and C code, and this is pretty fundamental to ring 0.x. (ring 2.x might change this.)
  2. Speed is a major issue for RSA key generation, because RSA key generation is inherently very slow.
  3. Timing side channels are actually a real concern for some scenerios. BoringSSL just rewrote all their RSA key generation to be constant-time(-ish), for example. In at least one product that's embedding ring, timing side channels for key generation are a real issue. Note that all the other key generation in ring is already constant-time(-ish).
  4. We partially rely on the split of Rust vs. C/asm in ring to guarantee some side channel resistance of the code. None of the constant-time building blocks in ring is in Rust right now, although many constant-time(-ish) algorithms have the higher levels written in Rust.
  5. No doubt we could replace some of that C code with Rust code and make it work well, but I fear that the development time would take a long time and people might get burned out during the code review.
jprider63 commented 6 years ago

Five main reasons:

All the low-level math in ring is done in assembly language code and C code, and this is pretty fundamental to ring 0.x. (ring 2.x might change this.) Speed is a major issue for RSA key generation, because RSA key generation is inherently very slow. Do you know the difference in speed of Rust vs C/asm for key generation? Personally, I think the improvement in memory safety is worth a bit of slowdown. Timing side channels are actually a real concern for some scenerios. BoringSSL just rewrote all their RSA key generation to be constant-time(-ish), for example. In at least one product that's embedding ring, timing side channels for key generation are a real issue. Note that all the other key generation in ring is already constant-time(-ish). Could you describe how timing side channels for RSA key generation are a concern? To generate a key, you generate a large random number and test if it’s prime. If it’s not prime, repeat. How can an attacker learn what the key is, especially if the attacker doesn’t know how many loops have been performed? We partially rely on the split of Rust vs. C/asm in ring to guarantee some side channel resistance of the code. None of the constant-time building blocks in ring is in Rust right now, although many constant-time(-ish) algorithms have the higher levels written in Rust. No doubt we could replace some of that C code with Rust code and make it work well, but I fear that the development time would take a long time and people might get burned out during the code review. Tooling could make the code review simpler. In Haskell, there’s a library called inspection-testing https://github.com/nomeata/inspection-testing that allows the user to write tests against intermediate/generated code. I know it has been used to ensure the compiler performs optimizations as expected. Something similar could be used to make sure timing channels aren’t introduced by the compiler.

briansmith commented 6 years ago

How can an attacker learn what the key is, especially if the attacker doesn’t know how many loops have been performed?

This is not the way we reason in ring. In ring we assume all all timing attacks on secrets can be exploited unless/until proven otherwise. So, somebody would need to provide a proof that timing attacks on variable-time RSA key generation is not exploitable for us to use a variable-time implementation. (If BoringSSL hadn't already shown us how to do a constant-time one, we might have punted constant-timedness down the road, but since they did, we might as well learn from them.)

est31 commented 5 years ago

I'd love to have RSA key generation and I'd make a PR but I wonder how the API should look like. Should it be a new top level module of ring? How should it be called? Or does it fit into another module?

est31 commented 5 years ago

@briansmith any pointers?

briansmith commented 5 years ago

I'd love to have RSA key generation and I'd make a PR but I wonder how the API should look like. Should it be a new top level module of ring? How should it be called? Or does it fit into another module?

It should look something like ring::ec::suite_b::ecdsa::signing::Key::generate_pkcs8. pkcs8::Document probably needs to be modified so that for RSA keys, it stores the serialized form in a Box<[u8]>. It should live in submodule ring::rsa::keygen and be exposed publicly through ring::signature.

In general, the API stuff will not be hard to work out so I suggest you just get it working first. For RSA keygen, the much harder part is doing all the math. The main question this: Which algorithm are we going to use? And then immediately, what's the outline of a plan for implementing that non-trivial algorithm? Is it possible to use only the primitive math operations we have now (modular addition, subtraction, shifts, Montgomery multiplication, exponentiation via Montgomery multiplication), or do we need to implement add primitives? If so, which ones? (Note that we removed most of the primitives from BoringSSL that are used only for RSA key generation. If we need to add them back, we need to know which ones ahead of time so we can figure out what code from BoringSSL is OK to borrow.)

For background, please read https://eprint.iacr.org/2018/749 and https://tools.ietf.org/html/draft-mavrogiannopoulos-pkcs8-validated-parameters-00 and about Elliptic Curve Primality Proving. Also read FIPS 186-4 Appendix B.3 (https://nvlpubs.nist.gov/nistpubs/fips/nist.fips.186-4.pdf).

est31 commented 5 years ago

Wow, that's quite a project. I just thought about taking the code from BoringSSL that you mentioned above. But I'll have a look at reading the material.

briansmith commented 5 years ago

Wow, that's quite a project. I just thought about taking the code from BoringSSL that you mentioned above. But I'll have a look at reading the material.

If you are able to import the (ISC-licensed) BoringSSL code and then get it working without adding lots of OpenSSL-licensed C code back to ring then we can definitely explore that option as a first step.

est31 commented 5 years ago

Okay, I've had a look. Haven't read everything A-Z but enough to know how to write the code.

The paper compares several heuristic primality tests and libraries that use those tests, and looks how well they fare under adversarial conditions. It recommends the use of the Baillie-PSW primality testing algorithm, Which is basically Miller-Rabin plus Lucas. The IETF RFC draft talks about storing provable seeds to how the program came up with the key together with the key. FIPS 186-4 Appendix B.3 includes such a method to generate probable primes, with Appendix C.3 describing both Miller-Rabin tests and Lucas tests.

I think that for now, we don't need to store those additional info that the IETF RFC talks about. Also, I don't think we need to use the actual Baillie-PSW primality testing algorithm, as the paper looks at the adversarial case, where the primes are not randomly generated but provided by adversaries. So I'd say we just adopt Miller-Rabin like the FIPS document writes about it and in general implement it that way. Then later on, Lucas primality testing can always be added as well as storing of the seed.

As for the number of Miller-Rabin rounds, the FIPS document is giving orientation in a table "Table C.2. Minimum number of rounds of M-R testing when generating primes for use in RSA Digital Signatures".

What do you think?

briansmith commented 5 years ago

As for the number of Miller-Rabin rounds, the FIPS document is giving orientation in a table "Table C.2. Minimum number of rounds of M-R testing when generating primes for use in RSA Digital Signatures".

As long as we can do all the Miller-Rabin stuff in Rust code (and I think we can), this seems OK to me.

Usually M-R is preceded by a check that the candidate being tested isn't divisible by some small primes. Also FIPS 186-4 requires that e and the candidate are relatively prime. (Since we only need to generate keys for e = 65537 and 65537 is prime, that means we just need to check that the candidate isn't divisible by 65537.) I think both of these are mostly covered by the ISC-licensed C code in BoringSSL linked above. In that case, those two checks would have to be wrappers around C code.

I suggest we start with the (pure Rust) pure M-R tests first, and then improve them in subsequent commits with the divisibility by small primes (including e = 65537) pre-checks.

jprider63 commented 5 years ago

If it's helpful, I've implemented Miller-Rabin in Rust here. It uses a different representation of bigints, but it might be useful as a starting point. It also checks for divisibility by small primes and runs the Fermat primality test.

est31 commented 5 years ago

@briansmith so it seems that the enhanced miller-rabin test that the C code you linked to implements, uses the gcd algorithm. This gcd algorithm then works in terms of boringssl's. BIGNUM. I don't think I want to reimplement it in Rust, as it also uses division and it's a LOT of code in general.

Do you know how to convert from it's ring analogon (I suppose bigint::Elem<MM> ?) to BIGNUM and back?

est31 commented 5 years ago

Or alternatively, we could just use the non-enhanced miller-rabin test. Not sure what the advantages/disadvantages of the two are.

est31 commented 5 years ago

So more precisely, I want to do something like this, but have no idea how the conversion should work:

/// Returns the greatest common divisor of a and b
fn gcd<MM>(a: &bigint::Elem<MM>, b: &bigint::Elem<MM>) -> bigint::Elem<MM> {
    extern "C" {
        fn BN_gcd(...);
    }
    unsafe {
        Bn_gcd(...)
    }
}
briansmith commented 5 years ago

Definitely do not rewrite bn_gcd_consttime() or anything it calls in Rust. You can, however, change those functions to make them easier to use in ring. You'll have to do so, since ring doesn't have BIGNUM any more.

If you are going to use the BoringSSL "_extra.c" code to start with, please make the first commit in your PR be the unmodified* versions of those files.

The prototype of BN_gcd is:

int BN_gcd(BIGNUM *r, const BIGNUM *x, const BIGNUM *y, BN_CTX *ctx);

Change it to:

int BN_gcd(BN_ULONG r[], const BN_ULONG x[], const BN_ULONG y[], size_t num_limbs);

This is relatively straightforward to do; look at the code for bn_gcd_consttime and notice whenever it is doing ${some_variable}->d, d is a BN_ULONG[]. Also note that the purpose of the bn_resize_words stuff is to ensure that all the inputs are the same size; you can just assume this is the case in ring, as this is an invariant of ring::rsa::bigint.

Note also that BN_enhanced_miller_rabin_primality_test is not used during prime generation in BoringSSL; it is only used as an additional "FIPS-only" check on the final result.

briansmith commented 5 years ago

I see that bn_gcd_consttime uses BN_CTX too:

  BN_CTX_start(ctx);
  BIGNUM *u = BN_CTX_get(ctx);
  BIGNUM *v = BN_CTX_get(ctx);
  BIGNUM *tmp = BN_CTX_get(ctx);

This means that u, v, and tmp are three temporary variables. Since ring doesn't have BN_CTX, you'll need to pass three extra arrays to these functions: BN_ULONG u[], BN_ULONG v[], BN_ULONG tmp, constructed using Modulus::zero().

est31 commented 5 years ago

Oh, then I'll start with the non-enhanced variant first.

est31 commented 5 years ago

@briansmith I've opened #731 for further discussion, allowing me to show my code.

briansmith commented 5 years ago

@est31 You asked how to do subtraction in Rust. If I understand correctly you need non-modular subtraction. You found limb_sub already. If you really need it, you can expose this by writing a function LIMBS_sub in limb.c in the same style as the other functions in that file. Then expose it to rust as an exstern "C" function.

However, why do you need to do a subtraction?

Note that sometimes we can just use the modular subtraction (a - b) mod m as a regular subtraction a - b when we know a > b.

est31 commented 5 years ago

However, why do you need to do a subtraction?

https://en.wikipedia.org/wiki/Miller%E2%80%93Rabin_primality_test#Computational_complexity

"write n − 1 as 2r·d with d odd by factoring powers of 2 from n − 1"

I need to compute w-1 in order to do this.

est31 commented 5 years ago

New PR: #733

briansmith commented 5 years ago

"write n − 1 as 2r·d with d odd by factoring powers of 2 from n − 1"

n is odd so n - 1 is the same as n with its lowest-order bit masked off, right?

est31 commented 5 years ago

@briansmith oh right.

est31 commented 5 years ago

Since we only need to generate keys for e = 65537 and 65537 is prime, that means we just need to check that the candidate isn't divisible by 65537.

@briansmith do you know how to do this? I can't use bn_mod_u16_consttime because 65537 is the maximum value for u16 + 2 :).

briansmith commented 5 years ago

@briansmith do you know how to do this? I can't use bn_mod_u16_consttime because 65537 is the maximum value for u16 + 2 :).

Off the top of my head, I don't. Let's get it working without that check in the meantime.