ml-kem: seed support for `DecapsulationKey`

tarcieri commented 3 weeks ago

Seeds provide a shorter secret that is always valid, whereas an expanded decapsulation key has to be validated on import.

Some have suggested seeds should be the only API for instantiating an ML-KEM decapsulator: https://words.filippo.io/dispatches/ml-kem-seeds/

tarcieri commented 3 weeks ago

As a data point for comparison, the libcrux-ml-kem API is feature-gating the unpacked API: https://github.com/cryspen/libcrux/pull/522

str4d commented 3 weeks ago

The LAMPS WG at IETF seems to be heading towards seeds-only. See https://mailarchive.ietf.org/arch/msg/spasm/OxnYtr1mIzB3GejYswduSfkEIA4/ and earlier emails in the thread (later emails go off topic into RSA-land).

tarcieri commented 2 weeks ago

BoringSSL has moved to using seeds and only seeds: https://boringssl-review.googlesource.com/c/boringssl/+/70407

supinie commented 6 days ago

Is anyone actively working on this at the moment (@bifurcation)? If not, I would be happy to do so.

bifurcation commented 6 days ago

I have not picked this up. @supinie if you want to take a stab at a PR, I would be happy to review.

tarcieri commented 6 days ago

One problem is I haven't managed to find any test vectors, though a few people have claimed they are interested in working on them soon.

I've also been curious if a KDF could be leveraged to provide shorter, more secure seeds: https://groups.google.com/a/list.nist.gov/g/pqc-forum/c/1r6FnG0coiM/m/I9_Jn5lJDQAJ

bifurcation commented 6 days ago

Actually, the NIST key generation test vectors are already framed in terms of (d, z) seeds.

So this PR might be as simple as making `DecapsulationKey::generate_deterministic` public and not feature-conditioned. Or maybe having a public wrapper that has a 64-byte input and splits it into the two 32-byte values.
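A minimal sketch of the wrapper idea, for illustration only. The function name `split_seed` is hypothetical, and the `d`-then-`z` byte order is an assumption; the halves would then be handed to the existing deterministic key generation.

```rust
/// Splits a 64-byte seed into the two 32-byte key-generation values.
/// Assumption: the seed is laid out as `d || z` (d first, then z).
/// `split_seed` is a hypothetical helper, not part of the crate's API.
pub fn split_seed(seed: &[u8; 64]) -> ([u8; 32], [u8; 32]) {
    let (d_bytes, z_bytes) = seed.split_at(32);

    let mut d = [0u8; 32];
    let mut z = [0u8; 32];
    d.copy_from_slice(d_bytes);
    z.copy_from_slice(z_bytes);

    // A public constructor would pass `d` and `z` on to the
    // deterministic key generation mentioned above.
    (d, z)
}
```

Taking `&[u8; 64]` rather than `&[u8]` means a wrong-length seed is rejected at compile time, so the wrapper itself has no failure case.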

supinie commented 6 days ago

> Or maybe having a public wrapper that has a 64-byte input and splits it into the two 32-byte values.

This is what I had in mind. Have we decided whether this will replace the current API or be added alongside it? Alternatively, we could have a feature flag to toggle whether the public API accepts seeds or keys.

tarcieri commented 6 days ago

@supinie it should absolutely replace the existing API.

I guess the remaining question is the specific seed format, although multiple could be supported with the specific one inferred from length.
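To illustrate what inferring the format from length could look like, here is a rough sketch. The enum, the function name, and the choice of which encodings to recognise are all assumptions; it only distinguishes a 64-byte `(d, z)` seed from the 2400-byte expanded ML-KEM-768 decapsulation key defined in FIPS 203. A shorter KDF-based seed would simply be another match arm.

```rust
/// Encodings a decapsulation key might arrive in.
/// Hypothetical type for illustration, restricted to ML-KEM-768.
#[derive(Debug, PartialEq, Eq)]
pub enum KeyEncoding {
    /// 64-byte seed holding the two 32-byte values `d` and `z`.
    Seed,
    /// 2400-byte expanded decapsulation key as defined by FIPS 203.
    Expanded,
}

/// Guesses the encoding purely from the input length.
/// Returns `None` for any length this sketch does not recognise.
pub fn infer_encoding(bytes: &[u8]) -> Option<KeyEncoding> {
    match bytes.len() {
        64 => Some(KeyEncoding::Seed),
        2400 => Some(KeyEncoding::Expanded),
        _ => None,
    }
}
```

This only works while every supported encoding has a distinct length, which is worth keeping in mind before adding more seed formats.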

bifurcation commented 6 days ago

I was going to argue the opposite direction :) That since FIPS 203 defines both formats, we should support both.

As for the seed format, the (d, z) approach we have now actually seems right to me, (a) because it is in tune with what FIPS 203 says [1], and (b) because it seems like any format should be parseable to obtain those values.

[1] Page 17, "The seed $(d, z)$... can be stored for later expansion"

tarcieri commented 6 days ago

I think it would be OK to keep the existing API under a feature-gate (possibly hazmat) but it permits misuses which aren't possible with the seed-based API. See the BoringSSL example.

bifurcation commented 6 days ago

If we added the required validation checks and had `from_bytes()` return `Result<Self>`, that would be safe; doesn't seem like it would merit the hazmat negging. (Cf. this comment) But I admit that I'm more in the "APIs should offer complete capabilities" camp than the "APIs should be very opinionated" camp.
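As a rough sketch of what those checks might involve for an expanded ML-KEM-768 key: FIPS 203 describes a length check and a check that the stored hash matches the hash of the embedded encapsulation key. The names below are hypothetical, and the hash is passed in as a parameter so the example stays dependency-free; a real implementation would use SHA3-256 here.

```rust
/// Errors a validating import could return (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum KeyError {
    WrongLength,
    HashMismatch,
}

// ML-KEM-768 layout (k = 3): dk_pke || ek || H(ek) || z
const K: usize = 3;
const EK_START: usize = 384 * K; // end of dk_pke
const EK_END: usize = 768 * K + 32; // end of ek
const H_END: usize = 768 * K + 64; // end of H(ek)
const DK_LEN: usize = 768 * K + 96; // total, 2400 bytes

/// Checks the length and the embedded hash of an expanded key.
/// Assumption: `hash` stands in for SHA3-256 over the encapsulation key.
pub fn check_expanded_key<H>(dk: &[u8], hash: H) -> Result<(), KeyError>
where
    H: Fn(&[u8]) -> [u8; 32],
{
    if dk.len() != DK_LEN {
        return Err(KeyError::WrongLength);
    }
    let expected = hash(&dk[EK_START..EK_END]);
    if expected[..] != dk[EK_END..H_END] {
        return Err(KeyError::HashMismatch);
    }
    Ok(())
}
```

None of this is needed on the seed path, since every 64-byte string is a valid seed.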

Maybe a compromise could be: Repurpose `to_bytes()` / `from_bytes()` to go to/from the seed, and add `to_expanded_bytes()` / `from_expanded_bytes()` for the full form. With the idea that the `_bytes` variants will be more obvious/attractive to developers.

As an aside: The BoringSSL example reminds me that if we're going to have `from_seed()`, we should probably also have `to_seed()`. Which means we'll need to carry around `d` in `kem::DecapsulationKey`.
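A sketch of why `to_seed()` forces the key to retain `d`, under the assumption that the seed is `d || z`. The type `SeedBackedKey` is hypothetical and omits the expanded key material a real decapsulation key would also hold.

```rust
/// Hypothetical key type that remembers the seed it was built from.
/// A real key would also store the material expanded from `(d, z)`.
pub struct SeedBackedKey {
    d: [u8; 32],
    z: [u8; 32],
}

impl SeedBackedKey {
    /// Builds the key from a 64-byte seed (assumed layout: `d || z`).
    pub fn from_seed(seed: &[u8; 64]) -> Self {
        let mut d = [0u8; 32];
        let mut z = [0u8; 32];
        d.copy_from_slice(&seed[..32]);
        z.copy_from_slice(&seed[32..]);
        // Key expansion from (d, z) would happen here.
        Self { d, z }
    }

    /// Re-serialises the seed. Only possible because `d` was kept:
    /// it cannot be recovered from the expanded key.
    pub fn to_seed(&self) -> [u8; 64] {
        let mut seed = [0u8; 64];
        seed[..32].copy_from_slice(&self.d);
        seed[32..].copy_from_slice(&self.z);
        seed
    }
}
```

The cost is 32 extra bytes of secret state per key, which would also need to be zeroized along with the rest.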
