openpgp-pqc / draft-openpgp-pqc

Repository of the WIP draft-ietf-openpgp-pqc
Other
8 stars 2 forks source link

SHA-3 vs KMAC for KDF #46

Closed teythoon closed 7 months ago

teythoon commented 1 year ago

Are there specific reasons for preferring KMAC over HKDF as key combiner? Seeing that v6 OpenPGP now uses HKDF everywhere, it seems tempting to also use it as combiner to reduce the number of required algorithms, no?

falko-strenzke commented 1 year ago

The problem, as far as I understand it, is that HKDF is HMAC-based and HMAC does not qualify as a dual-PRF, which is what is needed when a secret shall be derived from two input keys: https://datatracker.ietf.org/meeting/113/materials/slides-113-cfrg-a-dual-prf-construction-00#page=8

teythoon commented 1 year ago

I don't understand. The authors of the document are indeed using HMAC, albeit with an expansion step before, and they are doing HMAC twice, once with key 1 as key input and key 2 as data, the second time with the inputs swapped for symmetry, then xoring the HMAC outputs.

In contrast, this document simply does KMAC(key1 || key2). I don't understand why that should be a dual-PRF whereas HKDF(key1 || key2) should not be one.

Please unconfuse me.

The reason I'm bringing this up is that KMAC seems not to be widely available in cryptographic libraries. A quick survey of the five cryptographic libraries that Sequoia can use paints a bleak picture:

In contrast, HKDF (and HMAC) is available in all libraries in all versions we support.

falko-strenzke commented 1 year ago

I completely understand the problem regarding the implementation aspect. Regarding the theoretical questions, I cannot address them profundly. But yes, I would also assume that an HMAC-based and thus also HKDF-based construction should be possible. It might be far from optimal in terms of efficiency though. If that doesn't matter, we could try to figure out whether this is possible.

The reason why KMAC yields a dual-PRF simply with concatenation of the keys is apparently because it is based on SHA-3. When talking about HMAC above, I refer only (and also the authors of the slides, I assume) about HMAC using SHA-2.

teythoon commented 1 year ago

I see the statement "HMAC is generally not a dual-PRF." in the slides, but I don't see the assertion that KMAC should be one.

falko-strenzke commented 1 year ago

I see the statement "HMAC is generally not a dual-PRF." in the slides, but I don't see the assertion that KMAC should be one.

Yes, sure. That work apparently does not deal with SHA-3 as dual-PRF at all. For these considerations you will have to look into the security considerations of draft-ounsworth-cfrg-kem-combiners (which is work in progress, though, and might not be up to date). This was also discussed on the CFRG mailing list, see for instance here

Edit: The repository of the KEM combiner draft is here: https://github.com/EntrustCorporation/draft-ounsworth-cfrg-kem-combiners

falko-strenzke commented 1 year ago

By the way, this is the paper to the slides mentioned above: https://eprint.iacr.org/2022/065.pdf

falko-strenzke commented 1 year ago

If we consider adopting the HMAC-based approch from the above paper, there would be quite some overhead in the implementation, as apparent from the slide at the link.

So we should ask the question, if implementing this construction is really simpler than implementing KMAC on top of SHA-3. And for the latter, indeed, there can be a problem, as I learned from my implementation of KMAC in Botan. KMAC makes use of KECCAK[c], a hash-function building block from which also SHA-3 is derived, that is most likely not readily available in most libraries. It wasn't in Botan, and I had to dig into the SHA-3 details and refactor some code to make KECCAK[c] available. This lead to a longer discussion of how the new design of the KECCAK[c]-based hash functions should look with the Botan maintainers. See

The reason for the need for refactoring is the following: Formally, SHA-3 is derived from KECCAK[c] as specified here. For instance:

SHA3-224(M) = KECCAK[448] (M || 01, 224)

Here, 01 denotes the final padding bits of SHA3. KMAC uses a different final bit padding. Now there is also a final bit padding already in KECCAK[c], which is applied prior to that of the derived hash function. However, in Botan, these final bit paddings were mixed together, which made it difficult to understand what is happing in the code exactly and required me not only to use the KMAC specification, but also to disentangle the SHA-3 implementation from its basic building block KECCAK[c].

I haven't looked into any other crypto libraries, but I would assume that at least some will have the same problem. So my conclusion is that an HMAC-based approach might indeed be simpler, even though it will require some overhead in coding the dual-PRF. But at least that will be straightforward and not require an analysis or refactoring of the underlying algorithms, as I experienced with Botan.

falko-strenzke commented 1 year ago

Note that the draft-ounsworth-cfrg-kem-combiners also allows for

teythoon commented 1 year ago

Yeah, the SHA3-based combiners seem to be easier to implement on top of existing primitives.

falko-strenzke commented 1 year ago

I've made a corresponding PR: https://github.com/openpgp-pqc/draft-openpgp-pqc/pull/48

wussler commented 1 year ago

Hey @teythoon and @falko-strenzke, sorry for being late at the party. Thanks for the feedback and PR.

I'm not the biggest fan of this change. I agree it's extra work to get KMAC right, but it does provide some practical advantages over both SHA2 and SHA3.

SHA2, as Falko has said has the issue of not being a dual-PRF or anywhere close to a split-key pseudorandom functions (as required for the formal proof in the paper Falko linked).

Then, I personally think KMAC is still superior to SHA-3:

I don't think that any implementation supporting SHA3 can't support KMAC. It might be tricky to have access to the primitives, as Falko mentioned.

As an author of draft-ounsworth-cfrg-kem-combiners I gotta add that we considered dropping the SHA3 variants and leave KMAC only, because of the previous reasons and the feedback we got from the CFRG. We decided to leave it in there because there could be some really constrained protocol that might have a good use of it and it's indeed not insecure, but just offers less. OpenPGP is IMO definitely not a constrained protocol.

Finally Nettle offers shake256 with context, and rust offers sha3::CShake256Core upon which KMAC can be built as we did in golang. OpenSSL seems indeed not to offer the bindings, and I hope to be wrong, but seems like CNG does not even offer SHA3.

Note that this is still a draft, and am happy to hear feedback from developers. If the community thinks the advantages of KMAC does not counterbalance the costs of implementing it I'm still happy to switch to the SHA3-256 variant, but not as done in #48, as I think some parts are missing there. In particular I'd include the counter increase and improve the context separation definition.

wussler commented 1 year ago

And as a small PS: since we're all gonna be at the summit, let's have a chat there about this :)

teythoon commented 1 year ago

Finally Nettle offers shake256 with context, and rust offers sha3::CShake256Core upon which KMAC can be built as we did in golang. OpenSSL seems indeed not to offer the bindings, and I hope to be wrong, but seems like CNG does not even offer SHA3.

That is good to hear, I didn't realize that, and indeed that seems like something we could steal^Wadapt. I'm sure the OpenSSL bindings can be easily extended. CNG gets Keccak-based constructions "soon", but it will be quite some time until people can rely on that being available. But in the mean time, we will augment CNG with algorithms from RustCrypto, as we do for other algorithms (notably, Ed25519... wtf CNG...)

https://blogs.windows.com/windows-insider/2023/03/23/announcing-windows-11-insider-preview-build-25324/

I guess that addresses my concern.

wussler commented 1 year ago

Hi, as said on the PR #48, I would keep this issue open in order to see if the implementation turns out to be too complex here for other parties too, or if there is any slightly more complex SHA-3 construction that could give us the same properties (for our limited use scenario) for a lower implementation cost.

Thanks for looking into this @teythoon!

ounsworth commented 1 year ago

Hey @teythoon and @falko-strenzke, @wussler, sorry for being very late at the party.

I am not going to wade into the theoreticals -- that's what Falko, Aron, and Stavros are for ;) -- but we already faced criticism that

KDF(counter || k_1 || ... || k_n || fixedInfo, outputBits)
where
k_i = H(ss_i || ct_i)

is too many invocations of H, and thus as a consolation to performance, we also offer

k_i = ss_i || ct_i

when safe.

Given this, I cannot imagine that the construction from [2022/065] will be well received given that involves the expander

F(k) = H(0||k1)||H(1||k1)||...||H(j||k1)||H(0||k2)||H(1||k2)||...||H(j||k2)||H(0||Kn)||H(1||Kn)||...||H(j||kn)

which I believe calls H() 3x for each block of input.

So I think the debate comes down to how much the dual-PRF property matters in practice. Based on discussions, I get the feeling that the answer is "not enough to be worth doing".

falko-strenzke commented 7 months ago

So far we have not seen any further requests regarding this change. @teythoon: I think you also now have implementations available, right? Can we close this issue?

teythoon commented 7 months ago

Finally Nettle offers shake256 with context

FTR, that is a misunderstanding. The context in the function signature is the context object for the Shake operation, not the context in the cryptographic context sense.

Nevertheless, I am convinced that KMAC will be the least of our problems algorithm-wise, and we'll depend on Botan which has KMAC for the PQ algorithms in the short-term anyway. So yeah, my concern is sufficiently addressed, thanks for bearing with me.