correction of KEM combiner KDF input order to achieve NIST compliance

falko-strenzke commented 4 months ago

The ordering of the KEM combiner inputs must be adjusted in the fashion of

SHA3-256(counter || ss1 || ss2 || <further-inputs>)

(see https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-56Cr2.pdf#page=22), i.e., after the counter first the shared-secrets must be fed and then any other inputs.

This issue was raised both in https://github.com/openpgp-pqc/draft-ehlen-openpgp-nist-bp-comp/issues/6 and https://github.com/lamps-wg/draft-composite-kem/issues/26.

falko-strenzke commented 4 months ago

My concrete proposal is to make the new combiner in this form:

domSep = "OpenPGPCompositeKDFv1"

KEK = SHA3-256(counter || ecdhKeyShare || mlkemKeyShare || ecdhCipherText ||
 ecdhPublicKey  || mlkemCipherText || mlkemPublicKey || domSep || len(domSep) 
 || algID )

with len(domSep) being a single octet.

In any case, the string must be uniquely parsable from the rear end for any conceivable future value of domSep, which is not the case in our current proposal.

@wussler @TJ-91

TJ-91 commented 4 months ago

I think it's good like this, at least I don't see any problem with it, or any benefit in a different ordering.

In any case, the string must be uniquely parsable from the rear end for any conceivable future value of domSep, which is not the case in our current proposal.

We should also note these facts in the security considerations. (1) this KDF has no possible collisions of inputs since it's uniquely parseble from the rear end (2) future versions of the KDF must also make sure of this.

falko-strenzke commented 4 months ago

I think our idea with the current construction in which the tail of the KDF input is

fixedInfo = algID || "OpenPGPCompositeKDFv1",

is that one can infer the length of the whole string from the end of the string "...v1". But that mechanism seems a bit brittle ~~(one would have to properly parse the encoded string from the rear)~~ [quite nonsensical comment, as we will never parse the string ...]. I would prefer to have an explicit length encoding. ~~But my above claim that it is not uniquely parsable from the rear is not fully justified in this view.~~ [No, I justified it now, see my comment after TJ-91's comment]

TJ-91 commented 4 months ago

But my above claim that it is not uniquely parsable from the rear is not fully justified in this view.

I did not check before, but you are right of course and fixedInfo = algID || "OpenPGPCompositeKDFv1" is fine. As long as any other version of the KDF that introduces a new domSep value ensures that the encoding of different values is suffix-free (i.e., the new version does not end with "OpenPGPCompositeKDFv1")

But that mechanism seems a bit brittle (one would have to properly parse the encoded string from the rear). I would prefer to have an explicit length encoding.

I agree with the length encoding for domSep. Using domSep || len(domSep) || algID is safe in my estimation, as long as a new KDF version does not change this structure at the end of the SHA3 input string and only changes the value of domSep. This means the new proposal is a bit more robust against unfortunate choices of future domSep values (however, the designer of the new KDF version would not likely choose a string that ends with "OpenPGPCompositeKDFv1")

One point that I just now thought of is the remote possibility of adding multi-byte algorithm identifiers (which was discussed as a possibility on several occasions). In that case the inputs can collide if the algorithm identifier comes last. Thus, it might be better to put domSep || len(domSep) last and then the new algorithms can unambiguously be used with a new KDF version. Or add a length byte for the algorithm identifier.

falko-strenzke commented 4 months ago

The actual problem why I think the right encoded length should be appended is that the string is not necessarily prefix-free: currently we are using "OpenPGPCompositeKDFv1", but if someone came up with "[X]OpenPGPCompositeKDFv1", and "[X]" happens to represent a valid algorithm ID, there would be a problem.

openpgp-pqc / draft-openpgp-pqc

correction of KEM combiner KDF input order to achieve NIST compliance #127