Burt's feedback: should we define pre-hashed modes for composite?

ounsworth commented 2 months ago

https://mailarchive.ietf.org/arch/msg/spasm/vzomXUHvJOouzIcC82WkMF4DP-g/

Should the domain separator include an initial byte that identifies the type of domain separator, again similar to NIST’s definition? For instance, the value 1 could indicate that the message is pre-hashed as currently proposed in draft-ietf-lamps-pq-composite-sigs. A different value could support another option: the message is not pre-hashed, but instead is prepended with a domain separator, and then passed to the two signature algorithms. That option would avoid the need for an additional hashing operation to be specified. (The domain separator could still include an OID for the combination of the two signature algorithms in order to separate different combinations of algorithms.)

If an underlying signature algorithm supports pure and pre-hash modes, which mode should be used with the composite signature construction? Presumably pure mode when the composite construction includes pre-hashing, because the message will already have been hashed by the time it reaches the underlying signature algorithm, but this should be stated explicitly.

In addition to pre-hashing as currently proposed, should there be an option for including a randomizer and/or the signers’ public keys in the input to the pre-hash operation, in addition to the message? As Joe Harvey observed in comments to NIST earlier this year [3], without such an option, pre-hashing introduces a dependency on collision resistance, whereas the security of the underlying signature algorithm may be based on other security assumptions (e.g., target collision resistance, second preimage resistance). Moreover, a collision, if found, could potentially be used against multiple users, whereas the underlying signature algorithm may have been designed to provide security in the multi-user model. (This is not an argument for reducing the size of the hash function output by using randomization, but rather for considering that the use of pre-hashing may change the security assumptions compared to the underlying algorithm, and providing the protocol designer a way to revert to the original assumptions.)

ounsworth commented 2 months ago

Fair point. Currently the composite draft presents a pure-sign mode, and uses ML-DSA in its pure-sign mode. But with NIST deciding to support both pure and pre-hash modes of ML-DSA, we should probably do the same for composite-ML-DSA.

This represents a fairly large amount of design work since all the points raised by Burt, plus probably more, are in-scope.

ounsworth commented 2 months ago

I think the point about prefix byte does not apply: since we are already hashing in the OID, and we would presumably define different OIDs for pure and pre-hashed modes, I think that anything you would want to capture in the prefix byte is already captured in the OID.

ounsworth commented 1 month ago

As to the title question:

should we define pre-hashed modes for composite?

My vote is No.

PiotrPopis commented 4 weeks ago

Initially I was in favor of defining pre-hash in [composite-sign] but now I also agree with Mike that we should not define pre-hash mode in composite sign. This would only increase the number of possibilities, which is not in line with the idea of interoperability and increases the implementation effort.

The current version of [composite-sign] uses the term "pre-hash" in several places, which is confusing in the context of FIPS 204. I therefore suggest that in [composite-sign] the wording "internal-layer hash" be used instead of "pre-hash". @ounsworth: if there is agreement to change the wording, I will prepare a specific proposal.

Regardless of whether the "pre-hash" wording is changed or kept, I suggest adding the following text at the end of Section 3.1 (or very similar wording that accounts for the location of ctx, so that it is consistent with the rest of the new Section 3.1): (...) It should be noted that in the case of ML-DSA the hash calculated over the original message is an "internal-layer hash" and is different from the HashML-DSA variant specified in FIPS 204. This means that, according to this specification, the signature over the concatenation of the selected signature scheme and the calculated hash over the original message is computed, for the PQ algorithm, using the pure version of ML-DSA.

johngray-dev commented 3 weeks ago

Mike, the Composite authors group, and I are having further discussions. I think we are going to land on making use of both pure and pre-hash versions of composite, using a construction that is essentially aligned with what NIST has done. Instead of using 0 or 1, we use the DER encoding of the Composite Signature OID and call it "Domain".

CompositeML-DSA:

M' := Domain || IntegerToBytes(|ctx|, 1) || ctx || Message

where "Domain" is the DER(OID) value from our table.

HashCompositeML-DSA:

M' := Domain || IntegerToBytes(|ctx|, 1) || ctx || HashOID || PH

where "Domain" is the DER(OID) value from our table, and PH is the pre-hash of Message computed with a hash function H. H is not specified in our table, but instead can be SHA-256, SHA-512, SHAKE128, or others in the future.
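To make the two constructions above concrete, here is a minimal Python sketch of how M' could be assembled in each mode. It is an illustration under stated assumptions, not text from the draft: the function names are hypothetical, the OID `2.999.1` is a placeholder from the example arc rather than a real Composite Signature OID, HashOID is assumed to be DER-encoded like Domain, and SHA-256 is just one of the candidate hashes mentioned.

```python
import hashlib


def der_oid(dotted: str) -> bytes:
    """DER-encode a dotted OID string (short-form length only)."""
    arcs = [int(a) for a in dotted.split(".")]
    # The first two arcs are packed into a single value: 40 * arc0 + arc1.
    values = [40 * arcs[0] + arcs[1]] + arcs[2:]
    body = bytearray()
    for v in values:
        # Base-128 encoding, most significant group first,
        # with the high bit set on every byte except the last.
        chunk = [v & 0x7F]
        v >>= 7
        while v:
            chunk.append((v & 0x7F) | 0x80)
            v >>= 7
        body.extend(reversed(chunk))
    if len(body) > 127:
        raise ValueError("OID too long for short-form DER length")
    return bytes([0x06, len(body)]) + bytes(body)


# DER encoding of the SHA-256 OID (2.16.840.1.101.3.4.2.1).
SHA256_OID_DER = der_oid("2.16.840.1.101.3.4.2.1")


def _ctx_prefix(domain: bytes, ctx: bytes) -> bytes:
    # IntegerToBytes(|ctx|, 1): the context length must fit in one byte.
    if len(ctx) > 255:
        raise ValueError("ctx must be at most 255 bytes")
    return domain + bytes([len(ctx)]) + ctx


def pure_m_prime(domain: bytes, ctx: bytes, message: bytes) -> bytes:
    """CompositeML-DSA: M' = Domain || len(ctx) || ctx || Message."""
    return _ctx_prefix(domain, ctx) + message


def prehash_m_prime(domain: bytes, ctx: bytes, message: bytes,
                    hash_oid: bytes = SHA256_OID_DER,
                    hash_fn=hashlib.sha256) -> bytes:
    """HashCompositeML-DSA: M' = Domain || len(ctx) || ctx || HashOID || PH."""
    ph = hash_fn(message).digest()
    return _ctx_prefix(domain, ctx) + hash_oid + ph


# Placeholder Domain value; a real one would come from the draft's OID table.
domain = der_oid("2.999.1")
m_pure = pure_m_prime(domain, b"ctx", b"msg")
m_hash = prehash_m_prime(domain, b"ctx", b"msg")
```

In this sketch the same M' would then be handed to each component signature algorithm. If pure and pre-hash modes are assigned different OIDs, the Domain prefix alone already distinguishes the two encodings.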

FIPS 204 suggests that the specified hash algorithm should be carried in the OID. Of course, we could choose to use an algorithm parameter for efficiency here; otherwise we get a major OID explosion. I think we will just end up choosing one appropriate hash for the pre-hash version (similar to what is in the -02 version of the draft).

johngray-dev commented 1 week ago

We added pre-hash and pure modes. Merged in pull #59

lamps-wg / draft-composite-sigs

Burt's feedback: should we define pre-hashed modes for composite? #34