Closed MarcusJGStreets closed 1 month ago
The branch is very stale, and would overwrite many commits that have since been made to the main branch of the repository. I am not sure that merging main into the development branch will repair this adequately (though that might be worth a try); rebasing the commits on top of main might be better, or cherry-picking them onto a new branch.
I am not convinced that the proposed API for ML-DSA properly aligns with the specification of the ML-DSA and HashML-DSA operations. Two primary concerns:
In ML-DSA, the application message is appended to tr (a hash of the public key), a domain separator (0x00), and the context string. This concatenation is then hashed into the 64-byte 'message representative'. The hash function is fixed by the specification.
This does not fit the hash-and-sign pattern. It certainly does not lend itself to having the application hash the message before it is signed by a psa_sign_hash() function.
In HashML-DSA, the application message is replaced by the digest of the message, computed using 'an approved hash function'. The message representative is then formed by hashing the concatenation of tr, the domain separator (0x01), the context string, the OID of the message-hashing function, and the message hash. Signing then proceeds as for ML-DSA.
This does fit the hash-and-sign pattern in the Crypto API, although the message-hashing algorithm must be available to psa_sign_hash() in order to include the correct OID in the concatenation.
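To make the difference between the two constructions concrete, here is a minimal C sketch of the byte string that gets hashed together with tr in each case. The helper name example_format_message() and the EXAMPLE_ constants are hypothetical and not part of any existing API. As I read FIPS 204, the context string is preceded by a one-byte length, which the sketch includes. The final step (hashing tr followed by this string into the 64-byte message representative) is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Domain separator values described above (hypothetical macro names). */
#define EXAMPLE_DOMAIN_PURE    0x00u  /* ML-DSA: payload is the message itself */
#define EXAMPLE_DOMAIN_PREHASH 0x01u  /* HashML-DSA: payload is a message digest */

/*
 * Hypothetical helper, for illustration only. Writes
 *     domain || len(ctx) || ctx || [oid] || payload
 * into out and returns the number of bytes written, or 0 on error.
 *
 * ML-DSA:     oid is NULL, payload is the application message.
 * HashML-DSA: oid is the DER-encoded OID of the message-hashing
 *             function, payload is the digest of the message.
 */
size_t example_format_message(uint8_t domain,
                              const uint8_t *ctx, size_t ctx_len,
                              const uint8_t *oid, size_t oid_len,
                              const uint8_t *payload, size_t payload_len,
                              uint8_t *out, size_t out_size)
{
    size_t pos = 0;
    size_t remaining;

    /* The context string is limited to 0..255 bytes. */
    if (ctx_len > 255 || (ctx == NULL && ctx_len != 0)) {
        return 0;
    }
    if (oid == NULL) {
        oid_len = 0;
    }
    if (payload == NULL && payload_len != 0) {
        return 0;
    }

    /* Check the sizes step by step so the arithmetic cannot overflow. */
    if (out_size < 2 || ctx_len > out_size - 2) {
        return 0;
    }
    remaining = out_size - 2 - ctx_len;
    if (oid_len > remaining) {
        return 0;
    }
    remaining -= oid_len;
    if (payload_len > remaining) {
        return 0;
    }

    out[pos++] = domain;
    out[pos++] = (uint8_t) ctx_len;
    if (ctx_len != 0) {
        memcpy(out + pos, ctx, ctx_len);
        pos += ctx_len;
    }
    if (oid_len != 0) {
        memcpy(out + pos, oid, oid_len);
        pos += oid_len;
    }
    if (payload_len != 0) {
        memcpy(out + pos, payload, payload_len);
        pos += payload_len;
    }
    return pos;
}
```

For ML-DSA the payload is the whole message, so a caller-supplied digest cannot stand in for it without changing what is signed. For HashML-DSA the payload is already a digest, but the OID still has to be known at this point, which is why psa_sign_hash() needs to know the message-hashing algorithm.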
This suggests to me that we should have two algorithms, one for ML-DSA and one for HashML-DSA. The one for ML-DSA is only usable with psa_sign_message(), and is not parameterized by a hash algorithm. The one for HashML-DSA is parameterized by the message-hashing algorithm, and can be used with both psa_sign_message() (which does the hashing) and psa_sign_hash() (which expects the hash to be provided by the caller).
FIPS 204 describes the default "hedged" signature process, which uses 32 bytes of fresh randomness to generate a distinct signature for each invocation (with the same key, context, and message), but permits a deterministic variant in which this value is all zeros.
We need two versions of each of the ML-DSA and HashML-DSA algorithms: a default one for the hedged version, and a xxx_DETERMINISTIC_xxx one for the deterministic version. This mirrors the pattern with ECDSA and DETERMINISTIC_ECDSA.
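As a rough illustration of the resulting set of four algorithms, here is a C sketch. All names and numeric encodings are hypothetical placeholders chosen for the example, not values from the Crypto API specification, and the type example_alg_t only stands in for psa_algorithm_t. It shows the HashML-DSA identifiers carrying the message-hashing algorithm as a parameter, and a predicate for which identifiers would be accepted by psa_sign_hash().

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for psa_algorithm_t; every value below is a made-up placeholder. */
typedef uint32_t example_alg_t;

#define EXAMPLE_ALG_HASH_MASK           0x000000ffu /* low byte: hash id */
#define EXAMPLE_ALG_DETERMINISTIC_FLAG  0x00000100u
#define EXAMPLE_ALG_PREHASH_FLAG        0x00000200u

/* Hypothetical hash ids used as the HashML-DSA parameter. */
#define EXAMPLE_HASH_SHA_256            0x00000009u
#define EXAMPLE_HASH_SHA_512            0x0000000bu

/* ML-DSA: not parameterized by a hash algorithm. */
#define EXAMPLE_ALG_ML_DSA                0x06004400u
#define EXAMPLE_ALG_DETERMINISTIC_ML_DSA \
    (EXAMPLE_ALG_ML_DSA | EXAMPLE_ALG_DETERMINISTIC_FLAG)

/* HashML-DSA: parameterized by the message-hashing algorithm. */
#define EXAMPLE_ALG_HASH_ML_DSA(hash)                        \
    (EXAMPLE_ALG_ML_DSA | EXAMPLE_ALG_PREHASH_FLAG |         \
     ((hash) & EXAMPLE_ALG_HASH_MASK))
#define EXAMPLE_ALG_DETERMINISTIC_HASH_ML_DSA(hash)          \
    (EXAMPLE_ALG_HASH_ML_DSA(hash) | EXAMPLE_ALG_DETERMINISTIC_FLAG)

/* True for both the hedged and the deterministic HashML-DSA variants. */
static inline bool example_alg_is_hash_ml_dsa(example_alg_t alg)
{
    return (alg & ~(EXAMPLE_ALG_HASH_MASK | EXAMPLE_ALG_DETERMINISTIC_FLAG))
           == (EXAMPLE_ALG_ML_DSA | EXAMPLE_ALG_PREHASH_FLAG);
}

/* True when the 32-byte random input is replaced by all zeros. */
static inline bool example_alg_is_deterministic(example_alg_t alg)
{
    return (alg & EXAMPLE_ALG_DETERMINISTIC_FLAG) != 0;
}

/* Only HashML-DSA fits hash-and-sign, so only it may reach psa_sign_hash(). */
static inline bool example_alg_allows_sign_hash(example_alg_t alg)
{
    return example_alg_is_hash_ml_dsa(alg);
}

/* Recover the message-hashing algorithm, needed to select the OID. */
static inline uint32_t example_alg_get_hash(example_alg_t alg)
{
    return example_alg_is_hash_ml_dsa(alg) ? (alg & EXAMPLE_ALG_HASH_MASK) : 0;
}
```

With this shape, psa_sign_message() would accept all four identifiers, while psa_sign_hash() would reject the two plain ML-DSA ones. The real encodings would of course need to fit the existing algorithm identifier layout.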
There is no discussion in the PR yet about the 0-255 byte context string, present in the FIPS 204 specification. NIST suggests a default context value of the empty string, but "applications may specify the use of non-empty strings".
We should be explicit that the initial API only supports an empty context string. When an application use case arises that requires a non-empty context, we will need to add an API for providing a context to the signature and verification functions.
In parallel, I was writing up my thoughts on ML-KEM in #95. See my latest comment.
Adding ML keys now that FIPS 203 and 204 have been issued.