open-quantum-safe / oqs-provider

OpenSSL 3 provider containing post-quantum algorithms
https://openquantumsafe.org
MIT License

Improve (heap) memory consumption #155

Open baentsch opened 1 year ago

baentsch commented 1 year ago

Profiling runs for TLS handshaking showed a marked increase in heap usage when switching from oqs-openssl111 to oqsprovider+openssl3. Most likely, things like this need to be improved. In general, review all uses of OPENSSL_*_malloc in oqsprovider.

thb-sb commented 10 months ago

Hi @baentsch!

I'm currently trying to improve memory usage around the bit of code you mentioned, and I have a question about oqs_sig_digest_signverify_* methods.

https://github.com/open-quantum-safe/oqs-provider/blob/cc3895f208730abfe1efc15f7c3ff4358ac6e4c0/oqsprov/oqs_sig.c#L451-L453

When oqs_sig_digest_signverify_init is called (from either OSSL_FUNC_SIGNATURE_DIGEST_SIGN_INIT or OSSL_FUNC_SIGNATURE_DIGEST_VERIFY_INIT), mdname may be NULL.

In the case where mdname is non-null, it is the digest of the input data (given to oqs_sig_digest_signverify_update) that is signed/verified. This means that, for a hybrid scheme, a signature operation ends up with the following (simplified) workflow:

// digest_input computed on-the-fly with `oqs_sig_digest_signverify_update` and `mdname`.
digest_input = hash(mdname, input_data)

digest_input_nist = hash(md_nist_level, digest_input)
classical_sig = sign(classical_sig_scheme, digest_input_nist)
pq_sig = sign(pq_sig_scheme, digest_input)
return len(classical_sig) + classical_sig + pq_sig
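The pre-hashed workflow above can be sketched as follows. This is only an illustrative Python sketch: `toy_sign` is a stand-in for the real classical/post-quantum signing calls (which in oqs-provider are C library calls), and the exact digests and length encoding are assumptions, not what the provider emits on the wire.

```python
import hashlib

# Toy stand-in for the real signature primitives (e.g. ECDSA and Dilithium
# in oqs-provider). It just tags and hashes so the structure is visible.
def toy_sign(scheme: str, data: bytes) -> bytes:
    return hashlib.sha256(scheme.encode() + data).digest()

def hybrid_sign_prehashed(mdname: str, md_nist_level: str,
                          chunks: list[bytes]) -> bytes:
    # digest_input computed on the fly, as oqs_sig_digest_signverify_update
    # would with `mdname`: only a fixed-size hash state lives in memory.
    h = hashlib.new(mdname)
    for chunk in chunks:
        h.update(chunk)
    digest_input = h.digest()

    # Classical half signs a further hash at the claimed NIST level.
    digest_input_nist = hashlib.new(md_nist_level, digest_input).digest()
    classical_sig = toy_sign("classical", digest_input_nist)

    # PQ half signs the (fixed-size) digest rather than the raw input.
    pq_sig = toy_sign("pq", digest_input)

    # len(classical_sig) + classical_sig + pq_sig (length encoding assumed)
    return len(classical_sig).to_bytes(4, "big") + classical_sig + pq_sig
```

Note that because the digest is computed incrementally, feeding the message in one chunk or many yields the same signature, and memory stays constant in the message size.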

However, if mdname is null, then the input data is forwarded directly to the post-quantum signature scheme, and we end up with the workflow below:

digest_input_nist = hash(md_nist_level, input_data)
classical_sig = sign(classical_sig_scheme, digest_input_nist)
pq_sig = sign(pq_sig_scheme, input_data)
return len(classical_sig) + classical_sig + pq_sig

In this last workflow, the input data has to be kept in memory until we call the post-quantum signature scheme with it.
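The memory cost of the mdname == NULL path can be made concrete with a small hypothetical sketch (the class and method names are illustrative, not the provider's API): the classical half can stream through a hash context, but the PQ half forces the whole message to be buffered.

```python
import hashlib

# Hypothetical sketch of the mdname == NULL path: because the PQ scheme
# signs the raw message, every update() must append to a growing buffer,
# so peak memory grows linearly with the message size.
class NoDigestSignCtx:
    def __init__(self, md_nist_level: str = "sha384"):
        self.buf = bytearray()                      # whole message retained
        self.nist_md = hashlib.new(md_nist_level)   # classical half streams

    def update(self, chunk: bytes) -> None:
        self.buf += chunk              # O(message) memory for the PQ half
        self.nist_md.update(chunk)     # O(1) state for the classical half

    def final_inputs(self) -> tuple[bytes, bytes]:
        # (digest for the classical scheme, raw input for the PQ scheme)
        return self.nist_md.digest(), bytes(self.buf)
```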

My question is: is there anything preventing us from always hashing the input data? Couldn't we say that if mdname is null, then mdname is the md matching the value of oqs_key->claimed_nist_level, and hash the input data accordingly?
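The proposal amounts to a small fallback rule. As a sketch (the level-to-digest table here is an assumption for illustration; which digests would actually match each claimed NIST level is exactly the open design question):

```python
import hashlib

# Hypothetical mapping from a claimed NIST security level to a digest of
# comparable strength. The real choice oqs-provider would make is an open
# question; this only shows the shape of the fallback.
NIST_LEVEL_TO_MD = {1: "sha256", 2: "sha256", 3: "sha384", 4: "sha384", 5: "sha512"}

def effective_mdname(mdname, claimed_nist_level: int) -> str:
    # "if mdname is null, then mdname equals the md matching the value of
    # oqs_key->claimed_nist_level"
    if mdname is not None:
        return mdname
    return NIST_LEVEL_TO_MD[claimed_nist_level]
```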

baentsch commented 10 months ago

First of all: Thanks for taking a look at this!

is there anything preventing us from always hashing the input data?

Technically, no. From a "code logic" perspective it is very clean. The reason I didn't do it from the start is that I'm not sure what the OpenSSL core will do with this: if it is guaranteed to call sign (without digest) whenever no digest has been specified (e.g., when requesting certificate generation), we'd be "safe" from an implication I was always concerned about, namely (lack of) interoperability: the way things are now is exactly how they worked in OQS-OpenSSL111, and some interop tests expect exactly that.

(Always) doing a digest (if indeed needed) "under the hood" would surely require a different OID (as the output changes) -- but this (matching hash strength with PQ alg strength) is still more manageable than the proposal in #239, where any hash-and-sign algorithm combination (with loads of OIDs) was proposed.

So the question seems to be whether it is possible to drive OpenSSL to create "hash-free" certificates without the need to handle the "mdname==NULL" case in "digest_sign". In the back of my mind, I think the answer is "No" -- but it would be good if you'd take another look.

Finally, not really being a cryptographer, I felt "uncomfortable" silently adding a non-PQ algorithm (even if it's only hashing) invisibly in front of an otherwise "pure" PQ algorithm (designed to not need this). But that concern is one that @dstebila or @cryptojedi may help "put to bed": Feedback welcome!

dstebila commented 10 months ago

Always hashing the input data, and then passing it to the underlying signature algorithm, results in a different overall signature scheme than just calling the underlying signature algorithm: there's an extra hash in there. So if we did that in front of Dilithium, it would no longer be Dilithium, it would be Dilithium-with-an-extra-hash, and therefore not interoperable with Dilithium, necessitating different algorithm identifiers and so on, as Michael points out.
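The interoperability point can be seen even with a toy: pre-hashing changes the bytes the scheme signs, so the result no longer matches a signature over the original message. (`toy_sign` below is a keyed-hash stand-in, not Dilithium; the point is only that the two byte strings differ.)

```python
import hashlib

# Toy deterministic "signature" (a keyed hash) to illustrate the point:
# pre-hashing changes the input the scheme sees, so the resulting bytes
# differ from a plain signature over the same message.
def toy_sign(key: bytes, msg: bytes) -> bytes:
    return hashlib.sha256(key + msg).digest()

key, msg = b"k", b"hello"
plain = toy_sign(key, msg)                               # "Dilithium"
prehashed = toy_sign(key, hashlib.sha256(msg).digest())  # "Dilithium-with-an-extra-hash"
assert plain != prehashed  # different scheme, different wire bytes
```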

Dilithium itself hashes the message, and does so with some extra inputs (including the public key as input to the hash, if memory serves), which adds certain types of robustness properties. There has been some discussion on the LAMPS mailing list about pre-hashing and whether it's a good idea or not; I believe people with HSMs like pre-hashing because otherwise you have to feed the full message into the HSM for Dilithium. I don't know what the resolution on that issue was -- I think there are some research questions to be had here, and @praveksharma was planning to take a look at it, but hasn't had a chance to do so yet.

baentsch commented 10 months ago

Thanks for the feedback, @dstebila !

I believe people with HSMs like pre-hashing because otherwise you have to feed the full message into the HSM for Dilithium.

Completely understandable given an HSM's technical bandwidth limitations. So let's wait and see whether the standard evolves to prioritize those bandwidth limitations over the "certain types of robustness properties" to which our implementation currently gives precedence.