Closed songlingatpan closed 1 month ago
I will commit another change for copy_from_upstream.
@baentsch Question: 'bike', 'frodokem', 'ntruprime' is not from upstream.
To replace malloc with OQS_MEM_malloc, where should i modify the code? Thanks
liboqs/src/kem/frodokem/kem_frodokem976shake.c:11: OQS_KEM *kem = malloc(sizeof(OQS_KEM)); liboqs/src/kem/frodokem/kem_frodokem640aes.c:11: OQS_KEM *kem = malloc(sizeof(OQS_KEM)); liboqs/src/kem/frodokem/kem_frodokem1344shake.c:11: OQS_KEM *kem = malloc(sizeof(OQS_KEM)); liboqs/src/kem/frodokem/kem_frodokem976aes.c:11: OQS_KEM *kem = malloc(sizeof(OQS_KEM)); liboqs/src/kem/frodokem/kem_frodokem1344aes.c:11: OQS_KEM *kem = malloc(sizeof(OQS_KEM)); liboqs/src/kem/frodokem/kem_frodokem640shake.c:11: OQS_KEM *kem = malloc(sizeof(OQS_KEM)); liboqs/src/kem/kem.h:264:OQS_API void OQS_KEM_free(OQS_KEM *kem); liboqs/src/kem/ntruprime/kem_ntruprime_sntrup761.c:11: OQS_KEM *kem = malloc(sizeof(OQS_KEM)); liboqs/src/kem/kem.c:493:OQS_API void OQS_KEM_free(OQS_KEM *kem) { liboqs/src/kem/bike/kem_bike.c:9: OQS_KEM *kem = malloc(sizeof(OQS_KEM)); liboqs/src/kem/bike/kem_bike.c:34: OQS_KEM *kem = malloc(sizeof(OQS_KEM)); liboqs/src/kem/bike/kem_bike.c:59: OQS_KEM *kem = malloc(sizeof(OQS_KEM));
Question: 'bike', 'frodokem', 'ntruprime' is not from upstream.
Good point. @dstebila : There's surely upstreams for these algorithms, but they're not captured by OQS automation/copy_from_upstream. Question: Are these projects still maintained (and where) or is it OK to change code straight in liboqs
for these algs? If the former, what would it take to bring them in via "copy_from_upstream"? If the latter, should we remove them completely from OQS?
Question: 'bike', 'frodokem', 'ntruprime' is not from upstream.
Good point. @dstebila : There's surely upstreams for these algorithms, but they're not captured by OQS automation/copy_from_upstream. Question: Are these projects still maintained (and where) or is it OK to change code straight in
liboqs
for these algs? If the former, what would it take to bring them in via "copy_from_upstream"? If the latter, should we remove them completely from OQS?
FrodoKEM is maintained at https://github.com/microsoft/PQCrypto-LWEKE/; I exported the code from there any manually added it to liboqs, as we didn't have as robust a copy_from_upstream at the time. FrodoKEM also needs to do an update from upstream, as there have been new variants introduced in the last year, but I don't have a plan for this update. So to avoid blocking on that, I would say it's fine to make the FrodoKEM changes directly here in this repository.
NTRUPrime had been coming from PQClean, but they have stopped supporting it. We only are keeping one variant of NTRUPrime because of its use in OpenSSH. I think we consider a timeline for sunsetting it. But in the interim, I think changes to NTRUPrime can be done directly here in this repository.
BIKE was contributed directly by the team at AWS. Our main contact for that has been @dkostic, but I'm not sure if he's still the right contact. Pinging @brian-jarvis-aws for some input.
@dstebila @baentsch @bhess @jschanck Code change for openssl allocator has been done. Please review the code.
In addition, there are quite some error handling and potential memory leak. Would you like to fix them in the current PR? or create a separate PR for review?
Thanks
baentsch requested changes
done
In addition, there are quite some error handling and potential memory leak. Would you like to fix them in the current PR? or create a separate PR for review?
I'd prefer to see those as a second PR, since those changes may be less mechanical and might require a closer look.
I suggest we defer merging this until after the 0.11.0 release, otherwise we would need to cut a new release candidate and push the release back a week.
@dstebila @baentsch @bhess @jschanck This PR is ready for review and merge.
Thanks for this work, @songlingatpan!
How would you feel about dropping the OQS_MEM_free
function and instead using OQS_MEM_insecure_free
(presumably modifying it to use OPENSSL_free
)? I think it's a good idea to have a clear separation between "insecure" and "secure" frees, and we lose that if we introduce a not-explicitly-insecure free function with the same signature as OQS_MEM_insecure_free
.
After this lands, we may want to add some sort of static analysis to guard against raw malloc
or free
calls. Seems like something related to https://github.com/open-quantum-safe/liboqs/issues/1868.
Thanks for this work, @songlingatpan!
How would you feel about dropping the
OQS_MEM_free
function and instead usingOQS_MEM_insecure_free
(presumably modifying it to useOPENSSL_free
)? I think it's a good idea to have a clear separation between "insecure" and "secure" frees, and we lose that if we introduce a not-explicitly-insecure free function with the same signature asOQS_MEM_insecure_free
.
Addressed the comment in the latest PR. Please review it.
After this lands, we may want to add some sort of static analysis to guard against raw
malloc
orfree
calls. Seems like something related to #1868.
Correct. We should phase out direct calls to malloc, free, calloc, strdup, and realloc.
After this lands, we may want to add some sort of static analysis to guard against raw
malloc
orfree
calls. Seems like something related to #1868.Correct. We should phase out direct calls to malloc, free, calloc, strdup, and realloc.
test_free
in tests/test_code_conventions.py
does this for free calls using an ad hoc script. It may be better to do this with CodeQL in the long run.
test_code_conventions.py
Addressed the comment by adding code in test_code_conventions.py.
@dstebila @baentsch @SWilson4 Is there anything else I need to address before we proceed with merging this PR?
Thanks!
@dstebila @baentsch @SWilson4 Is there anything else I need to address before we proceed with merging this PR? Thanks!
Please take a look at my comments here and here and let me know what you think.
It seems the conflicts issue only occurs for certain openssl version/platform. Didn't see conflicts any more after revert.
@SWilson4 Addressed your 2 comments. Please let me know if I need to run copy_from_upstream.py or any other scripts to regenerate code before merge. If so, what is the detail steps.
@dstebila @baentsch @SWilson4 I have addressed existing comments. Please let me know if we can merge the PR. Thanks
if I need to run copy_from_upstream.py or any other scripts to regenerate code before merge.
It's always a good idea to run those to preclude CI finding issues. See e.g. this CI workflow. @SWilson4 : Maybe a sensible addition to CONTRIBUTING.md?
Please let me know if we can merge the PR.
@songlingatpan please see a single comment I just added (creating an issue to track a problem pointed out by @SWilson4 )
@SWilson4 would you please mark all your comments resolved that you consider such? Makes reviewing reviews easier (at least for me).
Please let me know if we can merge the PR.
@songlingatpan please see a single comment I just added (creating an issue to track a problem pointed out by @SWilson4 )
@SWilson4 would you please mark all your comments resolved that you consider such? Makes reviewing reviews easier (at least for me).
No need additional code changes for upstream: python3 copy_from_upstream.py copy && ! git status | grep -i modified Updating /home/shan/oqs_github/liboqs/docs/algorithms/kem/bike.md Updating /home/shan/oqs_github/liboqs/docs/algorithms/kem/classic_mceliece.md Updating /home/shan/oqs_github/liboqs/docs/algorithms/kem/frodokem.md Updating /home/shan/oqs_github/liboqs/docs/algorithms/kem/hqc.md Updating /home/shan/oqs_github/liboqs/docs/algorithms/kem/kyber.md Updating /home/shan/oqs_github/liboqs/docs/algorithms/kem/ml_kem.md Updating /home/shan/oqs_github/liboqs/docs/algorithms/kem/ntruprime.md Updating /home/shan/oqs_github/liboqs/docs/algorithms/sig/cross.md Updating /home/shan/oqs_github/liboqs/docs/algorithms/sig/dilithium.md Updating /home/shan/oqs_github/liboqs/docs/algorithms/sig/falcon.md Updating /home/shan/oqs_github/liboqs/docs/algorithms/sig/mayo.md Updating /home/shan/oqs_github/liboqs/docs/algorithms/sig/ml_dsa.md Updating /home/shan/oqs_github/liboqs/docs/algorithms/sig/sphincs.md Updating /home/shan/oqs_github/liboqs/docs/algorithms/sig_stfl/lms.md Updating /home/shan/oqs_github/liboqs/docs/algorithms/sig_stfl/xmss.md Updating README.md
@SWilson4 Please let me know if anything else I need to address in this PR.
@SWilson4 Please let me know if anything else I need to address in this PR.
As I commented here, there are still a number of checks that need to be added after OQS_MEM_malloc
(e.g., in the SHA2 code).
OQS_MEM_malloc
@SWilson4 Thanks. I have another PR which does the check and memory leak in https://github.com/open-quantum-safe/liboqs/pull/1929/files# Would you like to merge the two PRs or keep them separately? The current one is for customize memory allocator. Please let me know.
OQS_MEM_malloc
@SWilson4 Thanks. I have another PR which does the check and memory leak in https://github.com/open-quantum-safe/liboqs/pull/1929/files# Would you like to merge the two PRs or keep them separately? The current one is for customize memory allocator. Please let me know.
I would prefer to address the issues introduced by this PR in the same PR—i.e., add all checks necessitated by the removal of the checked_malloc
functions in this PR. The additional checks are fine to defer to the next PR, but I would rather this PR not introduce new issues to fix.
Hi @songlingatpan—just one more quick point regarding the replacement of the checked functions. Once that and the CI failures are fixed this is good to go from my point of view. @SWilson4 Need to discuss with you in email. Thanks
@SWilson4 @baentsch @baentsch
Because eventually we need to remove all abort()/exit() from the library, currently I think it is fine to fully get rid of abort()/exit() even if subsequence is dereference NULL pointer.
This is a trivial thing I think we shouldn't waste so much time on it. We have so many critical issues to make liboqs stable, and would like to complete them more efficiently.
Please let me know if we really need to keep abort()/exit() for now.
Thanks
I think it is fine to fully get rid of abort()/exit() even if subsequence is dereference NULL pointer.
If this is meant to state that the code should rather dump core than abort()s in a controlled manner (?), I'm not sure this is a good idea. I'd rather suggest tackling that as part of resolving https://github.com/open-quantum-safe/liboqs/issues/1456 (controlled error-exits throughout the library).
I think it is fine to fully get rid of abort()/exit() even if subsequence is dereference NULL pointer.
If this is meant to state that the code should rather dump core than abort()s in a controlled manner (?), I'm not sure this is a good idea. I'd rather suggest tackling that as part of resolving #1456 (controlled error-exits throughout the library).
This is between a rock and a hard place. In our in-house version, after add all proper NULL checks and memory releases, and remove all abort/exit, it can work without crash at least. Basically it is not usable to have abort/exit in a library. There is no single abort/exit in openssl library code except test code. Anyway, I will add abort() back.
@SWilson4 @baentsch Addressed the latest comments about abort() in 1e4201b Please let me know if we are good to merge the PR. Thanks
@SWilson4 Thanks for your time and all the comments.
@baentsch @dstebila Is there anything you would like me to address in this PR? Thanks
@baentsch @dstebila Is there anything you would like me to address in this PR? Thanks
Apologies @songlingatpan -- this PR is just too large for me to take a "quick look" to approve and I have a large list of other things right now demanding urgent attention. Tagging @open-quantum-safe/liboqs-committers to help out reviewing and approving this.
@songlingatpan most of this LGTM and improves the software (malloc->OQS_MEM_malloc & additional NULL checks). Some things I don't understand (see single comments). What I'm really unsure about is the apparent removal of (many) user-visible allocation error warnings for silent abort()s. Isn't that a step in the wrong direction? Yes, the abort() is a stupid pattern to begin with, but isn't it reasonable to let users at least know why the lib/app suddenly stopped?
@baentsch I understand your point.
There was some discussion with @SWilson4 about removing the checked malloc logic. If you review the OpenSSL code, it doesn’t use the checked malloc/abort() pattern. In general, a well-designed library should avoid including abort() or exit() calls, as these decisions should be left to the caller. Ideally, malloc failures should be handled consistently across the codebase by returning error codes, rather than aborting or exiting.
Since there is no fprintf logging for non-checked malloc failures, I didn’t add stderr output for abort(). I recommend fully removing abort() in the 0.12.0 release.
Our in-house version is much more stable than 0.11.0 and target for production readiness compared to the official liboqs repository. However, I’m facing challenges in integrating our in-house code changes back into the official liboqs due to abort(), and so on.
Anyway I can add fprintf back for abort.
Anyway I can add fprintf back for abort.
That'd be nice. And for the avoidance of doubt: My ask (to achieve this) is basically to reduce the amount of new code (by retaining "OQS_MEM_checked_malloc" calls), not to add more new code (fprintf statements, I guess)
Anyway I can add fprintf back for abort.
That'd be nice. And for the avoidance of doubt: My ask (to achieve this) is basically to reduce the amount of new code (by retaining "OQS_MEM_checked_malloc" calls), not to add more new code (fprintf statements, I guess)
@baentsch We can remove the checked malloc for now. In the next step, we can remove abort and add return proper error and fix potential memory leaks. If you target to the production, eventually we need to remove the checked malloc and do proper error handling and release resources, etc.
Anyway I can add fprintf back for abort.
That'd be nice. And for the avoidance of doubt: My ask (to achieve this) is basically to reduce the amount of new code (by retaining "OQS_MEM_checked_malloc" calls), not to add more new code (fprintf statements, I guess)
@baentsch We can remove the checked malloc for now. In the next step, we can remove abort and add return proper error and fix potential memory leaks. If you target to the production, eventually we need to remove the checked malloc and do proper error handling and release resources, etc.
Please let me know if you still want to keep the checked malloc. Still, if you check openssl code, there is no checked malloc. We need to remove it eventually.
@baentsch Please review the PR.
Thanks
Replaced malloc, calloc, strdup, and free with the OpenSSL memory allocator to enable the caller to customize memory allocator, addressing issue #1823. This PR does not change the existing behavior or algorithms.