open-quantum-safe / liboqs

C library for prototyping and experimenting with quantum-resistant cryptography
https://openquantumsafe.org/

LMS multi-tree signing crashes after first subtree is exhausted. #1966

Open cr-marcstevens opened 3 weeks ago

cr-marcstevens commented 3 weeks ago

During benchmarking of stateful signatures, the only two enabled LMS multi-tree algorithms (LMS_SHA256_H5_W8_H5_W8 and LMS_SHA256_H10_W4_H5_W8) both crash during the signing benchmark loop.

The crash appears to occur only when the benchmark duration is long enough to exhaust the first subtree. For example, for LMS_SHA256_H10_W4_H5_W8, generating 31 signatures works, but generating 32 signatures crashes.

Command lines used and their output:

$ mkdir build; cd build
$ cmake -G "Unix Makefiles" -DOQS_ENABLE_SIG_STFL_XMSS=ON -DOQS_ENABLE_SIG_STFL_LMS=ON -DOQS_HAZARDOUS_EXPERIMENTAL_ENABLE_SIG_STFL_KEY_SIG_GEN=ON -DOQS_DIST_BUILD=OFF -DCMAKE_C_COMPILER=clang -DCMAKE_BUILD_TYPE=Debug -DUSE_SANITIZER=Address ..
$ make -j40

$ tests/speed_sig_stfl -d 49 -i LMS_SHA256_H10_W4_H5_W8
Configuration info
==================
Target platform:  x86_64-Linux-6.5.6-300.fc39.x86_64
Compiler:         clang (17.0.6 (Fedora 17.0.6-1.fc39))
Compile options:  [-march=native;-Wa,--noexecstack;-g3;-fno-omit-frame-pointer;-fno-optimize-sibling-calls;-fsanitize-address-use-after-scope;-fsanitize=address;-Wbad-function-cast;-Wcast-qual;-Wnarrowing;-Wconversion]
OQS version:      0.11.1-dev
Git commit:       71324732640cf4451cab425ac5d80ed9fd41a3af (+ local modifications)
OpenSSL enabled:  Yes (OpenSSL 3.1.1 30 May 2023)
AES:              NI
SHA-2:            OpenSSL
SHA-3:            C
OQS build flags:  OQS_LIBJADE_BUILD USE_SANITIZER=Address OQS_OPT_TARGET=auto CMAKE_BUILD_TYPE=Debug
CPU exts compile-time:  ADX AES AVX AVX2 AVX512 BMI1 BMI2 PCLMULQDQ POPCNT SSE SSE2 SSE3

Speed test
==========
Started at 2024-10-29 10:18:53
Operation                            | Iterations | Total time (s) | Time (us): mean | pop. stdev | CPU cycles: mean          | pop. stdev
------------------------------------ | ----------:| --------------:| ---------------:| ----------:| -------------------------:| ----------:
LMS_SHA256_H10_W4_H5_W8              |            |                |                 |            |                           |
keypair                              |          1 |          1.159 |     1158788.000 |      0.000 |                2896982282 |          0
sign                                 |         31 |         49.238 |     1588318.581 |  84141.762 |                3832268891 |  715795960
verify                               |      10136 |         49.002 |        4834.500 |   4113.029 |                  12086165 |   10282590
public key bytes: 60, secret key bytes: 64, signature bytes: 3860
Ended at 2024-10-29 10:20:34

$ tests/speed_sig_stfl -d 50 -i LMS_SHA256_H10_W4_H5_W8
Configuration info
==================
Target platform:  x86_64-Linux-6.5.6-300.fc39.x86_64
Compiler:         clang (17.0.6 (Fedora 17.0.6-1.fc39))
Compile options:  [-march=native;-Wa,--noexecstack;-g3;-fno-omit-frame-pointer;-fno-optimize-sibling-calls;-fsanitize-address-use-after-scope;-fsanitize=address;-Wbad-function-cast;-Wcast-qual;-Wnarrowing;-Wconversion]
OQS version:      0.11.1-dev
Git commit:       71324732640cf4451cab425ac5d80ed9fd41a3af (+ local modifications)
OpenSSL enabled:  Yes (OpenSSL 3.1.1 30 May 2023)
AES:              NI
SHA-2:            OpenSSL
SHA-3:            C
OQS build flags:  OQS_LIBJADE_BUILD USE_SANITIZER=Address OQS_OPT_TARGET=auto CMAKE_BUILD_TYPE=Debug
CPU exts compile-time:  ADX AES AVX AVX2 AVX512 BMI1 BMI2 PCLMULQDQ POPCNT SSE SSE2 SSE3

Speed test
==========
Started at 2024-10-29 10:13:07
Operation                            | Iterations | Total time (s) | Time (us): mean | pop. stdev | CPU cycles: mean          | pop. stdev
------------------------------------ | ----------:| --------------:| ---------------:| ----------:| -------------------------:| ----------:
LMS_SHA256_H10_W4_H5_W8              |            |                |                 |            |                           |
keypair                              |          1 |          1.159 |     1159033.000 |      0.000 |                2897595644 |          0
AddressSanitizer:DEADLYSIGNAL
=================================================================
==2504729==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000018 (pc 0x000000989c80 bp 0x7ffe3e40f800 sp 0x7ffe3e40f160 T0)
==2504729==The signal is caused by a READ memory access.
==2504729==Hint: address points to the zero page.
    #0 0x989c80 in OQS_LMS_NAMESPACE_hss_generate_signature /export/scratch1/home/stevens/GIT/pqc_benchmarking/liboqs/src/sig_stfl/lms/external/hss_sign.c:652:44
    #1 0x98d295 in OQS_LMS_NAMESPACE_hss_sign_init /export/scratch1/home/stevens/GIT/pqc_benchmarking/liboqs/src/sig_stfl/lms/external/hss_sign_inc.c:82:20
    #2 0x8487fc in oqs_sig_stfl_lms_sign /export/scratch1/home/stevens/GIT/pqc_benchmarking/liboqs/src/sig_stfl/lms/sig_stfl_lms_functions.c:591:8
    #3 0x8480fe in OQS_SIG_STFL_alg_lms_sign /export/scratch1/home/stevens/GIT/pqc_benchmarking/liboqs/src/sig_stfl/lms/sig_stfl_lms_functions.c:92:6
    #4 0x50b8d1 in OQS_SIG_STFL_sign /export/scratch1/home/stevens/GIT/pqc_benchmarking/liboqs/src/sig_stfl/sig_stfl.c:1024:42
    #5 0x5084e0 in sig_speed_wrapper /export/scratch1/home/stevens/GIT/pqc_benchmarking/liboqs/tests/speed_sig_stfl.c:114:3
    #6 0x50734b in main /export/scratch1/home/stevens/GIT/pqc_benchmarking/liboqs/tests/speed_sig_stfl.c:249:8
    #7 0x7f8aafc46149 in __libc_start_call_main (/lib64/libc.so.6+0x28149) (BuildId: 788cdd41a15985bf8e0a48d213a46e07d58822df)
    #8 0x7f8aafc4620a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2820a) (BuildId: 788cdd41a15985bf8e0a48d213a46e07d58822df)
    #9 0x42a4c4 in _start (/export/scratch1/home/stevens/GIT/pqc_benchmarking/liboqs/build/tests/speed_sig_stfl+0x42a4c4) (BuildId: 454b0b10237bae477b14dc9bfa56067b6a6cee78)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /export/scratch1/home/stevens/GIT/pqc_benchmarking/liboqs/src/sig_stfl/lms/external/hss_sign.c:652:44 in OQS_LMS_NAMESPACE_hss_generate_signature
==2504729==ABORTING

Note that the local changes to the repo are minimal and do not affect this bug:

  1. Enable additional LMS algorithms
  2. Add a break statement inside the keygen benchmark loop so that only one key is generated before the signing benchmark starts. I used this to check whether the bug is triggered by calling keygen many times; it still appears when keygen is called only once. I kept the change because skipping the long keygen benchmark speeds up debugging.
    $ git diff
    diff --git a/src/oqsconfig.h.cmake b/src/oqsconfig.h.cmake
    index dae1baba..1e841f13 100644
    --- a/src/oqsconfig.h.cmake
    +++ b/src/oqsconfig.h.cmake
    @@ -304,3 +304,7 @@
    #cmakedefine OQS_ALLOW_STFL_KEY_AND_SIG_GEN 1
    #cmakedefine OQS_ALLOW_XMSS_KEY_AND_SIG_GEN 1
    #cmakedefine OQS_ALLOW_LMS_KEY_AND_SIG_GEN 1
    +#cmakedefine OQS_ENABLE_SIG_STFL_lms_sha256_h20_w1 1
    +#cmakedefine OQS_ENABLE_SIG_STFL_lms_sha256_h20_w2 1
    +#cmakedefine OQS_ENABLE_SIG_STFL_lms_sha256_h20_w4 1
    +#cmakedefine OQS_ENABLE_SIG_STFL_lms_sha256_h20_w8 1
    diff --git a/tests/speed_sig_stfl.c b/tests/speed_sig_stfl.c
    index ac09ca7b..ac57655c 100644
    --- a/tests/speed_sig_stfl.c
    +++ b/tests/speed_sig_stfl.c
    @@ -105,6 +105,7 @@ static OQS_STATUS sig_speed_wrapper(const char *method_name, uint64_t duration,
                                printf("keygen error. Exiting.\n");
                                exit(-1);
                        }
    +                       break;
                        secret_key = reset_secret_key(sig, secret_key);
                })
                // benchmark sign: need to generate new secret key after available signatures have been exhausted

Also note that the virtual memory allocated when running LMS is huge:

top - 10:04:41 up 145 days, 15:38,  3 users,  load average: 0.11, 0.39, 0.83
Tasks: 823 total,   2 running, 820 sleeping,   1 stopped,   0 zombie
%Cpu(s):  1.3 us,  0.0 sy,  0.0 ni, 98.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem : 1546757.+total, 1476665.+free,  16781.3 used,  61316.5 buff/cache
MiB Swap:  16384.0 total,  16384.0 free,      0.0 used. 1529976.+avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                       
2504709 stevens   20   0   20.0t 469120   6400 R  99.7   0.0   0:05.79 speed_sig_stfl

ashman-p commented 3 weeks ago

@cr-marcstevens I will look into this.

ashman-p commented 3 days ago

Addressed by PR #1998.

% tests/speed_sig_stfl -d 49 -i LMS_SHA256_H10_W4_H5_W8
speed_sig_stfl(51199,0x202cfb840) malloc: nano zone abandoned due to inability to reserve vm space.
Configuration info
==================
Target platform:  arm64-Darwin-24.1.0
Compiler:         clang (16.0.0 (clang-1600.0.26.4))
Compile options:  [-mcpu=native;-Wa,--noexecstack;-g3;-fno-omit-frame-pointer;-fno-optimize-sibling-calls;-fsanitize-address-use-after-scope;-fsanitize=address;-Wbad-function-cast;-Wcast-qual;-Wnarrowing;-Wconversion]
OQS version:      0.11.1-dev
Git commit:       507d03009cecc30a4aed78279d08ab7f8de813f8 (+ local modifications)
OpenSSL enabled:  Yes (OpenSSL 1.1.1w  11 Sep 2023)
AES:              OpenSSL
SHA-2:            OpenSSL
SHA-3:            C
OQS build flags:  OQS_LIBJADE_BUILD USE_SANITIZER=Address OQS_OPT_TARGET=auto CMAKE_BUILD_TYPE=Debug 
CPU exts compile-time:  AES SHA2 SHA3 NEON

Speed test
==========
Started at 2024-11-19 09:33:48
Operation                            | Iterations | Total time (s) | Time (us): mean | pop. stdev | High-prec time (ns): mean | pop. stdev
------------------------------------ | ----------:| --------------:| ---------------:| ----------:| -------------------------:| ----------:
LMS_SHA256_H10_W4_H5_W8              |            |                |                 |            |                           |           
keypair                              |         94 |         49.370 |      525210.798 |   4673.927 |                 525210599 |    4673832
sign                                 |         61 |         49.353 |      809066.164 |  43784.960 |                 809065963 |   43784936
verify                               |      23965 |         49.002 |        2044.711 |    970.274 |                   2044650 |     970268
public key bytes: 60, secret key bytes: 64, signature bytes: 3860
Ended at 2024-11-19 09:36:16

% tests/speed_sig_stfl -d 49 -i LMS_SHA256_H5_W8_H5_W8
speed_sig_stfl(54425,0x202cfb840) malloc: nano zone abandoned due to inability to reserve vm space.
Configuration info
==================
Target platform:  arm64-Darwin-24.1.0
Compiler:         clang (16.0.0 (clang-1600.0.26.4))
Compile options:  [-mcpu=native;-Wa,--noexecstack;-g3;-fno-omit-frame-pointer;-fno-optimize-sibling-calls;-fsanitize-address-use-after-scope;-fsanitize=address;-Wbad-function-cast;-Wcast-qual;-Wnarrowing;-Wconversion]
OQS version:      0.11.1-dev
Git commit:       507d03009cecc30a4aed78279d08ab7f8de813f8 (+ local modifications)
OpenSSL enabled:  Yes (OpenSSL 1.1.1w  11 Sep 2023)
AES:              OpenSSL
SHA-2:            OpenSSL
SHA-3:            C
OQS build flags:  OQS_LIBJADE_BUILD USE_SANITIZER=Address OQS_OPT_TARGET=auto CMAKE_BUILD_TYPE=Debug 
CPU exts compile-time:  AES SHA2 SHA3 NEON

Speed test
==========
Started at 2024-11-19 09:43:05
Operation                            | Iterations | Total time (s) | Time (us): mean | pop. stdev | High-prec time (ns): mean | pop. stdev
------------------------------------ | ----------:| --------------:| ---------------:| ----------:| -------------------------:| ----------:
LMS_SHA256_H5_W8_H5_W8               |            |                |                 |            |                           |           
keypair                              |        406 |         49.003 |      120697.039 |   2169.650 |                 120696921 |    2169603
sign                                 |        130 |         49.018 |      377064.923 |  35585.543 |                 377064838 |   35585534
verify                               |      12401 |         49.001 |        3951.385 |   1247.185 |                   3951331 |    1247183
public key bytes: 60, secret key bytes: 64, signature bytes: 2644
Ended at 2024-11-19 09:45:33

% tests/speed_sig_stfl -d 49 -i LMS_SHA256_H5_W8_H5_W8
Configuration info
==================
Target platform:  arm64-Darwin-24.1.0
Compiler:         clang (16.0.0 (clang-1600.0.26.4))
Compile options:  [-march=armv8-a+crypto;-Wall;-Wextra;-Wpedantic;-Wno-unused-command-line-argument;-Wa,--noexecstack;-O3;-fomit-frame-pointer;-Wbad-function-cast;-Wcast-qual;-Wnarrowing;-Wconversion]
OQS version:      0.11.1-dev
Git commit:       764f62d9add2742449d4b94113e514f4a72c3364
OpenSSL enabled:  Yes (OpenSSL 1.1.1w  11 Sep 2023)
AES:              OpenSSL
SHA-2:            OpenSSL
SHA-3:            C
OQS build flags:  OQS_DIST_BUILD OQS_LIBJADE_BUILD OQS_OPT_TARGET=generic CMAKE_BUILD_TYPE=Release 
CPU exts active:  AES SHA2 SHA3 NEON

Speed test
==========
Started at 2024-11-19 12:23:29
Operation                            | Iterations | Total time (s) | Time (us): mean | pop. stdev | High-prec time (ns): mean | pop. stdev
------------------------------------ | ----------:| --------------:| ---------------:| ----------:| -------------------------:| ----------:
LMS_SHA256_H5_W8_H5_W8               |            |                |                 |            |                           |           
keypair                              |       1574 |         49.030 |       31150.048 |    632.895 |                  31150021 |     632896
sign                                 |        487 |         49.061 |      100741.156 |   9999.512 |                 100741132 |    9999504
verify                               |      55355 |         49.001 |         885.209 |     15.356 |                    885175 |      15356