open-quantum-safe / oqs-provider

OpenSSL 3 provider containing post-quantum algorithms
https://openquantumsafe.org
MIT License

Unable to run tests on OSX #132

Closed mouse07410 closed 1 year ago

mouse07410 commented 1 year ago

Updated to make some details clear, because the previous overly generic description invited useless, equally generic observations such as "could not replicate on Windows".

Describe the bug

ctest crashes with SIGTRAP.

To Reproduce

Steps to reproduce the behavior:

  1. Build or install OpenSSL system-wide, e.g., in /opt/local/libexec/openssl3. For this test I used OpenSSL 3.1.0, installed as a binary via MacPorts.
  2. Build and install liboqs system-wide. I used liboqs master and installed it under /opt/local: /opt/local/lib for the shared library, /opt/local/include for the header files.
  3. Clone, build, and install this provider (don't forget to edit openssl.cnf as appropriate).
  4. export OPENSSL_APP=/opt/local/libexec/openssl3/bin/openssl and export OPENSSL_MODULES=/opt/local/libexec/openssl3/lib/ossl-modules
  5. Optional? To make the environment closer to mine, install pkcs11-provider and the GOST engine system-wide, and adjust openssl.cnf to point at them.
  6. Further complication: install oqs-provider and make it available system-wide by adding it to openssl.cnf.
  7. Go to _build and run ctest --output-on-failure
  8. Observe the error report.
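
Steps 4 and 7 can be sketched as a small script. This is only an illustration of the steps above: the names OPENSSL_PREFIX, setup_env, and run_tests are made up here, and the prefix is the MacPorts location from step 1.

```shell
#!/bin/bash
# Sketch of steps 4 and 7. OPENSSL_PREFIX, setup_env and run_tests are
# illustrative names; the prefix is the MacPorts location from step 1.
OPENSSL_PREFIX=/opt/local/libexec/openssl3

setup_env() {
    # Step 4: point the test scripts at the system-wide OpenSSL.
    export OPENSSL_APP="$OPENSSL_PREFIX/bin/openssl"
    export OPENSSL_MODULES="$OPENSSL_PREFIX/lib/ossl-modules"
}

run_tests() {
    # Step 7: run the test suite from the build directory.
    # The subshell keeps the caller's working directory unchanged.
    (cd _build && ctest --output-on-failure)
}

setup_env
echo "OPENSSL_APP=$OPENSSL_APP"
echo "OPENSSL_MODULES=$OPENSSL_MODULES"
# Call run_tests from the top of a built oqs-provider checkout.
```

run_tests is deliberately not invoked automatically, since it only makes sense inside a checkout that already has a _build directory.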

Expected behavior

Tests passing.

Screenshots

Screenshot 2023-03-27 at 3 49 24 PM

Crash report

Translated Report (Full Report Below)
-------------------------------------

Process:               oqs_test_kems [17508]
Path:                  /Users/USER/*/oqs_test_kems
Identifier:            oqs_test_kems
Version:               ???
Code Type:             ARM-64 (Native)
Parent Process:        ctest [17500]
Responsible:           Terminal [983]
User ID:               501

Date/Time:             2023-03-27 15:39:40.6519 -0400
OS Version:            macOS 13.2.1 (22D68)
Report Version:        12
Anonymous UUID:        161C054B-E964-CDD3-5EBC-5A9DBE3E2AE2

Time Awake Since Boot: 66000 seconds

System Integrity Protection: enabled

Crashed Thread:        0  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_BREAKPOINT (SIGTRAP)
Exception Codes:       0x0000000000000001, 0x00000001a6283108

Termination Reason:    Namespace SIGNAL, Code 5 Trace/BPT trap: 5
Terminating Process:   exc handler [17508]

Application Specific Information:
BUG IN CLIENT OF LIBPLATFORM: Trying to recursively lock an os_once_t
Abort Cause 259

Thread 0 Crashed::  Dispatch queue: com.apple.main-thread
0   libsystem_platform.dylib               0x1a6283108 _os_once_gate_recursive_abort + 36
1   libsystem_platform.dylib               0x1a627f710 _os_once_gate_wait + 348
2   libsystem_pthread.dylib                0x1a624dd84 pthread_once + 100
3   libcrypto.3.dylib                      0x1012bdfcc CRYPTO_THREAD_run_once + 12
4   libcrypto.3.dylib                      0x1012d0ea4 ossl_obj_add_object + 236
5   gostprov.dylib                         0x1015bf5b4 populate_gost_engine + 116
6   gostprov.dylib                         0x1015bd478 OSSL_provider_init + 116
7   libcrypto.3.dylib                      0x1012bbddc provider_activate + 260
8   libcrypto.3.dylib                      0x1012bbc48 ossl_provider_activate + 56
9   libcrypto.3.dylib                      0x1012ba93c provider_conf_init + 608
10  libcrypto.3.dylib                      0x101212c4c CONF_modules_load + 856
11  libcrypto.3.dylib                      0x101212ee8 CONF_modules_load_file_ex + 120
12  libcrypto.3.dylib                      0x101213738 ossl_config_int + 68
13  libcrypto.3.dylib                      0x1012b2400 ossl_init_config_ossl_ + 16
14  libsystem_pthread.dylib                0x1a624ddec __pthread_once_handler + 76
15  libsystem_platform.dylib               0x1a627d7e0 _os_once_callout + 32
16  libsystem_pthread.dylib                0x1a624dd84 pthread_once + 100
17  libcrypto.3.dylib                      0x1012bdfcc CRYPTO_THREAD_run_once + 12
18  libcrypto.3.dylib                      0x1012b2208 OPENSSL_init_crypto + 1104
19  libcrypto.3.dylib                      0x1012d1098 obj_lock_initialise_ossl_ + 20
20  libsystem_pthread.dylib                0x1a624ddec __pthread_once_handler + 76
21  libsystem_platform.dylib               0x1a627d7e0 _os_once_callout + 32
22  libsystem_pthread.dylib                0x1a624dd84 pthread_once + 100
23  libcrypto.3.dylib                      0x1012bdfcc CRYPTO_THREAD_run_once + 12
24  libcrypto.3.dylib                      0x1012d0408 OBJ_sn2nid + 112
25  libcrypto.3.dylib                      0x1012d02f4 OBJ_txt2obj + 216
26  libcrypto.3.dylib                      0x1012d0944 OBJ_txt2nid + 20
27  libcrypto.3.dylib                      0x1012bd048 core_obj_create + 36
28  oqsprovider.0.5.0-dev.dylib            0x100f73678 OSSL_provider_init + 292
29  libcrypto.3.dylib                      0x1012bbddc provider_activate + 260
30  libcrypto.3.dylib                      0x1012bbc48 ossl_provider_activate + 56
31  libcrypto.3.dylib                      0x1012ba93c provider_conf_init + 608
32  libcrypto.3.dylib                      0x101212c4c CONF_modules_load + 856
33  libcrypto.3.dylib                      0x101212ee8 CONF_modules_load_file_ex + 120
34  libcrypto.3.dylib                      0x1012af1a4 OSSL_LIB_CTX_load_config + 20
35  oqs_test_kems                          0x100de7420 main + 80
36  dyld                                   0x1a5f27e50 start + 2544

Thread 0 crashed with ARM Thread State (64-bit):
    x0: 0x0000000000000103   x1: 0x000000016f019b90   x2: 0x00000001a624dda0   x3: 0x0000000000000103
    x4: 0x000000000000000a   x5: 0x0000000024200000   x6: 0x0000000000000000   x7: 0x0000000000000500
    x8: 0x0000000000000103   x9: 0x0000000000000103  x10: 0x0000000000000103  x11: 0x0000600000fb8000
   x12: 0x0000000000000010  x13: 0x00000000fffffcee  x14: 0x00000000000007fb  x15: 0x00000000a4188ffb
   x16: 0x00000001a627d760  x17: 0x00000002066400a0  x18: 0x0000000000000000  x19: 0x0000000101438d58
   x20: 0x0000000000000103  x21: 0x00000001a624dda0  x22: 0x000000016f019b90  x23: 0x0000000000000103
   x24: 0x0000000000000103  x25: 0x0000000000000000  x26: 0x0000000000000002  x27: 0x0000000000000002
   x28: 0x0000600000fa4000   fp: 0x000000016f019b80   lr: 0x00000001a627f710
    sp: 0x000000016f019b50   pc: 0x00000001a6283108 cpsr: 0x60001000
   far: 0x00000001ff2bc0b8  esr: 0xf2000001 (Breakpoint) brk 1
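
To see at a glance which loadable modules take part in a backtrace like the one above (here gostprov.dylib and oqsprovider.0.5.0-dev.dylib alongside libcrypto), the frames can be filtered with a short helper. This is a sketch that assumes the frame layout shown in this report (frame number, image name, address, symbol); images_in_trace is a made-up name, and image names containing spaces would not be handled.

```shell
#!/bin/bash
# images_in_trace FILE: print each distinct binary image that appears in the
# numbered frames of a macOS crash-report backtrace saved in FILE.
# Assumes the layout "<frame#> <image> <0xaddress> <symbol> + <offset>".
images_in_trace() {
    awk '$1 ~ /^[0-9]+$/ && $3 ~ /^0x/ { print $2 }' "$1" | sort -u
}

# Usage (hypothetical file name): images_in_trace crash-report.txt
```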

Environment (please complete the following information):

Additional context

This is on a MacBook Pro with an Apple Silicon M2 chip. A similar problem occurs on an Intel-based iMac (using the same process as above).

Note: commenting out, e.g., GOST provider in openssl.cnf did not help.

$ openssl version
OpenSSL 3.1.0 14 Mar 2023 (Library: OpenSSL 3.1.0 14 Mar 2023)
$ openssl list -providers
Providers:
  base
    name: OpenSSL Base Provider
    version: 3.1.0
    status: active
  default
    name: OpenSSL Default Provider
    version: 3.1.0
    status: active
  gost
    name: OpenSSL GOST Provider
    status: active
  legacy
    name: OpenSSL Legacy Provider
    version: 3.1.0
    status: active
  oqs
    name: OpenSSL OQS Provider
    version: 0.5.0-dev
    status: active
  pkcs11
    name: PKCS#11 Provider
    version: 3.1.0
    status: active
$ 
mouse07410 commented 1 year ago

Here's my patch for the scripts, which includes addressing some of the ShellCheck warnings:

diff --git a/scripts/oqsprovider-ca.sh b/scripts/oqsprovider-ca.sh
index 1de9b10..68e6400 100755
--- a/scripts/oqsprovider-ca.sh
+++ b/scripts/oqsprovider-ca.sh
@@ -17,11 +17,6 @@ if [ -z "$OPENSSL_MODULES" ]; then
     exit 1
 fi

-if [ -z "$LD_LIBRARY_PATH" ]; then
-    echo "LD_LIBRARY_PATH env var not set. Exiting."
-    exit 1
-fi
-
 #rm -rf tmp
 mkdir -p tmp && cd tmp
 rm -rf demoCA && mkdir -p demoCA/newcerts
diff --git a/scripts/oqsprovider-certgen.sh b/scripts/oqsprovider-certgen.sh
index c4d0907..d6c8b06 100755
--- a/scripts/oqsprovider-certgen.sh
+++ b/scripts/oqsprovider-certgen.sh
@@ -17,12 +17,7 @@ if [ -z "$OPENSSL_MODULES" ]; then
     exit 1
 fi

-if [ -z "$LD_LIBRARY_PATH" ]; then
-    echo "LD_LIBRARY_PATH env var not set. Exiting."
-    exit 1
-fi
-
-#rm -rf tmp
+rm -rf tmp/*
 mkdir -p tmp
 $OPENSSL_APP req -x509 -new -newkey $1 -keyout tmp/$1_CA.key -out tmp/$1_CA.crt -nodes -subj "/CN=oqstest CA" -days 365 -provider oqsprovider -provider default && \
 $OPENSSL_APP genpkey -algorithm $1 -out tmp/$1_srv.key -provider oqsprovider -provider default && \
diff --git a/scripts/oqsprovider-certverify.sh b/scripts/oqsprovider-certverify.sh
index 0d571ce..665181f 100755
--- a/scripts/oqsprovider-certverify.sh
+++ b/scripts/oqsprovider-certverify.sh
@@ -17,11 +17,6 @@ if [ -z "$OPENSSL_MODULES" ]; then
     exit 1
 fi

-if [ -z "$LD_LIBRARY_PATH" ]; then
-    echo "LD_LIBRARY_PATH env var not set. Exiting."
-    exit 1
-fi
-
 # check that CSR can be output OK

 $OPENSSL_APP req -text -in tmp/$1_srv.csr -noout -provider oqsprovider -provider default 2>&1 | grep Error
diff --git a/scripts/oqsprovider-cmssign.sh b/scripts/oqsprovider-cmssign.sh
index 2408dd3..f979903 100755
--- a/scripts/oqsprovider-cmssign.sh
+++ b/scripts/oqsprovider-cmssign.sh
@@ -28,11 +28,6 @@ if [ -z "$OPENSSL_MODULES" ]; then
     exit 1
 fi

-if [ -z "$LD_LIBRARY_PATH" ]; then
-    echo "LD_LIBRARY_PATH env var not set. Exiting."
-    exit 1
-fi
-
 # Assumes certgen has been run before: Quick check

 if [ -f tmp/$1_CA.crt ]; then
diff --git a/scripts/oqsprovider-cmsverify.sh b/scripts/oqsprovider-cmsverify.sh
index 85d2935..c13531b 100755
--- a/scripts/oqsprovider-cmsverify.sh
+++ b/scripts/oqsprovider-cmsverify.sh
@@ -21,11 +21,6 @@ if [ -z "$OPENSSL_MODULES" ]; then
     exit 1
 fi

-if [ -z "$LD_LIBRARY_PATH" ]; then
-    echo "LD_LIBRARY_PATH env var not set. Exiting."
-    exit 1
-fi
-
 openssl_version=$($OPENSSL_APP version)

 if [[ "$openssl_version" == "OpenSSL 3.0."* ]]; then
diff --git a/scripts/runtests.sh b/scripts/runtests.sh
index f360d0e..570bb58 100755
--- a/scripts/runtests.sh
+++ b/scripts/runtests.sh
@@ -5,17 +5,17 @@ rv=0
 provider2openssl() {
     echo
     echo "Testing oqsprovider->oqs-openssl interop for $1:"
-    $OQS_PROVIDER_TESTSCRIPTS/oqsprovider-certgen.sh $1 && $OQS_PROVIDER_TESTSCRIPTS/oqsprovider-cmssign.sh $1 sha3-384 && $OQS_PROVIDER_TESTSCRIPTS/oqs-openssl-certverify.sh $1 && $OQS_PROVIDER_TESTSCRIPTS/oqs-openssl-cmsverify.sh $1
+    "$OQS_PROVIDER_TESTSCRIPTS"/oqsprovider-certgen.sh $1 && "$OQS_PROVIDER_TESTSCRIPTS"/oqsprovider-cmssign.sh $1 sha3-384 && "$OQS_PROVIDER_TESTSCRIPTS"/oqs-openssl-certverify.sh $1 && "$OQS_PROVIDER_TESTSCRIPTS"/oqs-openssl-cmsverify.sh $1
 }

 openssl2provider() {
     echo
     echo "Testing oqs-openssl->oqsprovider interop for $1:"
-    $OQS_PROVIDER_TESTSCRIPTS/oqs-openssl-certgen.sh $1 && $OQS_PROVIDER_TESTSCRIPTS/oqs-openssl-cmssign.sh $1 && $OQS_PROVIDER_TESTSCRIPTS/oqsprovider-certverify.sh $1 && $OQS_PROVIDER_TESTSCRIPTS/oqsprovider-cmsverify.sh $1
+    "$OQS_PROVIDER_TESTSCRIPTS"/oqs-openssl-certgen.sh $1 && "$OQS_PROVIDER_TESTSCRIPTS"/oqs-openssl-cmssign.sh $1 && "$OQS_PROVIDER_TESTSCRIPTS"/oqsprovider-certverify.sh $1 && "$OQS_PROVIDER_TESTSCRIPTS"/oqsprovider-cmsverify.sh $1
 }

 localalgtest() {
-    $OQS_PROVIDER_TESTSCRIPTS/oqsprovider-certgen.sh $1 >> interop.log 2>&1 && $OQS_PROVIDER_TESTSCRIPTS/oqsprovider-certverify.sh $1 >> interop.log 2>&1 && $OQS_PROVIDER_TESTSCRIPTS/oqsprovider-cmssign.sh $1 >> interop.log 2>&1 &&  $OQS_PROVIDER_TESTSCRIPTS/oqsprovider-ca.sh $1 >> interop.log 2>&1
+    "$OQS_PROVIDER_TESTSCRIPTS"/oqsprovider-certgen.sh $1 >> interop.log 2>&1 && "$OQS_PROVIDER_TESTSCRIPTS"/oqsprovider-certverify.sh $1 >> interop.log 2>&1 && "$OQS_PROVIDER_TESTSCRIPTS"/oqsprovider-cmssign.sh $1 >> interop.log 2>&1 &&  "$OQS_PROVIDER_TESTSCRIPTS"/oqsprovider-ca.sh $1 >> interop.log 2>&1
     if [ $? -ne 0 ]; then
         echo "localalgtest $1 failed. Exiting.".
         cat interop.log
@@ -27,7 +27,7 @@ interop() {
     echo ".\c"
     # check if we want to run this algorithm:
     if [ ! -z "$OQS_SKIP_TESTS" ]; then
-        GREPTEST=$(echo $OQS_SKIP_TESTS | sed "s/\,/\\\|/g")
+        GREPTEST=$(echo "$OQS_SKIP_TESTS" | sed "s/\,/\\\|/g")
         if echo $1 | grep -q "$GREPTEST"; then
             echo "Not testing $1" >> interop.log
             return
@@ -35,7 +35,7 @@ interop() {
     fi

     # Check whether algorithm is supported at all:
-    $OPENSSL_APP list -signature-algorithms -provider oqsprovider | grep $1 > /dev/null 2>&1
+    "$OPENSSL_APP" list -signature-algorithms -provider oqsprovider | grep $1 > /dev/null 2>&1
     if [ $? -ne 1 ]; then
    if [ -z "$LOCALTESTONLY" ]; then
             provider2openssl $1 >> interop.log 2>&1 && openssl2provider $1 >> interop.log 2>&1
@@ -57,24 +57,24 @@ fi

 if [ ! -z "$OPENSSL_INSTALL" ]; then
     # trying to set config variables suitably for pre-existing OpenSSL installation
-    if [ -f $OPENSSL_INSTALL/bin/openssl ]; then
-        export OPENSSL_APP=$OPENSSL_INSTALL/bin/openssl
+    if [ -f "$OPENSSL_INSTALL"/bin/openssl ] && [ -z "$OPENSSL_APP" ]; then
+        export OPENSSL_APP="$OPENSSL_INSTALL"/bin/openssl
     fi
-    if [ -d $OPENSSL_INSTALL/lib64 ]; then
-        export LD_LIBRARY_PATH=$OPENSSL_INSTALL/lib64
+    if [ -d "$OPENSSL_INSTALL"/lib64 ]; then
+        export LD_LIBRARY_PATH="$OPENSSL_INSTALL"/lib64
     fi
     if [ -f $OPENSSL_INSTALL/ssl/openssl.cnf ]; then
-        export OPENSSL_CONF=$OPENSSL_INSTALL/ssl/openssl.cnf
+        export OPENSSL_CONF="$OPENSSL_INSTALL"/ssl/openssl.cnf
     fi
 else
     if [ -z "$OPENSSL_CONF" ]; then
-        export OPENSSL_CONF=$(pwd)/scripts/openssl-ca.cnf
+        export OPENSSL_CONF="$(pwd)/scripts/openssl-ca.cnf"
     fi
 fi

 if [ -z "$OPENSSL_APP" ]; then
     if [ -f $(pwd)/openssl/apps/openssl ]; then
-        export OPENSSL_APP=$(pwd)/openssl/apps/openssl
+        export OPENSSL_APP="$(pwd)/openssl/apps/openssl"
     else # if no local openssl src directory is found, rely on PATH...
         export OPENSSL_APP=openssl
     fi
@@ -84,8 +84,12 @@ if [ -z "$OPENSSL_MODULES" ]; then
     export OPENSSL_MODULES=$(pwd)/_build/lib
 fi

-if [ -z "$LD_LIBRARY_PATH" ]; then
-    export LD_LIBRARY_PATH=$(pwd)/.local/lib64
+if [[ "$OSTYPE" == "darwin"* ]]; then
+    export LD_LIBRARY_PATH="/opt/local/lib:/usr/local/lib:"
+else
+    if [ -z "$LD_LIBRARY_PATH" ]; then
+        export LD_LIBRARY_PATH=$(pwd)/.local/lib64
+    fi
 fi

 if [ ! -z "$OQS_SKIP_TESTS" ]; then
@@ -159,7 +163,7 @@ echo
 # Run built-in tests:
 # Without removing OPENSSL_CONF ctest hangs... ???
 unset OPENSSL_CONF
-cd _build && ctest $@ && cd ..
+cd _build && ctest "$@" && cd ..

 if [ $? -ne 0 ]; then
    rv=1
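
A note on the Darwin branch in the last runtests.sh hunk: matching $OSTYPE against a pattern such as darwin* needs either bash's [[ ... ]] or a case statement, because single brackets [ ... ] compare strings literally. A portable way to express the same intent is sketched here; default_lib_path is a made-up helper name, and the paths are the ones from the patch.

```shell
#!/bin/bash
# default_lib_path OSTYPE_VALUE: print the LD_LIBRARY_PATH that the patched
# runtests.sh hunk intends for that platform. Helper name is illustrative.
default_lib_path() {
    case "$1" in
        darwin*)
            # macOS: always use the MacPorts and /usr/local locations.
            echo "/opt/local/lib:/usr/local/lib:"
            ;;
        *)
            # Elsewhere: keep a caller-supplied value, otherwise fall back
            # to the in-tree default of the original script.
            echo "${LD_LIBRARY_PATH:-$(pwd)/.local/lib64}"
            ;;
    esac
}

# Show what would be chosen on this machine.
default_lib_path "${OSTYPE:-unknown}"
```

In the script itself this could be used as export LD_LIBRARY_PATH="$(default_lib_path "$OSTYPE")".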
mouse07410 commented 1 year ago

And here's my script to drive build and test for system-wide binary OpenSSL and local sources of OpenSSL master (dev):

#!/bin/bash

# Cleaning up previous builds
make clean
rm -rf tmp/*
rm -f interop.log interop-3.log

# Set env var - flags
export OQSPROV=1
export OQSKM=1
export OQSKEY=1

unset OPENSSL_INSTALL

# Build for local sources of master branch of OpenSSL-3.2+
if [ -d "$HOME/openssl-3" ]; then
    # Export so that cmake and runtests.sh actually see these settings
    export LD_LIBRARY_PATH="$HOME/openssl-3/lib:/usr/local/lib:"
    export OPENSSL_ROOT_DIR="$HOME/openssl-3"
    export OPENSSL_DIR="$OPENSSL_ROOT_DIR"
    export OPENSSL_INSTALL="$OPENSSL_DIR"
    export OPENSSL_APP="$OPENSSL_ROOT_DIR/bin/openssl"
    export OPENSSL="$OPENSSL_APP"
    export OPENSSL_CONF="$OPENSSL_ROOT_DIR/etc/openssl.cnf"
    export OPENSSL_MODULES="$OPENSSL_ROOT_DIR/lib/ossl-modules"
    export OPENSSL_LIB_DIR="$OPENSSL_ROOT_DIR/lib"
    export OPENSSL_INCLUDE_DIR="$OPENSSL_ROOT_DIR/include"
    echo "Building for source-based OpenSSL-3.2.x-dev..."
    env | grep OPENSSL > build-out-s.txt
    echo "" >> build-out-s.txt
    cmake -DCMAKE_BUILD_TYPE=Debug -DOPENSSL_ROOT_DIR="$HOME/src/openssl" -DCMAKE_C_FLAGS="$CFLAGS -I/opt/local/include -L/opt/local/lib" -DCMAKE_VERBOSE_MAKEFILE:BOOL=True -S . -B _build 2>&1 | tee -a build-out-s.txt
    cmake --build _build 2>&1 | tee -a build-out-s.txt
    if [ -x _build/lib/oqsprovider.0.5.0-dev.dylib ]; then
        echo "Successful build for source-based OpenSSL"
        scripts/runtests.sh 2>&1 | tee tests-out-s.txt
    else
        echo "Apparently, building for source-based OpenSSL-3.2.x-dev failed"
        echo ""
    fi
else
    echo ""
    echo "Sources of OpenSSL-3.2.x-dev not found, skipping..."
    echo ""
fi

# Build for Macports-installed binaries of OpenSSL-3.+
if [ -d /opt/local/libexec/openssl3 ]; then
    # Export so that cmake and runtests.sh actually see these settings
    export LD_LIBRARY_PATH="/opt/local/lib:/usr/local/lib:"
    export OPENSSL_ROOT_DIR="/opt/local/libexec/openssl3"
    export OPENSSL_DIR="$OPENSSL_ROOT_DIR"
    export OPENSSL_INSTALL="$OPENSSL_DIR"
    export OPENSSL_APP="$OPENSSL_ROOT_DIR/bin/openssl"
    export OPENSSL="$OPENSSL_APP"
    export OPENSSL_CONF="$OPENSSL_ROOT_DIR/etc/openssl/openssl.cnf"
    export OPENSSL_MODULES="$OPENSSL_ROOT_DIR/lib/ossl-modules"
    export OPENSSL_LIB_DIR="$OPENSSL_ROOT_DIR/lib"
    export OPENSSL_INCLUDE_DIR="$OPENSSL_ROOT_DIR/include"
    env | grep OPENSSL > build-out.txt
    echo "" >> build-out.txt
    echo "Building for Macports-installed OpenSSL-3..."
    cmake -DCMAKE_BUILD_TYPE=Debug -DOPENSSL_ROOT_DIR="/opt/local/libexec/openssl3" -DCMAKE_C_FLAGS="$CFLAGS -I/opt/local/include -L/opt/local/lib" -DCMAKE_VERBOSE_MAKEFILE:BOOL=True -S . -B _build 2>&1 | tee -a build-out.txt
    echo "" >> build-out.txt
    cmake --build _build 2>&1 | tee -a build-out.txt
    if [ -x _build/lib/oqsprovider.0.5.0-dev.dylib ]; then
        echo "Successful build for Macports-installed OpenSSL"
        scripts/runtests.sh 2>&1 | tee tests-out.txt
    else
        echo "Apparently, building for Macports-installed OpenSSL-3 failed"
        echo ""
    fi  
else
    echo ""
    echo "Macports-installed OpenSSL-3 not found, skipping..."
    echo ""
fi

exit 0
baentsch commented 1 year ago

Here's my patch for the scripts, which includes addressing some of the ShellCheck warnings:

Thanks for those proposals. Would you want to do this as a PR (it's your contribution, really) or do you want me to just copy those changes into #140? If the latter, I'd probably add the second script as an OSX-only script.

mouse07410 commented 1 year ago

do you want me to just copy those changes into https://github.com/open-quantum-safe/oqs-provider/pull/140?

Yes please.

If the latter, I'd probably add the second script as an OSX-only script.

I can probably make it usable on both macOS and Linux. The main (only?) difference would be where OpenSSL is installed.
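
If the install location really is the main difference, one possible approach is to probe a list of candidate prefixes and use the first one that contains an openssl binary. This is only a sketch: find_openssl_prefix is a made-up name, the first two candidates in the usage line come from the driver script above, and /usr/local and /usr are assumed Linux locations, not something confirmed in this thread.

```shell
#!/bin/bash
# find_openssl_prefix DIR...: print the first DIR that contains an executable
# bin/openssl and return 0; return 1 if none of the candidates qualifies.
find_openssl_prefix() {
    for dir in "$@"; do
        if [ -x "$dir/bin/openssl" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}

# Usage (the last two candidates are assumptions for Linux):
#   OPENSSL_ROOT_DIR=$(find_openssl_prefix "$HOME/openssl-3" \
#       /opt/local/libexec/openssl3 /usr/local /usr) || echo "no OpenSSL found"
```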

mingw-io commented 1 year ago

It looks like Mac has become like Windows! DLL HELL!

We have been unable to repro this issue on Windows. We suspect this could be a broken/polluted (user) environment issue.

We will continue with our testing and report any findings.

mouse07410 commented 1 year ago

It looks like Mac has become like Windows! DLL HELL!

Thankfully, not even close. But written-for-Linux test scripts have difficulty locating shared libraries located in various not-necessarily-expected places.

We have been unable to repro this issue on Windows.

First, I'm all shocked that different OS may show different behaviors and problems. Second, I'm not sure at all that you replicated this issue exactly as described above.

We suspect this could be a broken/polluted (user) environment issue.

Thanks for this very useful observation. It appears that "broken user environment" in this context means a system-wide installation of OpenSSL and of liboqs, neither of which was installed by this provider's scripts, and both of which are available to this process only as binaries, compounded by the presence of other providers (e.g., PKCS11) and engines (e.g., GOST). Did you bother to install all of those system-wide? If not, your "repro" itself is a big suspect.

We will continue with our testing and report any findings.

If you wish. I think we (thanks, @baentsch !) have already figured out almost everything, except why (some of the) tests fail when a system-wide oqs-provider is available and listed in openssl.cnf. There is also the remaining problem with the GOST engine (or provider), but again, it doesn't look like you'd be helpful there.

mouse07410 commented 1 year ago

@baentsch any idea why tests fail if the main (system-wide) openssl.cnf points at a working oqsprovider.dylib?

It's not a show-stopper - just a considerable inconvenience, being forced to edit openssl.cnf whenever I need to update or re-test OQS provider. Thanks!
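
Until the cause is found, a pre-test check could at least warn when the active openssl.cnf already names an OQS provider. The sketch here is a heuristic only: provider_listed_in_conf is a made-up name, and it simply looks for an uncommented "NAME = ..." line, which is how a provider section usually refers to a provider; it does not parse sections or includes.

```shell
#!/bin/bash
# provider_listed_in_conf FILE NAME: succeed if FILE contains an uncommented
# "NAME = ..." line. Heuristic only: sections and .include are not parsed.
provider_listed_in_conf() {
    grep -Eq "^[[:space:]]*${2}[[:space:]]*=" "$1"
}

# Usage (the config path is whatever your installation uses, e.g. the
# directory reported by "openssl version -d"):
#   if provider_listed_in_conf /path/to/openssl.cnf oqsprovider; then
#       echo "Warning: openssl.cnf already lists oqsprovider"
#   fi
```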

mouse07410 commented 1 year ago

The problem seems to be related to multiple calls to something supposed to be called only once, and this time it's not related to GOST provider or engine (engines and GOST are disabled):

Application Specific Information:
BUG IN CLIENT OF LIBPLATFORM: Trying to recursively lock an os_once_t
Abort Cause 259

Thread 0 Crashed::  Dispatch queue: com.apple.main-thread
0   libsystem_platform.dylib               0x19f5d5260 _os_once_gate_recursive_abort + 36
1   libsystem_platform.dylib               0x19f5d1ed8 _os_once_gate_wait + 348
2   libsystem_pthread.dylib                0x19f59fcf8 pthread_once + 100
3   libcrypto.3.dylib                      0x101111fcc CRYPTO_THREAD_run_once + 12
4   libcrypto.3.dylib                      0x101124408 OBJ_sn2nid + 112
5   libcrypto.3.dylib                      0x1011242f4 OBJ_txt2obj + 216
6   libcrypto.3.dylib                      0x101124944 OBJ_txt2nid + 20
7   libcrypto.3.dylib                      0x101111048 core_obj_create + 36
8   oqsprovider.0.5.0-dev.dylib            0x101403bbc OSSL_provider_init + 280 (oqsprov.c:620)
9   libcrypto.3.dylib                      0x10110fddc provider_activate + 260
10  libcrypto.3.dylib                      0x10110fc48 ossl_provider_activate + 56
11  libcrypto.3.dylib                      0x10110e93c provider_conf_init + 608
12  libcrypto.3.dylib                      0x101066c4c CONF_modules_load + 856
13  libcrypto.3.dylib                      0x101066ee8 CONF_modules_load_file_ex + 120
14  libcrypto.3.dylib                      0x101067738 ossl_config_int + 68
15  libcrypto.3.dylib                      0x101106400 ossl_init_config_ossl_ + 16
16  libsystem_pthread.dylib                0x19f59fd60 __pthread_once_handler + 76
17  libsystem_platform.dylib               0x19f5cffa0 _os_once_callout + 32
18  libsystem_pthread.dylib                0x19f59fcf8 pthread_once + 100
19  libcrypto.3.dylib                      0x101111fcc CRYPTO_THREAD_run_once + 12
20  libcrypto.3.dylib                      0x101106208 OPENSSL_init_crypto + 1104
21  libcrypto.3.dylib                      0x101125098 obj_lock_initialise_ossl_ + 20
22  libsystem_pthread.dylib                0x19f59fd60 __pthread_once_handler + 76
23  libsystem_platform.dylib               0x19f5cffa0 _os_once_callout + 32
24  libsystem_pthread.dylib                0x19f59fcf8 pthread_once + 100
25  libcrypto.3.dylib                      0x101111fcc CRYPTO_THREAD_run_once + 12
26  libcrypto.3.dylib                      0x101124408 OBJ_sn2nid + 112
27  libcrypto.3.dylib                      0x1011242f4 OBJ_txt2obj + 216
28  libcrypto.3.dylib                      0x101124944 OBJ_txt2nid + 20
29  libcrypto.3.dylib                      0x101111048 core_obj_create + 36
30  oqsprovider.0.5.0-dev.dylib            0x100dffbbc OSSL_provider_init + 280 (oqsprov.c:620)
31  libcrypto.3.dylib                      0x10110fddc provider_activate + 260
32  libcrypto.3.dylib                      0x10110fc48 ossl_provider_activate + 56
33  libcrypto.3.dylib                      0x10110e93c provider_conf_init + 608
34  libcrypto.3.dylib                      0x101066c4c CONF_modules_load + 856
35  libcrypto.3.dylib                      0x101066ee8 CONF_modules_load_file_ex + 120
36  libcrypto.3.dylib                      0x1011031a4 OSSL_LIB_CTX_load_config + 20
37  oqs_test_kems                          0x100c772e0 main + 84 (oqs_test_kems.c:153)
38  dyld                                   0x19f24bf28 start + 2236
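
In this trace, OSSL_provider_init (frames 8 and 30) and OBJ_sn2nid (frames 4 and 26) each appear twice on the same thread's stack, and the two OSSL_provider_init frames sit at different addresses, which suggests two loaded copies of the provider module. A quick way to count such repeated frames in a saved report is sketched here; frames_for_symbol is a made-up name and the frame layout is assumed to match the report above.

```shell
#!/bin/bash
# frames_for_symbol FILE SYMBOL: print how many numbered backtrace frames in
# FILE have exactly SYMBOL as their function name (the fourth field).
frames_for_symbol() {
    awk -v sym="$2" '$1 ~ /^[0-9]+$/ && $4 == sym { n++ } END { print n + 0 }' "$1"
}

# Usage (hypothetical file name): frames_for_symbol crash-report.txt OSSL_provider_init
```

A count above 1 for an init routine or for a function guarded by a run-once lock is a hint that the initialization is being re-entered.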

@levitte do you have an opinion here?

mattcaswell commented 1 year ago

This looks like a different instance of the same problem as fixed by openssl/openssl#20662.

In your callstack above you can see these lines which were the cause of the original problem (i.e. calling OPENSSL_init_crypto from obj_lock_initialise):

20  libcrypto.3.dylib                      0x101106208 OPENSSL_init_crypto + 1104
21  libcrypto.3.dylib                      0x101125098 obj_lock_initialise_ossl_ + 20

mouse07410 commented 1 year ago

This looks like a different instance of the same problem as fixed by https://github.com/openssl/openssl/pull/20662.

Yes, probably - but (a) where, in your opinion, is the root cause (i.e., which component issues those improper calls, and why), and (b) how do we fix it, and where (in which component)?

Also, it looks like the fix was merged into 3.1 two or three weeks ago, so shouldn't it have been picked up by MacPorts by now? I'm trying to understand why I don't see the behavior change yet on my machines...

mattcaswell commented 1 year ago

Yes, probably - but (a) where, in your opinion, is the root cause (i.e., which component issues those improper calls, and why), and (b) how do we fix it, and where (in which component)?

This is a bug in the OpenSSL library, not in the config or providers.

Also, it looks like the fix was merged into 3.1 two or three weeks ago, so shouldn't it have been picked up by MacPorts by now? I'm trying to understand why I don't see the behavior change yet on my machines...

Because it's only in git, not in a stable release yet. When 3.1.1 eventually gets released the fix will be included.
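
To check whether an installed release should contain the fix according to the statement above, the output of openssl version can be compared against 3.1.1. This sketch encodes only what is said in this thread: release_has_fix is a made-up name, anything older than 3.1.0 is reported as unknown because the thread says nothing about those branches, and a -dev build is judged by its version number alone, not by its actual commit.

```shell
#!/bin/bash
# release_has_fix "VERSION_OUTPUT": print "yes" for OpenSSL 3.1.1 or later,
# "no" for 3.1.0, and "unknown" for anything else (older branches, other
# libraries, unparsable input). The function body runs in a subshell.
release_has_fix() (
    ver=$(printf '%s\n' "$1" |
        sed -n 's/^OpenSSL \([0-9][0-9]*\)\.\([0-9][0-9]*\)\.\([0-9][0-9]*\).*/\1 \2 \3/p')
    if [ -z "$ver" ]; then
        echo unknown
    else
        set -- $ver    # $1=major $2=minor $3=patch
        if [ "$1" -gt 3 ] || { [ "$1" -eq 3 ] && [ "$2" -gt 1 ]; } ||
           { [ "$1" -eq 3 ] && [ "$2" -eq 1 ] && [ "$3" -ge 1 ]; }; then
            echo yes
        elif [ "$1" -eq 3 ] && [ "$2" -eq 1 ]; then
            echo no
        else
            echo unknown
        fi
    fi
)

# The version string reported at the top of this issue:
release_has_fix "OpenSSL 3.1.0 14 Mar 2023 (Library: OpenSSL 3.1.0 14 Mar 2023)"   # prints "no"
```

On a live system the argument would be "$("$OPENSSL_APP" version)".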

mouse07410 commented 1 year ago

@mattcaswell I seem to have this problem with the OpenSSL-3.2.0-dev built from source/master as well. Which presumably has the fix merged...

Update - adding info

Test setup:
LD_LIBRARY_PATH=/Users/ur20980/src/oqs-provider/.local/lib64
OPENSSL_APP=/Users/ur20980/openssl-3/bin/openssl
OPENSSL_CONF=/Users/ur20980/openssl-3/etc/openssl.cnf
OPENSSL_MODULES=/Users/ur20980/openssl-3/lib/ossl-modules
No OQS-OpenSSL111 interop test because of absence of docker
Version information:
OpenSSL 3.2.0-dev  (Library: OpenSSL 3.2.0-dev )
Providers:
  base
    name: OpenSSL Base Provider
    version: 3.2.0
    status: active
    build info: 3.2.0-dev
    gettable provider parameters:
      name: pointer to a UTF8 encoded string (arbitrary size)
      version: pointer to a UTF8 encoded string (arbitrary size)
      buildinfo: pointer to a UTF8 encoded string (arbitrary size)
      status: integer (arbitrary size)
  default
    name: OpenSSL Default Provider
    version: 3.2.0
    status: active
    build info: 3.2.0-dev
    gettable provider parameters:
      name: pointer to a UTF8 encoded string (arbitrary size)
      version: pointer to a UTF8 encoded string (arbitrary size)
      buildinfo: pointer to a UTF8 encoded string (arbitrary size)
      status: integer (arbitrary size)
  oqs
    name: OpenSSL OQS Provider
    version: 0.5.0-dev
    status: active
    build info: OQS Provider v.0.5.0-dev (27d33d2) based on liboqs v.0.8.0-dev using qsc-key-encoder v.draft-00-dev
    gettable provider parameters:
      name: pointer to a UTF8 encoded string (arbitrary size)
      version: pointer to a UTF8 encoded string (arbitrary size)
      buildinfo: pointer to a UTF8 encoded string (arbitrary size)
      status: integer (arbitrary size)
  oqsprovider
    name: OpenSSL OQS Provider
    version: 0.5.0-dev
    status: active
    build info: OQS Provider v.0.5.0-dev (fbd2538) based on liboqs v.0.8.0-dev
    gettable provider parameters:
      name: pointer to a UTF8 encoded string (arbitrary size)
      version: pointer to a UTF8 encoded string (arbitrary size)
      buildinfo: pointer to a UTF8 encoded string (arbitrary size)
      status: integer (arbitrary size)
  pkcs11
    name: PKCS#11 Provider
    version: 3.2.0
    status: active
    build info: 3.2.0-dev
    gettable provider parameters:
      name: pointer to a UTF8 encoded string (arbitrary size)
      version: pointer to a UTF8 encoded string (arbitrary size)
      buildinfo: pointer to a UTF8 encoded string (arbitrary size)
      status: integer (arbitrary size)
Cert gen/verify, CMS sign/verify, CA tests for all enabled algorithms commencing...
.localalgtest dilithium2 failed. Exiting..
-----
-----
Warning: CSR self-signature does not match the contentsCertificate request self-signature did not match the contents
40E38760F87F0000:error:4000000D:pkcs11:oqs_sig_verify:reason(13):/Users/ur20980/src/oqs-provider/oqsprov/oqs_sig.c:400:
40E38760F87F0000:error:06880006:asn1 encoding routines:ASN1_item_verify_ctx:EVP lib:crypto/asn1/a_verify.c:215:
40E38760F87F0000:error:4000000D:pkcs11:oqs_sig_verify:reason(13):/Users/ur20980/src/oqs-provider/oqsprov/oqs_sig.c:400:
40E38760F87F0000:error:06880006:asn1 encoding routines:ASN1_item_verify_ctx:EVP lib:crypto/asn1/a_verify.c:215:
$ 

interop.log.txt

mouse07410 commented 1 year ago

Again, with the current (as of today) OpenSSL master, tests still fail if openssl.cnf already has oqs-provider defined.

baentsch commented 1 year ago

Why shouldn't I be able to build a provider in debug mode without OpenSSL source available?

FYI, this oqsprovider limitation is gone as of today (resolving https://github.com/open-quantum-safe/oqs-provider/issues/137)

Again, with the current (as of today) OpenSSL master, tests still fail if openssl.cnf already has oqs-provider defined.

Does it also happen if pkcs11 provider is not active? Just wondering whether that may deliver an "unexpected" algorithm implementation to oqsprovider...

mouse07410 commented 1 year ago

Does it also happen if pkcs11 provider is not active? Just wondering whether that may deliver an "unexpected" algorithm implementation to oqsprovider...

Yes, same thing, same behavior.

baentsch commented 1 year ago

Yes, same thing, same behavior.

So all you need is (one) oqsprovider and default provider active and the error above happens? Just asking as the listing above shows two active oqsprovider instances...

If you can still reproduce, would you mind building oqsprovider as "Debug" (in "main" branch should now be possible without OpenSSL dependency) and set env vars OQSSIG=1 and OQSKM=1 when running the test again and sharing the log output?
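
The requested rerun could look roughly like this. It is a sketch, not a project script: run and debug_rerun are made-up names, the cmake options are the ones used earlier in this thread, and the commands are only echoed unless DRY_RUN=0 is set.

```shell
#!/bin/bash
# run CMD...: echo the command when DRY_RUN is 1 (the default here),
# otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# debug_rerun: configure and build a Debug oqsprovider, then run the tests
# with the OQSSIG and OQSKM debug variables set.
debug_rerun() {
    run cmake -DCMAKE_BUILD_TYPE=Debug -S . -B _build &&
    run cmake --build _build &&
    run env OQSSIG=1 OQSKM=1 scripts/runtests.sh
}

debug_rerun
```

Running it as DRY_RUN=0 from the top of an oqs-provider checkout executes the three commands; redirecting the output to a file gives a log that can be shared.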

mouse07410 commented 1 year ago

the listing above shows two active oqsprovider instances...

Exactly! One is what's been already installed and configured in openssl.cnf, and the other one is the provider being just built and tested. I believe the "pre-existing one" is listed as oqs, and the one just-built (not installed yet) that's presumably being tested is listed as oqsprovider.
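
This situation, the same provider implementation active under two different identifiers, can be spotted mechanically from the output of openssl list -providers. The filter is a sketch that assumes the indentation shown in the listings in this thread (two spaces before a provider identifier, four before its name: line); duplicate_providers is a made-up name.

```shell
#!/bin/bash
# duplicate_providers: read "openssl list -providers" output on stdin and
# print every provider name that is active under more than one identifier.
duplicate_providers() {
    awk '
        /^  [^ ]/ { id = $1 }            # two-space indent: provider identifier
        /^    name: / {                  # four-space indent: display name
            sub(/^    name: /, "")
            ids[$0] = (ids[$0] == "" ? id : ids[$0] " " id)
            count[$0]++
        }
        END { for (n in count) if (count[n] > 1) print n ": " ids[n] }
    '
}

# Usage: "$OPENSSL_APP" list -providers | duplicate_providers
```

The more deeply indented name: lines under "gettable provider parameters" are ignored because they do not start at the four-space level.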

If you can still reproduce,

Alas, I can, easily. :-) :-(

would you mind building oqsprovider as "Debug" (in "main" branch should now be possible without OpenSSL dependency) and set env vars OQSSIG=1 and OQSKM=1 when running the test again and sharing the log output?

All that is already done.

Here's the screen for OpenSSL-3.2.0-dev (failing):

.  .  .
Test setup:
LD_LIBRARY_PATH=/Users/ur20980/src/oqs-provider/.local/lib64
OPENSSL_APP=/Users/ur20980/openssl-3/bin/openssl
OPENSSL_CONF=/Users/ur20980/openssl-3/etc/openssl.cnf
OPENSSL_MODULES=/Users/ur20980/openssl-3/lib/ossl-modules
No OQS-OpenSSL111 interop test because of absence of docker
Version information:
OpenSSL 3.2.0-dev  (Library: OpenSSL 3.2.0-dev )
Providers:
  base
    name: OpenSSL Base Provider
    version: 3.2.0
    status: active
    build info: 3.2.0-dev
    gettable provider parameters:
      name: pointer to a UTF8 encoded string (arbitrary size)
      version: pointer to a UTF8 encoded string (arbitrary size)
      buildinfo: pointer to a UTF8 encoded string (arbitrary size)
      status: integer (arbitrary size)
  default
    name: OpenSSL Default Provider
    version: 3.2.0
    status: active
    build info: 3.2.0-dev
    gettable provider parameters:
      name: pointer to a UTF8 encoded string (arbitrary size)
      version: pointer to a UTF8 encoded string (arbitrary size)
      buildinfo: pointer to a UTF8 encoded string (arbitrary size)
      status: integer (arbitrary size)
  oqs
    name: OpenSSL OQS Provider
    version: 0.5.0-dev
    status: active
    build info: OQS Provider v.0.5.0-dev (12a6418) based on liboqs v.0.8.0-dev
    gettable provider parameters:
      name: pointer to a UTF8 encoded string (arbitrary size)
      version: pointer to a UTF8 encoded string (arbitrary size)
      buildinfo: pointer to a UTF8 encoded string (arbitrary size)
      status: integer (arbitrary size)
  oqsprovider
    name: OpenSSL OQS Provider
    version: 0.5.0-dev
    status: active
    build info: OQS Provider v.0.5.0-dev (12a6418) based on liboqs v.0.8.0-dev
    gettable provider parameters:
      name: pointer to a UTF8 encoded string (arbitrary size)
      version: pointer to a UTF8 encoded string (arbitrary size)
      buildinfo: pointer to a UTF8 encoded string (arbitrary size)
      status: integer (arbitrary size)
Cert gen/verify, CMS sign/verify, CA tests for all enabled algorithms commencing...
.localalgtest dilithium2 failed. Exiting..
-----
-----
Warning: CSR self-signature does not match the contentsCertificate request self-signature did not match the contents
4083FE52F87F0000:error:4000000D:lib(128):oqs_sig_verify:reason(13):/Users/ur20980/src/oqs-provider/oqsprov/oqs_sig.c:400:
4083FE52F87F0000:error:06880006:asn1 encoding routines:ASN1_item_verify_ctx:EVP lib:crypto/asn1/a_verify.c:215:
4083FE52F87F0000:error:4000000D:lib(128):oqs_sig_verify:reason(13):/Users/ur20980/src/oqs-provider/oqsprov/oqs_sig.c:400:
4083FE52F87F0000:error:06880006:asn1 encoding routines:ASN1_item_verify_ctx:EVP lib:crypto/asn1/a_verify.c:215:

And here is the output for OpenSSL-3.1.0, installed system-wide. The OQS provider is present in openssl.cnf, but (a) the tests succeed, and (b) this already-installed provider is not listed below! Only the oqs-provider being tested is shown:

Test setup:
LD_LIBRARY_PATH=/Users/ur20980/src/oqs-provider/.local/lib64
OPENSSL_APP=/opt/local/libexec/openssl3/bin/openssl
OPENSSL_CONF=/opt/local/libexec/openssl3/etc/openssl/openssl.cnf
OPENSSL_MODULES=/opt/local/libexec/openssl3/lib/ossl-modules
No OQS-OpenSSL111 interop test because of absence of docker
Version information:
OpenSSL 3.1.0 14 Mar 2023 (Library: OpenSSL 3.1.0 14 Mar 2023)
Providers:
  base
    name: OpenSSL Base Provider
    version: 3.1.0
    status: active
    build info: 3.1.0
    gettable provider parameters:
      name: pointer to a UTF8 encoded string (arbitrary size)
      version: pointer to a UTF8 encoded string (arbitrary size)
      buildinfo: pointer to a UTF8 encoded string (arbitrary size)
      status: integer (arbitrary size)
  default
    name: OpenSSL Default Provider
    version: 3.1.0
    status: active
    build info: 3.1.0
    gettable provider parameters:
      name: pointer to a UTF8 encoded string (arbitrary size)
      version: pointer to a UTF8 encoded string (arbitrary size)
      buildinfo: pointer to a UTF8 encoded string (arbitrary size)
      status: integer (arbitrary size)
  legacy
    name: OpenSSL Legacy Provider
    version: 3.1.0
    status: active
    build info: 3.1.0
    gettable provider parameters:
      name: pointer to a UTF8 encoded string (arbitrary size)
      version: pointer to a UTF8 encoded string (arbitrary size)
      buildinfo: pointer to a UTF8 encoded string (arbitrary size)
      status: integer (arbitrary size)
  oqsprovider
    name: OpenSSL OQS Provider
    version: 0.5.0-dev
    status: active
    build info: OQS Provider v.0.5.0-dev (12a6418) based on liboqs v.0.8.0-dev
    gettable provider parameters:
      name: pointer to a UTF8 encoded string (arbitrary size)
      version: pointer to a UTF8 encoded string (arbitrary size)
      buildinfo: pointer to a UTF8 encoded string (arbitrary size)
      status: integer (arbitrary size)
  pkcs11
    name: PKCS#11 Provider
    version: 3.1.0
    status: active
    build info: 3.1.0
    gettable provider parameters:
      name: pointer to a UTF8 encoded string (arbitrary size)
      version: pointer to a UTF8 encoded string (arbitrary size)
      buildinfo: pointer to a UTF8 encoded string (arbitrary size)
      status: integer (arbitrary size)
Cert gen/verify, CMS sign/verify, CA tests for all enabled algorithms commencing...
..................................
Test project /Users/ur20980/src/oqs-provider/_build
    Start 1: oqs_signatures
1/5 Test #1: oqs_signatures ...................   Passed    4.50 sec
    Start 2: oqs_kems
2/5 Test #2: oqs_kems .........................   Passed    0.24 sec
    Start 3: oqs_groups
3/5 Test #3: oqs_groups .......................   Passed    0.39 sec
    Start 4: oqs_tlssig
4/5 Test #4: oqs_tlssig .......................   Passed    0.01 sec
    Start 5: oqs_endecode
5/5 Test #5: oqs_endecode .....................   Passed    7.03 sec

100% tests passed, 0 tests failed out of 5

Total Test time (real) =  12.18 sec

All oqsprovider tests passed.

Attached files: screenlog.txt, build-out-3.2.0-dev.txt, tests-out-3.2.0-dev.txt, build-out-3.1.0.txt, tests-out-3.1.0.txt

baentsch commented 1 year ago

Finally finding time to work on oqsprovider again... :-/ So, thanks, @mouse07410, for the report above. I could reproduce the problem and am tracking it in #160. If you agree that this is the same (remaining) issue as in this thread, let's close this one and continue in #160.

mouse07410 commented 1 year ago

Sure, whatever's the easiest way for you to track it.