OpenSC / libp11

PKCS#11 wrapper library
GNU Lesser General Public License v2.1
306 stars 183 forks source link

Current master fails tests (infinite loop) #215

Closed mouse07410 closed 6 years ago

mouse07410 commented 6 years ago

After https://github.com/OpenSC/libp11/pull/214 has been merged, the current master started failing its tests. It enters a recursive loop when token initialization requires getting random bytes that in turn requires token initialization, etc. I've attached the logs here. Some excerpts:

. . . . .
/Applications/Xcode.app/Contents/Developer/usr/bin/make  check-TESTS
PASS: rsa-testpkcs11.softhsm
PASS: rsa-testfork.softhsm
PASS: rsa-testlistkeys.softhsm
FAIL: rsa-evp-sign.softhsm
PASS: ec-testfork.softhsm
FAIL: fork-change-slot.softhsm
============================================================================
Testsuite summary for libp11 0.4.8_git
============================================================================
# TOTAL: 6
# PASS:  4
# SKIP:  0
# XFAIL: 0
# FAIL:  2
# XPASS: 0
# ERROR: 0
============================================================================
See tests/test-suite.log
============================================================================
$ cat tests/test-suite.log
============================================
   libp11 0.4.8_git: tests/test-suite.log
============================================

# TOTAL: 6
# PASS:  4
# SKIP:  0
# XFAIL: 0
# FAIL:  2
# XPASS: 0
# ERROR: 0

.. contents:: :depth: 2

FAIL: rsa-evp-sign.softhsm
==========================

Current directory: /Users/ur20980/src/libp11/tests
Source directory: .
Output directory: output.57104
-n * Initializing smart card... 
ok
Importing
Using slot 0 with a present token (0x16ba6051)
Using slot 0 with a present token (0x16ba6051)
Using slot 0 with a present token (0x16ba6051)
Finished
***************
Listing objects
***************
Using slot 0 with a present token (0x16ba6051)
Private Key Object; RSA 
  label:      server-key
  ID:         01020304
  Usage:      decrypt, sign, unwrap
Public Key Object; RSA 2048 bits
  label:      server-key
  ID:         01020304
  Usage:      encrypt, verify, wrap
Certificate Object; type = X.509 cert
  label:      server-key
  ID:         01020304
PKCS#11: Initializing the engine
PKCS#11: Initializing the engine
. . . [last line repeats 48000 times] . . .
PKCS#11: Initializing the engine
PKCS#11: Initializing the engine
./rsa-evp-sign.softhsm: line 33: 57116 Segmentation fault: 11  ./evp-sign ctrl false "${outdir}/engines.cnf" ${PRIVATE_KEY} ${PUBLIC_KEY} ${MODULE}
Basic PKCS #11 test, using ctrl failed
FAIL rsa-evp-sign.softhsm (exit status: 1)

FAIL: fork-change-slot.softhsm
==============================

Current directory: /Users/ur20980/src/libp11/tests
Source directory: .
Output directory: output.57540
-n * Initializing smart card... 
ok
-n * Initializing smart card... 
ok
Key pair generated:
Private Key Object; RSA 
  label:      pkey
  ID:         01020304
  Usage:      decrypt, sign, unwrap
Public Key Object; RSA 1024 bits
  label:      pkey
  ID:         01020304
  Usage:      encrypt, verify, wrap
Found token (22aba925-99c8-d1ee-8db7-63862b8a7bae) with matching token label.
The token (output.57540/softhsm-testpkcs11.db/22aba925-99c8-d1ee-8db7-63862b8a7bae) has been deleted.
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
./fork-change-slot.softhsm: line 66: 57550 Segmentation fault: 11  ./fork-change-slot "pkcs11:token=token2;object=pkey;type=private;pin-value=$PIN" "${outdir}/engines.cnf" ${MODULE}
FAIL fork-change-slot.softhsm (exit status: 1)

Process sample:

Sampling process 20624 for 3 seconds with 1 millisecond of run time between samples
Sampling completed, processing symbols...
Analysis of sampling evp-sign (pid 20624) every 1 millisecond
Process:         evp-sign [20624]
Path:            /Users/ur20980/src/libp11/tests/.libs/evp-sign
Load Address:    0x10402a000
Identifier:      evp-sign
Version:         0
Code Type:       X86-64
Parent Process:  sh [20612]

Date/Time:       2018-04-04 13:25:08.525 -0400
Launch Time:     2018-04-04 13:23:37.908 -0400
OS Version:      Mac OS X 10.12.6 (16G1314)
Report Version:  7
Analysis Tool:   /usr/bin/sample
----

Call graph:
    2447 Thread_4137815   DispatchQueue_1: com.apple.main-thread  (serial)
      1829 OSSLRNG::generateRandom(ByteString&, unsigned long)  (in libsofthsm2.so) + 49  [0x1043d8ca1]
      + 1829 RAND_bytes  (in libcrypto.1.0.0.dylib) + 60  [0x10416183c]
      +   1829 engine_table_select  (in libcrypto.1.0.0.dylib) + 267  [0x1041511ab]
      +     1829 engine_unlocked_init  (in libcrypto.1.0.0.dylib) + 59  [0x1041501eb]
      +       1829 ctx_init  (in libpkcs11.dylib) + 86  [0x1043647f6]
      +         1829 ctx_init_libp11_unlocked  (in libpkcs11.dylib) + 96  [0x104364890]
      +           1829 pkcs11_CTX_load  (in libpkcs11.dylib) + 101  [0x1043699c5]
      +             1829 C_Initialize  (in libsofthsm2.so) + 25  [0x104378909]
      +               1829 SoftHSM::C_Initialize(void*)  (in libsofthsm2.so) + 994  [0x1043922e2]
      +                 1829 SlotManager::SlotManager(ObjectStore*)  (in libsofthsm2.so) + 506  [0x10440529a]
      +                   1829 SlotManager::insertToken(ObjectStore*, unsigned long, ObjectStoreToken*)  (in libsofthsm2.so) + 56  [0x1044053f8]
      +                     1829 Slot::Slot(ObjectStore*, unsigned long, ObjectStoreToken*)  (in libsofthsm2.so) + 63  [0x104405f7f]
      +                       1829 Token::Token(ObjectStoreToken*)  (in libsofthsm2.so) + 136  [0x1044066e8]
      +                         1829 SecureDataManager::SecureDataManager(ByteString const&, ByteString const&)  (in libsofthsm2.so) + 86  [0x1043e1a06]
      +                           1829 SecureDataManager::initObject()  (in libsofthsm2.so) + 114  [0x1043e16d2]
      +                             1829 OSSLRNG::generateRandom(ByteString&, unsigned long)  (in libsofthsm2.so) + 49  [0x1043d8ca1]
      +                               1829 RAND_bytes  (in libcrypto.1.0.0.dylib) + 60  [0x10416183c]
      +                                 1829 engine_table_select  (in libcrypto.1.0.0.dylib) + 267  [0x1041511ab]
      +                                   1829 engine_unlocked_init  (in libcrypto.1.0.0.dylib) + 59  [0x1041501eb]
      +                                     1829 ctx_init  (in libpkcs11.dylib) + 86  [0x1043647f6]
      +                                       1829 ctx_init_libp11_unlocked  (in libpkcs11.dylib) + 96  [0x104364890]
      +                                         1829 pkcs11_CTX_load  (in libpkcs11.dylib) + 101  [0x1043699c5]
      +                                           1829 C_Initialize  (in libsofthsm2.so) + 25  [0x104378909]
      +                                             1829 SoftHSM::C_Initialize(void*)  (in libsofthsm2.so) + 994  [0x1043922e2]
      +                                               1829 SlotManager::SlotManager(ObjectStore*)  (in libsofthsm2.so) + 506  [0x10440529a]
. . . . .
Total number in stack (recursive counted multiple, when >=5):
        510       C_Initialize  (in libsofthsm2.so) + 25  [0x104378909]
        509       Slot::Slot(ObjectStore*, unsigned long, ObjectStoreToken*)  (in libsofthsm2.so) + 63  [0x104405f7f]
        509       pkcs11_CTX_load  (in libpkcs11.dylib) + 101  [0x1043699c5]
        508       SlotManager::insertToken(ObjectStore*, unsigned long, ObjectStoreToken*)  (in libsofthsm2.so) + 56  [0x1044053f8]
        508       ctx_init_libp11_unlocked  (in libpkcs11.dylib) + 96  [0x104364890]
        507       SlotManager::SlotManager(ObjectStore*)  (in libsofthsm2.so) + 506  [0x10440529a]
        507       SoftHSM::C_Initialize(void*)  (in libsofthsm2.so) + 994  [0x1043922e2]
        507       ctx_init  (in libpkcs11.dylib) + 86  [0x1043647f6]
        506       engine_unlocked_init  (in libcrypto.1.0.0.dylib) + 59  [0x1041501eb]
        505       RAND_bytes  (in libcrypto.1.0.0.dylib) + 60  [0x10416183c]
        505       engine_table_select  (in libcrypto.1.0.0.dylib) + 267  [0x1041511ab]
        504       OSSLRNG::generateRandom(ByteString&, unsigned long)  (in libsofthsm2.so) + 49  [0x1043d8ca1]
        503       SecureDataManager::initObject()  (in libsofthsm2.so) + 114  [0x1043e16d2]
        502       SecureDataManager::SecureDataManager(ByteString const&, ByteString const&)  (in libsofthsm2.so) + 86  [0x1043e1a06]
        501       Token::Token(ObjectStoreToken*)  (in libsofthsm2.so) + 136  [0x1044066e8]
        48       malloc  (in libsystem_malloc.dylib) + 24  [0x7fffa8b69200]
        46       operator new(unsigned long)  (in libc++abi.dylib) + 30  [0x7fffa75f2e0e]
        44       malloc_zone_malloc  (in libsystem_malloc.dylib) + 107  [0x7fffa8b6a282]
        31       File::readULong(unsigned long&)  (in libsofthsm2.so) + 50  [0x1043e8bd2]
        25       std::__1::vector<unsigned char, SecureAllocator<unsigned char> >::__append(unsigned long)  (in libsofthsm2.so) + 273  [0x1043e10c1]
        24       ObjectFile::isValid()  (in libsofthsm2.so) + 16  [0x1043f30b0]
        21       File::readULong(unsigned long&)  (in libsofthsm2.so) + 83  [0x1043e8bf3]
        21       szone_malloc_should_clear  (in libsystem_malloc.dylib) + 400  [0x7fffa8b6a472]
        20       File::readULong(unsigned long&)  (in libsofthsm2.so) + 179  [0x1043e8c53]
        20       OSToken::index(bool)  (in libsofthsm2.so) + 65  [0x1043ebea1]
        20       ObjectFile::refresh(bool)  (in libsofthsm2.so) + 106  [0x1043f066a]
        20       szone_malloc_should_clear  (in libsystem_malloc.dylib) + 0  [0x7fffa8b6a2e2]
        19       ImageLoader::recursiveUpdateDepth(unsigned int)  (in dyld) + 111  [0x108a92797]
        18       fread  (in libsystem_c.dylib) + 48  [0x7fffa8a5c380]
        17       ObjectFile::ObjectFile(OSToken*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_strin
g<char, std::__1::char_traits<char>, std::__1::allocator<char> >, bool)  (in libsofthsm2.so) + 344  [0x1043f0598]
        15       SecureMemoryRegistry::add(void*, unsigned long)  (in libsofthsm2.so) + 0  [0x1043e4b40]
        14       _platform_memmove$VARIANT$Haswell  (in libsystem_platform.dylib) + 0  [0x7fffa8bfaea0]
        14       pthread_mutex_lock  (in libsystem_pthread.dylib) + 0  [0x7fffa8bff38c]
. . . . .
Sort by top of stack, same collapsed (when >= 5):
        __open  (in libsystem_kernel.dylib)        1766
        mach_msg_trap  (in dyld)        64
        __close_nocancel  (in libsystem_kernel.dylib)        61
        lstat$INODE64  (in libsystem_kernel.dylib)        60
        __getdirentries64  (in libsystem_kernel.dylib)        46
        SecureMemoryRegistry::add(void*, unsigned long)  (in libsofthsm2.so)        35
        SecureMemoryRegistry::remove(void*)  (in libsofthsm2.so)        30
        __fcntl  (in libsystem_kernel.dylib)        29
        szone_malloc_should_clear  (in libsystem_malloc.dylib)        25
        __open_nocancel  (in libsystem_kernel.dylib)        23
        fstat$INODE64  (in libsystem_kernel.dylib)        22
        pthread_mutex_lock  (in libsystem_pthread.dylib)        18
        free_tiny  (in libsystem_malloc.dylib)        16
        _platform_memmove$VARIANT$Haswell  (in libsystem_platform.dylib)        15
        tiny_malloc_from_free_list  (in libsystem_malloc.dylib)        15
        __read_nocancel  (in libsystem_kernel.dylib)        14
        _pthread_mutex_unlock_slow  (in libsystem_pthread.dylib)        9
        pthread_mutex_unlock  (in libsystem_pthread.dylib)        9
        szone_size  (in libsystem_malloc.dylib)        9
        free  (in libsystem_malloc.dylib)        8
        __fcntl_nocancel  (in libsystem_kernel.dylib)        6
        set_tiny_meta_header_in_use  (in libsystem_malloc.dylib)        6
        tiny_free_no_lock  (in libsystem_malloc.dylib)        6
        File::File(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, bool, bool, bool, bool)  (in libsofthsm2.so)        5
        ObjectFile::refresh(bool)  (in libsofthsm2.so)        5
        default_zone_free_definite_size  (in libsystem_malloc.dylib)        5
        malloc  (in libsystem_malloc.dylib)        5
. . . . .

And the complete trace: Sample of evp-sign.txt

P.S. Might I suggest that it may not be a good idea merging something that fails Travis CI? Yes I know - it failed because SoftHSMv2 was looking for EDDSA that the available OpenSSL did not provide/support. Still, if the tester fixed it and went past that issue - I'm sure he'd hit the failure I experienced.

williamcroberts commented 6 years ago

P.S. it did pass at one point: https://travis-ci.org/OpenSC/libp11/jobs/360342812 It currently passes locally for me as well.

I'm not sure what magic your config has. I'm on ubuntu 16.04.4 I have SoftHSMv2 installed from master with --disable-gost and --disable-eddsa I have OpenSC installed from master libp11-proxy-kit from packagemanager I have libp11 built locally on master with the change in question.

I run make check and everything passes. I put an assert in the rand callback and it doesn't look like it's getting called.

mouse07410 commented 6 years ago

P.S. it did pass at one point: https://travis-ci.org/OpenSC/libp11/jobs/360342812

Well, it doesn't now.

Travis CI still fails on Clang: https://travis-ci.org/OpenSC/libp11/builds/362317164?utm_source=github_status&utm_medium=notification

It would be nice to reach a point where Travis CI consistently passes.

It currently passes locally for me as well.

Great, but it doesn't help others like myself.

I'm not sure what magic your config has.

./configure --prefix=/opt/local --with-pkcs11-module=/opt/local/lib/p11-kit-proxy.dylib --with-enginesdir=/opt/local/lib/engines

I'm on ubuntu 16.04.4

I'm on MacOS 10.12.6 and 10.13.4.

I have SoftHSMv2 installed from master with --disable-gost and --disable-eddsa

I have SoftHSMv2 installed from master with --disable-eddsa.

Update: so my build has GOST enabled, and yours doesn't. One implication of that is that the line you referred to in SoftHSMv does not get compiled/executed in my case.

I have OpenSC installed from master

Likewise.

libp11-proxy-kit from packagemanager

Likewise, except on my platforms it comes from Macports. Regardless, it is 0.23.10_1.

I have libp11 built locally on master with the change in question.

Likewise - except that when I build without the change in question, everything works as expected. When I build with the change in question - I get that infinite loop.

williamcroberts commented 6 years ago

So I placed an

assert(0)

in the random engine call back and it still passes. So my random code is not getting called, so my config is not matching yours.

I see you said you have GOST enabled. So Ill have try that, But I think it has to do with the configure command, you had:

--with-pkcs11-module=/opt/local/lib/p11-kit-proxy.dylib --with-enginesdir=/opt/local/lib/engines

I did:

--with-pkcs11-module=/usr/lib/x86_64-linux-gnu/p11-kit-proxy.so --with-enginesdir=/usr/lib/engines

Which my engines dir is empty.

I also likely don't have that pkcs11 proxy configured properly, how do you have that setup?

Do you want to try that propesed change on SoftHSMv2 while I try and replicate?

FYI: master Travis is passing on this PR now: https://github.com/OpenSC/libp11/pull/216

mouse07410 commented 6 years ago

my engines dir is empty

On your machine (Linux) the default system-wide OpenSSL apparently is of a decent version (some 1.0.2).

On my machine (MacOS) the default system-wide OpenSSL is OpenSSL 0.9.8zh 14 Jan 2016. So I have to use Macports-installed OpenSSL that lives in /opt/local, and point all the software that I install to it instead of /usr/bin/openssl and ilk.

I also likely don't have that pkcs11 proxy configured properly, how do you have that setup?

That's probably not critical, but in my case I have a directory ~/.config/pkcs11/modules that lists what PKCS#11 modules are available on my system:

$ ll ~/.config/pkcs11/modules
total 32
drwxr-xr-x  6 ur20980  MITLL\Domain Users  204 Apr  4 16:35 ./
drwxr-xr-x  4 ur20980  MITLL\Domain Users  136 Feb  1 14:40 ../
-rw-r--r--  1 ur20980  MITLL\Domain Users  107 Feb  1 14:53 opensc.module
-rw-r--r--  1 ur20980  MITLL\Domain Users  160 Apr  4 16:35 softhsm2.module
-rw-r--r--  1 ur20980  MITLL\Domain Users  310 Feb  1 14:42 yhsm2.module
-rw-r--r--  1 ur20980  MITLL\Domain Users   39 Feb  1 14:54 ykcs11.module

Each module descriptor file points at the right module, like:

$ cat ~/.config/pkcs11/modules/opensc.module 
module: /Library/OpenSC/lib/opensc-pkcs11.dylib
#module: /Library/OpenSC/lib/pkcs11-spy.dylib
critical: no

Note: what's .so on Linux, is .dylib on Mac. ;-)

Do you want to try that proposed change on SoftHSMv2 while I try and replicate?

Sure, what's the proposed change exactly?

FYI: master Travis is passing on this PR now: #216

Excellent, thanks!

mouse07410 commented 6 years ago

Bingo! This is what made the code work, and the tests to pass:

diff --git a/src/lib/crypto/OSSLCryptoFactory.cpp b/src/lib/crypto/OSSLCryptoFactory.cpp
index 8555815..61ce056 100644
--- a/src/lib/crypto/OSSLCryptoFactory.cpp
+++ b/src/lib/crypto/OSSLCryptoFactory.cpp
@@ -137,10 +137,24 @@ OSSLCryptoFactory::OSSLCryptoFactory()
        }
        FipsSelfTestStatus = true;
 #endif
-
        // Initialise OpenSSL
        OpenSSL_add_all_algorithms();

+       // Make sure RDRAND is loaded first
+       ENGINE_load_rdrand();
+       rdrand_engine = ENGINE_by_id("rdrand");
+       if ( rdrand_engine == NULL ) {
+               fprintf(stderr, "ENGINE_load_rdrand returned %lu\n", ERR_get_error());
+       }
+       if ( ! ENGINE_init(rdrand_engine) ) {
+               fprintf(stderr, "ENGINE_init returned %lu\n", ERR_get_error());
+       }
+       if ( ! ENGINE_set_default(rdrand_engine, ENGINE_METHOD_RAND) ) {
+               fprintf(stderr, "ENGINE_set_default returned %lu\n", ERR_get_error());
+       }
+
+
+
        // Initialise the one-and-only RNG
        rng = new OSSLRNG();

@@ -150,6 +164,10 @@ OSSLCryptoFactory::OSSLCryptoFactory()
        ENGINE_load_builtin_engines();
 #else
        OPENSSL_init_crypto(OPENSSL_INIT_ENGINE_ALL_BUILTIN |
+                           OPENSSL_INIT_ENGINE_RDRAND |
+                           OPENSSL_INIT_LOAD_CRYPTO_STRINGS |
+                           OPENSSL_INIT_ADD_ALL_CIPHERS |
+                           OPENSSL_INIT_ADD_ALL_DIGESTS |
                            OPENSSL_INIT_LOAD_CONFIG, NULL);
 #endif

diff --git a/src/lib/crypto/OSSLCryptoFactory.h b/src/lib/crypto/OSSLCryptoFactory.h
index e8bfa2c..8dfb128 100644
--- a/src/lib/crypto/OSSLCryptoFactory.h
+++ b/src/lib/crypto/OSSLCryptoFactory.h
@@ -105,6 +105,8 @@ private:

        // The one-and-only RNG instance
        RNG* rng;
+       // And RDRAND engine to use with it
+       ENGINE *rdrand_engine;

 #ifdef WITH_GOST
        // The GOST engine

Now the question is what to do, as I'm not sure SoftHSMv2 would accept this patch. Update: but in any case I've reported it here.

mouse07410 commented 6 years ago

my engines dir is empty

@williamcroberts I just realized: that means you're installing your libp11 (pkcs11 engine) into a wrong directory, where OpenSSL probably cannot see it. As a result, it may be that the tests are performed on the system-installed libp11 (assuming it does get installed). Because OpenSSL

Here's what my /opt/local/lib/engines directory looks like:

$ ll /opt/local/lib/engines/
total 1272
drwxr-xr-x    19 root  admin     608 Apr  4 20:50 ./
drwxr-xr-x  1346 root  admin   43072 Apr  4 20:50 ../
-r-xr-xr-x     1 root  admin   42236 Mar 28 17:57 lib4758cca.dylib*
-r-xr-xr-x     1 root  admin   37900 Mar 28 17:57 libaep.dylib*
-r-xr-xr-x     1 root  admin   37716 Mar 28 17:57 libatalla.dylib*
-r-xr-xr-x     1 root  admin   29812 Mar 28 17:57 libcapi.dylib*
-r-xr-xr-x     1 root  admin   44300 Mar 28 17:57 libchil.dylib*
-r-xr-xr-x     1 root  admin   42380 Mar 28 17:57 libcswift.dylib*
-r-xr-xr-x     1 root  admin   29812 Mar 28 17:57 libgmp.dylib*
-r-xr-xr-x     1 root  admin  117984 Mar 28 17:57 libgost.dylib*
-r-xr-xr-x     1 root  admin   37116 Mar 28 17:57 libnuron.dylib*
-r-xr-xr-x     1 root  admin   29812 Mar 28 17:57 libpadlock.dylib*
lrwxr-xr-x     1 root  admin      12 Apr  4 20:50 libpkcs11.dylib@ -> pkcs11.dylib
lrwxr-xr-x     1 root  admin       9 Feb 29  2016 libpkcs11.so@ -> pkcs11.so
-r-xr-xr-x     1 root  admin   48212 Mar 28 17:57 libsureware.dylib*
-r-xr-xr-x     1 root  admin   42292 Mar 28 17:57 libubsec.dylib*
-rwxr-xr-x     1 root  admin   74308 Apr  4 20:50 pkcs11.dylib*
-rwxr-xr-x     1 root  admin     932 Apr  4 20:50 pkcs11.la*
lrwxr-xr-x     1 root  admin      35 Oct  7  2016 pkcs11.so@ -> /opt/local/lib/engines/pkcs11.dylib

Most of them are installed by the OpenSSL distro itself. So if your /usr/lib/engines is empty - it suggests to me that OpenSSL uses a different directory to install and retrieve its engines from.

williamcroberts commented 6 years ago

@mouse07410 I have been using an openssl conf engine to set where to load the engine from, and conversely what pkcs11.so to load up as well:

openssl_conf = openssl_init

[openssl_init]
engines=engine_section

[engine_section]
pkcs11 = pkcs11_section

[pkcs11_section]
engine_id = pkcs11
dynamic_path = /home/wcrobert/workspace/libp11/src/.libs/pkcs11.so
MODULE_PATH = /home/wcrobert/workspace/tpm2-pkcs11/src/.libs/libtpm2-pkcs11.so.0.0.0
williamcroberts commented 6 years ago

Looks like tests have an OpenSSL config file file that gets processed with sed and it sets the dynamic_path to the local built libp11 pkcs#11 path. It looks like the configure option --with-pkcs11-module sets the MODULE_PATH variable in the config file.

I configure with ./configure --with-pkcs11-module=/home/wcrobert/workspace/SoftHSMv2/src/lib/.libs/libsofthsm2.so --with-enginesdir=/home/wcrobert/workspace/libp11/src/.libs

However, I had the test spit out the engine config file as a copy called ~/foo.conf, and it has it pointing to a system installed softhsm:

$ cat ~/foo.conf 
HOME            = .
RANDFILE        = $ENV::HOME/.rnd

openssl_conf = openssl_def

[openssl_def]
engines = engine_section

[engine_section]
pkcs11 = pkcs11_section

[pkcs11_section]
engine_id = pkcs11
dynamic_path = ../src/.libs/pkcs11.so
MODULE_PATH = /usr/local/lib/softhsm/libsofthsm2.so
init = 0

I'm just going to hard code that sed script and see if I get it to reproduce.

williamcroberts commented 6 years ago

@mouse07410 I have this configured to a point where I could replicate it. So while your patch avoids the problem, it ends up making it so all rand() calls go back to the software implementation.

This looks like a limitation in OpenSSL design. I think the fix here is to add an engine command, to enable random support and turn it off by default. Ill propose a patch.

williamcroberts commented 6 years ago

I still don't know why we don't see this on Travis.....

mouse07410 commented 6 years ago

while your patch avoids the problem, it ends up making it so all rand() calls go back to the software implementation

Are you sure? My patch should only affect SoftHSMv2, which is a software token to begin with. You can't expect to access some hardware with libsofthsm2.so. And my patch only sets RDRAND as the default RNG - i.e., any application/caller should be free to set another engine within that application scope. When your application uses pkcs11.so and a real hardware token, my patch cannot be a factor - so if #214 works at all, rand() then should get random bytes from that token. Or do I misunderstand what you're saying?

This looks like a limitation in OpenSSL design.

That is possible - but without a better (than I have) understanding of what's going on, i.e., what causes this recursion, I can't "confirm or deny". ;-)

I think the fix here is to add an engine command, to enable random support and turn it off by default.

I doubt that it would be a good fix. I think we should figure out what goes wrong, especially with OpenSSL-1.1.x - and address it, probably in libp11, or in OpenSSL itself if that's where the fix must go to.

I still don't know why we don't see this on Travis.....

Because you only test a certain version of Linux and OpenSSL on Travis, that's why. If, e.g., you added MacOS and/or OpenSSL-1.1.x - I'm sure Travis would've entertained you better. ;-)

OK, now the situation as I see it:

  1. Something seems wrong with the RNG access. This only manifests when #214 is applied - without it everything seems to work fine in any combination of OS, OpenSSL (1.0.2o, 1.1.0-stable, 1.1.1-pre) and SoftHSMv2 (master and develop branches).

  2. After applying to SoftHSMv2 the patch I showed above that enforces loading of RDRAND_engine and setting as the default RNG, #214 passes the tests when built for OpenSSL-1.0.2o. Without that patch, both evp-sign (infinite recursion) and fork-change-slot (pkcs11_check_token:General Error:p11_slot.c:487) fail, both in RAND_bytes() that points at #214 as either a culprit, or at least a "partner". This problem manifests itself on MacOS for sure, probably on Linux too if you were able to replicate it.

  3. No matter what, I cannot make fork-change-slot pass with OpenSSL-1.1.x (error as shown above).

The solution should be figuring out what is wrong with the way #214 configures PKCS#11 access. Because there's at least one example of the engine (RDRAND_engine) that appears to work fine with all the OS and OpenSSL versions. So it's possible, and OpenSSL isn't screwed up so far as to prevent getting random numbers from an engine.

mouse07410 commented 6 years ago

I'd like to add, that according to my circumstantial evidence, #214 does retrieve random bytes from a hardware token with OpenSSL-1.0.2 (LED on my token blinks :).

I also confirm that it does not retrieve anything from a hardware token with OpenSSL-1.1.x (either stable or master branch), and the LED does not blink in this case.

ansasaki commented 6 years ago

@mouse07410 The infinite loop in fork-change-slot occurs because in the test the engine is set to be default for all methods. When it calls RAND_bytes(), the SoftHSM RNG is called, which calls the default OpenSSL RNG method. Since it registered the engine as the default method, it will end up in the infinite loop.

The fix for the test (and for applications using the SoftHSM) is to not set the engine as default for RAND methods, like this:

ENGINE_set_default(e, ENGINE_METHOD_ALL & ~ENGINE_METHOD_RAND);

I fixed the test in #224.

mouse07410 commented 6 years ago

@ansasaki thank you. Your change addresses the infinite loop problem. If an application wants to get random bytes from the token/engine, what should it do?

ansasaki commented 6 years ago

@mouse07410 I don't have the answer right now, I'll take a look into the code (and try with some hardware).

williamcroberts commented 6 years ago

@mouse07410 if the change here fixes the issue, then i'm dropping this.

https://github.com/OpenSC/libp11/blob/master/src/eng_front.c#L244 is still setting the rand method for the pkcs11 engine wrapper, so it should still get the random data.

ansasaki commented 6 years ago

@mouse07410 This line may explain why you cannot use your device to generate random bytes.

I was trying to generate random using a token I have, but the call never reached the right PKCS#11 device because it always picks the first one which have RNG. The default module is the p11kit-proxy which handles a lot of modules under it.

For me the engine should allow me to set the exact device I want to use to generate random bytes and not only the module handling it. Probably through a CTRL command, like:

ENGINE_ctrl_cmd(engine, "RNG_DEVICE", 0, "pkcs11:token=my_token", NULL, 1)

This would allow the device to be used for RNG in the configuration file as well.

mtrojnar commented 6 years ago

I reverted 6216f16b93685a8c3af501ed2217ec817e48748f in 3a8d1c65f0153a6281a25040b3a97deababcee3a, which fixes this issue.

mouse07410 commented 6 years ago

Hmm, killing the entire capability seems a bit too radical a solution. My current fork (everything you did up till now, but reverting your revert) seems to work OK with OpenSSL-1.0.2o and OpenSSL-1.1.x master.

mtrojnar commented 6 years ago

This new feature seemed to cause too many problems. I'm also not satisfied with code quality: the implementation was inconsistent with existing engine methods. I will probably reimplement it from scratch after the upcoming release.