Closed mouse07410 closed 7 years ago
Update
The above showed that running under PKCS11SPY with PKCS11SPY=/Library/OpenSC/lib/opensc-pkcs11.dylib
exhibited correct behavior.
Now I tried running under PKCS11SPY with PKCS11SPY=/opt/local/lib/p11-kit-proxy.dylib
, and sure enough, the hangup at the end of processing is there:
$ PKCS11_MODULE_PATH=/Library/OpenSC/lib/pkcs11-spy.dylib openssl dgst -engine pkcs11 -keyform engine -sign "pkcs11:manufacturer=piv_II;object=SIGN%20key;type=private" -sha384 -sigopt rsa_padding_mode:pss -sigopt rsa_pss_saltlen:-1 -out t256.dat.sig t256.dat
*************** OpenSC PKCS#11 spy *****************
Loaded: "/opt/local/lib/p11-kit-proxy.dylib"
0: C_GetFunctionList
2016-10-15 22:12:46.121
Returned: 0 CKR_OK
1: C_Initialize
2016-10-15 22:12:46.122
[in] pInitArgs = 0x7fff5731e5f0
flags: 2
CKF_OS_LOCKING_OK
Returned: 0 CKR_OK
2: C_GetInfo
2016-10-15 22:12:46.919
[out] pInfo:
cryptokiVersion: 2.20
manufacturerID: 'PKCS#11 Kit '
flags: 0
libraryDescription: 'PKCS#11 Kit Proxy Module '
libraryVersion: 1.1
Returned: 0 CKR_OK
3: C_GetSlotList
2016-10-15 22:12:46.919
[in] tokenPresent = 0x0
[out] pSlotList:
Count is 7
[out] *pulCount = 0x7
Returned: 0 CKR_OK
4: C_GetSlotList
2016-10-15 22:12:46.919
[in] tokenPresent = 0x0
[out] pSlotList:
Slot 16
Slot 17
Slot 18
Slot 19
Slot 20
Slot 21
Slot 22
[out] *pulCount = 0x7
Returned: 0 CKR_OK
5: C_GetSlotInfo
2016-10-15 22:12:46.919
[in] slotID = 0x10
[out] pInfo:
slotDescription: '/opt/local/share/curl/curl-ca-bu'
'ndle.crt '
manufacturerID: 'PKCS#11 Kit '
hardwareVersion: 0.23
firmwareVersion: 0.0
flags: 1
CKF_TOKEN_PRESENT
Returned: 0 CKR_OK
6: C_GetTokenInfo
2016-10-15 22:12:46.919
[in] slotID = 0x10
[out] pInfo:
label: 'Default Trust '
manufacturerID: 'PKCS#11 Kit '
model: 'p11-kit-trust '
serialNumber: '1 '
ulMaxSessionCount: 0
ulSessionCount: -1
ulMaxRwSessionCount: 0
ulRwSessionCount: -1
ulMaxPinLen: 0
ulMinPinLen: 0
ulTotalPublicMemory: -1
ulFreePublicMemory: -1
ulTotalPrivateMemory: -1
ulFreePrivateMemory: -1
hardwareVersion: 0.23
firmwareVersion: 0.0
time: ' '
flags: 402
CKF_WRITE_PROTECTED
CKF_TOKEN_INITIALIZED
Returned: 0 CKR_OK
7: C_GetSlotInfo
2016-10-15 22:12:46.919
[in] slotID = 0x11
[out] pInfo:
slotDescription: '/opt/local/etc/openssl '
' '
manufacturerID: 'PKCS#11 Kit '
hardwareVersion: 0.23
firmwareVersion: 0.0
flags: 1
CKF_TOKEN_PRESENT
Returned: 0 CKR_OK
8: C_GetTokenInfo
2016-10-15 22:12:46.920
[in] slotID = 0x11
[out] pInfo:
label: 'System Trust '
manufacturerID: 'PKCS#11 Kit '
model: 'p11-kit-trust '
serialNumber: '1 '
ulMaxSessionCount: 0
ulSessionCount: -1
ulMaxRwSessionCount: 0
ulRwSessionCount: -1
ulMaxPinLen: 0
ulMinPinLen: 0
ulTotalPublicMemory: -1
ulFreePublicMemory: -1
ulTotalPrivateMemory: -1
ulFreePrivateMemory: -1
hardwareVersion: 0.23
firmwareVersion: 0.0
time: ' '
flags: 402
CKF_WRITE_PROTECTED
CKF_TOKEN_INITIALIZED
Returned: 0 CKR_OK
9: C_GetSlotInfo
2016-10-15 22:12:46.920
[in] slotID = 0x12
[out] pInfo:
slotDescription: 'Yubico Yubikey NEO OTP+U2F+CCID '
' '
manufacturerID: 'Yubico '
hardwareVersion: 3.70
firmwareVersion: 0.0
flags: 7
CKF_TOKEN_PRESENT
CKF_REMOVABLE_DEVICE
CKF_HW_SLOT
Returned: 0 CKR_OK
10: C_GetTokenInfo
2016-10-15 22:12:46.922
[in] slotID = 0x12
[out] pInfo:
label: 'PIV Card Holder pin (PIV_II) '
manufacturerID: 'piv_II '
model: 'PKCS#15 emulated'
serialNumber: 'a0fxxxxxxxxxxxxx'
ulMaxSessionCount: 0
ulSessionCount: 0
ulMaxRwSessionCount: 0
ulRwSessionCount: 0
ulMaxPinLen: 8
ulMinPinLen: 4
ulTotalPublicMemory: -1
ulFreePublicMemory: -1
ulTotalPrivateMemory: -1
ulFreePrivateMemory: -1
hardwareVersion: 0.0
firmwareVersion: 0.0
time: ' '
flags: 40d
CKF_RNG
CKF_LOGIN_REQUIRED
CKF_USER_PIN_INITIALIZED
CKF_TOKEN_INITIALIZED
Returned: 0 CKR_OK
11: C_GetSlotInfo
2016-10-15 22:12:46.952
[in] slotID = 0x13
[out] pInfo:
slotDescription: 'SoftHSM slot ID 0x21bc4979 '
' '
manufacturerID: 'SoftHSM project '
hardwareVersion: 2.1
firmwareVersion: 2.1
flags: 1
CKF_TOKEN_PRESENT
Returned: 0 CKR_OK
12: C_GetTokenInfo
2016-10-15 22:12:46.952
[in] slotID = 0x13
[out] pInfo:
label: 'Botan PKCS#11 tests '
manufacturerID: 'SoftHSM project '
model: 'SoftHSM v2 '
serialNumber: 'b158018a21bc4979'
ulMaxSessionCount: 0
ulSessionCount: -1
ulMaxRwSessionCount: 0
ulRwSessionCount: -1
ulMaxPinLen: 255
ulMinPinLen: 4
ulTotalPublicMemory: -1
ulFreePublicMemory: -1
ulTotalPrivateMemory: -1
ulFreePrivateMemory: -1
hardwareVersion: 2.1
firmwareVersion: 2.1
time: '2016101602124600'
flags: 42d
CKF_RNG
CKF_LOGIN_REQUIRED
CKF_USER_PIN_INITIALIZED
CKF_RESTORE_KEY_NOT_NEEDED
CKF_TOKEN_INITIALIZED
Returned: 0 CKR_OK
13: C_GetSlotInfo
2016-10-15 22:12:46.953
[in] slotID = 0x14
[out] pInfo:
slotDescription: 'SoftHSM slot ID 0x2879828e '
' '
manufacturerID: 'SoftHSM project '
hardwareVersion: 2.1
firmwareVersion: 2.1
flags: 1
CKF_TOKEN_PRESENT
Returned: 0 CKR_OK
14: C_GetTokenInfo
2016-10-15 22:12:46.953
[in] slotID = 0x14
[out] pInfo:
label: 'test '
manufacturerID: 'SoftHSM project '
model: 'SoftHSM v2 '
serialNumber: '02b891b82879828e'
ulMaxSessionCount: 0
ulSessionCount: -1
ulMaxRwSessionCount: 0
ulRwSessionCount: -1
ulMaxPinLen: 255
ulMinPinLen: 4
ulTotalPublicMemory: -1
ulFreePublicMemory: -1
ulTotalPrivateMemory: -1
ulFreePrivateMemory: -1
hardwareVersion: 2.1
firmwareVersion: 2.1
time: '2016101602124600'
flags: 42d
CKF_RNG
CKF_LOGIN_REQUIRED
CKF_USER_PIN_INITIALIZED
CKF_RESTORE_KEY_NOT_NEEDED
CKF_TOKEN_INITIALIZED
Returned: 0 CKR_OK
15: C_GetSlotInfo
2016-10-15 22:12:46.953
[in] slotID = 0x15
[out] pInfo:
slotDescription: 'SoftHSM slot ID 0x2 '
' '
manufacturerID: 'SoftHSM project '
hardwareVersion: 2.1
firmwareVersion: 2.1
flags: 1
CKF_TOKEN_PRESENT
Returned: 0 CKR_OK
16: C_GetTokenInfo
2016-10-15 22:12:46.953
[in] slotID = 0x15
[out] pInfo:
label: ' '
manufacturerID: 'SoftHSM project '
model: 'SoftHSM v2 '
serialNumber: ' '
ulMaxSessionCount: 0
ulSessionCount: -1
ulMaxRwSessionCount: 0
ulRwSessionCount: -1
ulMaxPinLen: 255
ulMinPinLen: 4
ulTotalPublicMemory: -1
ulFreePublicMemory: -1
ulTotalPrivateMemory: -1
ulFreePrivateMemory: -1
hardwareVersion: 2.1
firmwareVersion: 2.1
time: '2016101602124600'
flags: c00025
CKF_RNG
CKF_LOGIN_REQUIRED
CKF_RESTORE_KEY_NOT_NEEDED
CKF_SO_PIN_LOCKED
CKF_SO_PIN_TO_BE_CHANGED
Returned: 0 CKR_OK
17: C_GetSlotInfo
2016-10-15 22:12:46.953
[in] slotID = 0x16
[out] pInfo:
slotDescription: ' Yubico Yubikey NEO OTP+U2F+CCID'
' '
manufacturerID: 'Yubico '
hardwareVersion: 1.0
firmwareVersion: 1.0
flags: 5
CKF_TOKEN_PRESENT
CKF_HW_SLOT
Returned: 0 CKR_OK
18: C_GetTokenInfo
2016-10-15 22:12:46.953
[in] slotID = 0x16
[out] pInfo:
label: 'YubiKey PIV '
manufacturerID: 'Yubico '
model: 'YubiKey NEO '
serialNumber: '1234 '
ulMaxSessionCount: -1
ulSessionCount: -1
ulMaxRwSessionCount: -1
ulRwSessionCount: -1
ulMaxPinLen: 8
ulMinPinLen: 6
ulTotalPublicMemory: -1
ulFreePublicMemory: -1
ulTotalPrivateMemory: -1
ulFreePrivateMemory: -1
hardwareVersion: 0.0
firmwareVersion: 1.4
time: ' '
flags: 40d
CKF_RNG
CKF_LOGIN_REQUIRED
CKF_USER_PIN_INITIALIZED
CKF_TOKEN_INITIALIZED
Returned: 0 CKR_OK
engine "pkcs11" set.
19: C_OpenSession
2016-10-15 22:12:46.954
[in] slotID = 0x12
[in] flags = 0x4
pApplication=0x0
Notify=0x0
[out] *phSession = 0x11
Returned: 0 CKR_OK
20: C_FindObjectsInit
2016-10-15 22:12:46.954
[in] hSession = 0x11
[in] pTemplate[1]:
CKA_CLASS CKO_CERTIFICATE
Returned: 0 CKR_OK
21: C_FindObjects
2016-10-15 22:12:46.954
[in] hSession = 0x11
[in] ulMaxObjectCount = 0x1
[out] ulObjectCount = 0x1
Object 0x7fcbfdf0af90 matches
Returned: 0 CKR_OK
22: C_GetAttributeValue
2016-10-15 22:12:46.954
[in] hSession = 0x11
[in] hObject = 0x7fcbfdf0af90
[in] pTemplate[1]:
CKA_CERTIFICATE_TYPE 00007fff5731e498 / 8
[out] pTemplate[1]:
CKA_CERTIFICATE_TYPE CKC_X_509
Returned: 0 CKR_OK
23: C_GetAttributeValue
2016-10-15 22:12:46.954
[in] hSession = 0x11
[in] hObject = 0x7fcbfdf0af90
[in] pTemplate[1]:
CKA_LABEL 0000000000000000 / 0
[out] pTemplate[1]:
CKA_LABEL 0000000000000000 / 34
Returned: 0 CKR_OK
24: C_GetAttributeValue
2016-10-15 22:12:46.954
[in] hSession = 0x11
[in] hObject = 0x7fcbfdf0af90
[in] pTemplate[1]:
CKA_LABEL 00007fcbfdf0f510 / 34
[out] pTemplate[1]:
CKA_LABEL 00007fcbfdf0f510 / 34
43657274 69666963 61746520 666F7220 50495620 41757468 656E7469 63617469
C e r t i f i c a t e . f o r . P I V . A u t h e n t i c a t i6F6E
o n
Returned: 0 CKR_OK
25: C_GetAttributeValue
2016-10-15 22:12:46.954
[in] hSession = 0x11
[in] hObject = 0x7fcbfdf0af90
[in] pTemplate[1]:
CKA_VALUE 0000000000000000 / 0
[out] pTemplate[1]:
CKA_VALUE 0000000000000000 / 932
Returned: 0 CKR_OK
26: C_GetAttributeValue
2016-10-15 22:12:46.954
[in] hSession = 0x11
[in] hObject = 0x7fcbfdf0af90
[in] pTemplate[1]:
CKA_VALUE 00007fcbfdf0f6d0 / 932
[out] pTemplate[1]:
CKA_VALUE 00007fcbfdf0f6d0 / 932
00000000 30 82 03 A0 30 82 02 08 A0 03 02 01 02 02 04 57 0...0..........W
00000010 A9 DF B6 30 0D 06 09 2A 86 48 86 F7 0D 01 01 0B ...0...*.H......
. . . . .
98: C_Sign
2016-10-15 22:13:17.458
[in] hSession = 0x11
[in] pData[ulDataLen] 00007fcbfdc0ef50 / 256
00000000 08 11 D9 8F F6 69 15 9F 02 6E EC 1A EE 5F 80 E5 .....i...n..._..
00000010 9A F7 38 07 5A 1B B9 CD BF 34 E7 3C D9 AD FB 2C ..8.Z....4.<...,
. . . . .
000000F0 B9 E5 7C 92 53 D6 65 46 88 45 EA 8E EE 48 F5 BC ..|.S.eF.E...H..
[out] pSignature[*pulSignatureLen] 00007fcbff000000 / 256
00000000 8E 46 DF 0D 0A 8D ED F1 A3 6D 6C AC 0E 8B EE 5E .F.......ml....^
00000010 D9 E5 47 E3 75 3F 13 E8 A6 FA 96 13 CD 2A 33 AE ..G.u?.......*3.
. . . . .
000000F0 43 5F 04 C2 62 F2 BA 1F 0A F4 C5 A0 AA D4 C5 A2 C_..b...........
Returned: 0 CKR_OK
99: C_CloseAllSessions
2016-10-15 22:13:18.129
[in] slotID = 0x10
Returned: 0 CKR_OK
100: C_CloseAllSessions
2016-10-15 22:13:18.129
[in] slotID = 0x11
Returned: 0 CKR_OK
101: C_CloseAllSessions
2016-10-15 22:13:18.129
[in] slotID = 0x12
Returned: 0 CKR_OK
102: C_CloseAllSessions
2016-10-15 22:13:18.129
[in] slotID = 0x13
Returned: 0 CKR_OK
103: C_CloseAllSessions
2016-10-15 22:13:18.129
[in] slotID = 0x14
Returned: 0 CKR_OK
104: C_CloseAllSessions
2016-10-15 22:13:18.129
[in] slotID = 0x15
Returned: 0 CKR_OK
105: C_CloseAllSessions
2016-10-15 22:13:18.129
[in] slotID = 0x16
Returned: 0 CKR_OK
106: C_Finalize
2016-10-15 22:13:18.129
^C
$
The problem appears to be related to p11-kit
interaction with libp11
and/or the actual PKCS#11 library.
The standard way to proceed would be to attach a debugger and display the stack backtrace...
The standard way to proceed would be to attach a debugger and display the stack backtrace...
But there is no crash... Would you like me to attach debugger to openssl
, and capture the backtrace after I hit ^C
? Intuitively I wouldn't expect anything of much use from this...?
No. I want to you attach the debugger to the running application and capture the stack backtrace while it hangs.
Please let me know if you need help attaching a debugger to a running process.
Turned out that the offending module was softhsm2.module
in /opt/local/share/p11-kit/modules/
. Removing that file (and relocating it elsewhere) remedied the problem.
Now the question is - what (if anything) do we do with that module?
Please let me know if you need help attaching a debugger to a running process.
Yep, I do. Especially since on Mac OS X gdb doesn't quite work, and lldb is somewhat weird (I haven't gotten the hang of it). I haven't tried debugger on Sierra at all, suspect it would be worse than on the previous versions.
Is that opendnssec/SoftHSMv2#239 ?
I don't think so because (a) I don't experience crash, and (b) the fix has been merged 26 days ago, but I experience that hangup now (using the merged/fixed SoftHSMv2 master).
@mtrojnar perhaps you were a bit rush to label this "invalid" - we don't know yet where exactly the problem lies. We only know that it does not occur without SoftHSMv2.
@mouse07410 The stack backtrace would be helpful.
I'll be happy to provide, but I'll need some guidance (https://github.com/OpenSC/libp11/issues/123#issuecomment-254079560). It's MacOS Sierra, Xcode-8.0, current Github master levels of all the OpenSC-related software.
@mouse07410 Here is my lldb micro-tutorial:
$ sleep 1000 &
[1] 299
$ lldb -p 299
(lldb) process attach --pid 299
Process 299 stopped
Executable module set to "/bin/sleep".
Architecture set to: x86_64-apple-macosx.
(lldb) bt
* thread #1: tid = 0x0aa3, 0x00007fff88777a3a libsystem_kernel.dylib`__semwait_signal + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
* frame #0: 0x00007fff88777a3a libsystem_kernel.dylib`__semwait_signal + 10
frame #1: 0x00007fff88869dc0 libsystem_c.dylib`nanosleep + 200
frame #2: 0x000000010ba65e2a sleep`___lldb_unnamed_function1$$sleep + 381
frame #3: 0x00007fff944255fd libdyld.dylib`start + 1
frame #4: 0x00007fff944255fd libdyld.dylib`start + 1
(lldb)
Hope this is of use. One terminal window:
$ openssl dgst -engine pkcs11 -keyform engine -sign "pkcs11:manufacturer=piv_II;object=SIGN%20key;type=private" -sha384 -sigopt rsa_padding_mode:pss -sigopt rsa_pss_saltlen:-1 -out t256.dat.sig t256.dat
engine "pkcs11" set.
PKCS#11 token PIN:
[process is hanging there]
Another window:
$ ps
PID TTY TIME CMD
1122 ttys000 0:00.86 -bash
1207 ttys001 0:03.21 -bash
37087 ttys001 0:00.03 openssl dgst -engine pkcs11 -keyform engine -sign pkcs11:manufacturer=piv_II;o
24377 ttys002 0:03.40 -bash
$ lldb -p 37087
(lldb) process attach --pid 37087
Process 37087 stopped
* thread #1: tid = 0x2f9cf, 0x00007fffe4e3bcba libsystem_kernel.dylib`__psynch_mutexwait + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
frame #0: 0x00007fffe4e3bcba libsystem_kernel.dylib`__psynch_mutexwait + 10
libsystem_kernel.dylib`__psynch_mutexwait:
-> 0x7fffe4e3bcba <+10>: jae 0x7fffe4e3bcc4 ; <+20>
0x7fffe4e3bcbc <+12>: movq %rax, %rdi
0x7fffe4e3bcbf <+15>: jmp 0x7fffe4e34d6f ; cerror_nocancel
0x7fffe4e3bcc4 <+20>: retq
Executable module set to "/opt/local/bin/openssl".
Architecture set to: x86_64h-apple-macosx.
(lldb) bt
* thread #1: tid = 0x2f9cf, 0x00007fffe4e3bcba libsystem_kernel.dylib`__psynch_mutexwait + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
* frame #0: 0x00007fffe4e3bcba libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fffe4f24f6a libsystem_pthread.dylib`_pthread_mutex_lock_wait + 100
frame #2: 0x0000000109ef48d7 p11-kit-proxy.dylib`lock_mutex + 14
frame #3: 0x000000010a2b688b libsofthsm2.dylib`Mutex::lock() + 123
frame #4: 0x0000000109d38e41 libcrypto.1.0.0.dylib`ENGINE_finish + 49
frame #5: 0x000000010a2c03c1 libsofthsm2.dylib`OSSLCryptoFactory::~OSSLCryptoFactory() + 33
frame #6: 0x000000010a2c046e libsofthsm2.dylib`OSSLCryptoFactory::~OSSLCryptoFactory() + 14
frame #7: 0x000000010a285a6e libsofthsm2.dylib`SoftHSM::C_Finalize(void*) + 158
frame #8: 0x000000010a2649e9 libsofthsm2.dylib`C_Finalize + 25
frame #9: 0x0000000109ef3b52 p11-kit-proxy.dylib`finalize_module_inlock_reentrant + 107
frame #10: 0x0000000109ef42a9 p11-kit-proxy.dylib`managed_C_Finalize + 244
frame #11: 0x0000000109f09dcc p11-kit-proxy.dylib`binding_C_Finalize + 21
frame #12: 0x0000000109f4a590 libffi.6.dylib`ffi_closure_unix64_inner + 617
frame #13: 0x0000000109f4aa96 libffi.6.dylib`ffi_closure_unix64 + 70
frame #14: 0x0000000109ef2fad p11-kit-proxy.dylib`p11_kit_modules_finalize + 56
frame #15: 0x0000000109ef6fe9 p11-kit-proxy.dylib`proxy_free + 27
frame #16: 0x0000000109ef574b p11-kit-proxy.dylib`proxy_C_Finalize + 198
frame #17: 0x0000000109f09dcc p11-kit-proxy.dylib`binding_C_Finalize + 21
frame #18: 0x0000000109f4a590 libffi.6.dylib`ffi_closure_unix64_inner + 617
frame #19: 0x0000000109f4aa96 libffi.6.dylib`ffi_closure_unix64 + 70
frame #20: 0x0000000109ed8b71 libpkcs11.dylib`pkcs11_CTX_unload + 33
frame #21: 0x0000000109ed40c0 libpkcs11.dylib`pkcs11_finish + 64
frame #22: 0x0000000109d38cfd libcrypto.1.0.0.dylib`engine_unlocked_finish + 109
frame #23: 0x0000000109d39af2 libcrypto.1.0.0.dylib`int_cleanup_cb_LHASH_DOALL + 34
frame #24: 0x0000000109d48c0a libcrypto.1.0.0.dylib`lh_doall + 74
frame #25: 0x0000000109d39a98 libcrypto.1.0.0.dylib`engine_table_cleanup + 56
frame #26: 0x0000000109d381bb libcrypto.1.0.0.dylib`engine_cleanup_cb_free + 11
frame #27: 0x0000000109d482f0 libcrypto.1.0.0.dylib`sk_pop_free + 48
frame #28: 0x0000000109d3819c libcrypto.1.0.0.dylib`ENGINE_cleanup + 28
frame #29: 0x0000000109b9f072 openssl`main + 1122
frame #30: 0x00007fffe4d0d255 libdyld.dylib`start + 1
frame #31: 0x00007fffe4d0d255 libdyld.dylib`start + 1
(lldb)
Yes, there is a deadlock in the OpenSSL engine code:
/* The API (locked) version of "finish" */
int ENGINE_finish(ENGINE *e)
{
int to_return = 1;
if (e == NULL) {
ENGINEerr(ENGINE_F_ENGINE_FINISH, ERR_R_PASSED_NULL_PARAMETER);
return 0;
}
CRYPTO_w_lock(CRYPTO_LOCK_ENGINE);
to_return = engine_unlocked_finish(e, 1);
CRYPTO_w_unlock(CRYPTO_LOCK_ENGINE);
if (!to_return) {
ENGINEerr(ENGINE_F_ENGINE_FINISH, ENGINE_R_FINISH_FAILED);
return 0;
}
return to_return;
}
when the engine interface is invoked from within an engine interface.
In our case, libsofthsm2
also uses OpenSSL engines...
I struggled with this issue a few months ago in pkcs11_init()
: https://github.com/OpenSC/libp11/compare/dc3a2d2242583535d97d19335e94b08d0731adb6...9e1147ba600c9b354f1547d4039dd789b9fa3055
A similar change may be needed in pkcs11_finish()
.
The problem is where to postpone pkcs11_finish()
. To atexit()
? Any other ideas?
:-( No idea.
Might be worth to give it a try and move pkcs11_finish()
to atexit()
...?
I've no better idea (unless you count just doing engine_unlocked_finish(e, 1);
without locking at all as one :) )
engine_unlocked_finish()
is not an exported symbol...
So let's try your first suggestion and see if it works? Would you like to push it to master?
I think I found a better solution. Give me a few hours.
$ ifrun openssl
uri 51181 0.0 0.1 2476976 4396 s001 S+ 10:07PM 0:00.03 openssl dgst -engine pkcs11 -keyform engine -sign pkcs11:manufacturer=piv_II;object=SIGN%20key;type=private -sha384 -sigopt rsa_padding_mode:pss -sigopt rsa_pss_saltlen:-1 -out t256.dat.sig t256.dat
$ lldb -p 51181
(lldb) process attach --pid 51181
Process 51181 stopped
* thread #1: tid = 0x5a383, 0x00007fffe4e3bcba libsystem_kernel.dylib`__psynch_mutexwait + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
frame #0: 0x00007fffe4e3bcba libsystem_kernel.dylib`__psynch_mutexwait + 10
libsystem_kernel.dylib`__psynch_mutexwait:
-> 0x7fffe4e3bcba <+10>: jae 0x7fffe4e3bcc4 ; <+20>
0x7fffe4e3bcbc <+12>: movq %rax, %rdi
0x7fffe4e3bcbf <+15>: jmp 0x7fffe4e34d6f ; cerror_nocancel
0x7fffe4e3bcc4 <+20>: retq
Executable module set to "/opt/local/bin/openssl".
Architecture set to: x86_64h-apple-macosx.
(lldb) bt
* thread #1: tid = 0x5a383, 0x00007fffe4e3bcba libsystem_kernel.dylib`__psynch_mutexwait + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
* frame #0: 0x00007fffe4e3bcba libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fffe4f24f6a libsystem_pthread.dylib`_pthread_mutex_lock_wait + 100
frame #2: 0x00000001041498d7 p11-kit-proxy.dylib`lock_mutex + 14
frame #3: 0x000000010450b88b libsofthsm2.dylib`Mutex::lock() + 123
frame #4: 0x0000000103f8ce41 libcrypto.1.0.0.dylib`ENGINE_finish + 49
frame #5: 0x00000001045153c1 libsofthsm2.dylib`OSSLCryptoFactory::~OSSLCryptoFactory() + 33
frame #6: 0x000000010451546e libsofthsm2.dylib`OSSLCryptoFactory::~OSSLCryptoFactory() + 14
frame #7: 0x00000001044daa6e libsofthsm2.dylib`SoftHSM::C_Finalize(void*) + 158
frame #8: 0x00000001044b99e9 libsofthsm2.dylib`C_Finalize + 25
frame #9: 0x0000000104148b52 p11-kit-proxy.dylib`finalize_module_inlock_reentrant + 107
frame #10: 0x00000001041492a9 p11-kit-proxy.dylib`managed_C_Finalize + 244
frame #11: 0x000000010415edcc p11-kit-proxy.dylib`binding_C_Finalize + 21
frame #12: 0x000000010419f590 libffi.6.dylib`ffi_closure_unix64_inner + 617
frame #13: 0x000000010419fa96 libffi.6.dylib`ffi_closure_unix64 + 70
frame #14: 0x0000000104147fad p11-kit-proxy.dylib`p11_kit_modules_finalize + 56
frame #15: 0x000000010414bfe9 p11-kit-proxy.dylib`proxy_free + 27
frame #16: 0x000000010414a74b p11-kit-proxy.dylib`proxy_C_Finalize + 198
frame #17: 0x000000010415edcc p11-kit-proxy.dylib`binding_C_Finalize + 21
frame #18: 0x000000010419f590 libffi.6.dylib`ffi_closure_unix64_inner + 617
frame #19: 0x000000010419fa96 libffi.6.dylib`ffi_closure_unix64 + 70
frame #20: 0x000000010412db71 libpkcs11.dylib`pkcs11_CTX_unload + 33
frame #21: 0x0000000104129034 libpkcs11.dylib`pkcs11_destroy + 68
frame #22: 0x0000000104128c1d libpkcs11.dylib`engine_destroy + 109
frame #23: 0x0000000103f8c08c libcrypto.1.0.0.dylib`engine_free_util + 140
frame #24: 0x0000000103f8c877 libcrypto.1.0.0.dylib`ENGINE_remove + 199
frame #25: 0x0000000103f8cc25 libcrypto.1.0.0.dylib`engine_list_cleanup + 21
frame #26: 0x0000000103f8c1bb libcrypto.1.0.0.dylib`engine_cleanup_cb_free + 11
frame #27: 0x0000000103f9c2f0 libcrypto.1.0.0.dylib`sk_pop_free + 48
frame #28: 0x0000000103f8c19c libcrypto.1.0.0.dylib`ENGINE_cleanup + 28
frame #29: 0x0000000103df7072 openssl`main + 1122
frame #30: 0x00007fffe4d0d255 libdyld.dylib`start + 1
(lldb)
On the other hand,
$ export PKCS11_MODULE_PATH=/opt/local/lib/p11-kit
p11-kit/ p11-kit-proxy.dylib p11-kit-proxy.so
$ export PKCS11_MODULE_PATH=/opt/local/lib/p11-kit-proxy.dylib
$ openssl dgst -engine pkcs11 -keyform engine -sign "pkcs11:model=YubiKey%20NEO;manufacturer=Yubico;token=YubiKey%20PIV;id=%02;type=private" -sha384 -sigopt rsa_padding_mode:pss -sigopt rsa_pss_saltlen:-1 -out t256.dat.sig t256.dat
engine "pkcs11" set.
PKCS#11 token PIN:
$ openssl dgst -engine pkcs11 -keyform engine -sha384 -verify "pkcs11:manufacturer=piv_II;object=SIGN%20pubkey;type=public" -sigopt rsa_padding_mode:pss -sigopt rsa_pss_saltlen:-1 -signature t256.dat.sig t256.dat
engine "pkcs11" set.
Verified OK
$
I think this time you did it:
$ openssl dgst -engine pkcs11 -keyform engine -sign "pkcs11:model=YubiKey%20NEO;manufacturer=Yubico;token=YubiKey%20PIV;id=%02;type=private" -sha384 -sigopt rsa_padding_mode:pss -sigopt rsa_pss_saltlen:-1 -out t256.dat.sig t256.dat
engine "pkcs11" set.
PKCS#11 token PIN:
$ openssl dgst -engine pkcs11 -keyform engine -sha384 -verify "pkcs11:manufacturer=piv_II;object=SIGN%20pubkey;type=public" -sigopt rsa_padding_mode:pss -sigopt rsa_pss_saltlen:-1 -signature t256.dat.sig t256.dat
engine "pkcs11" set.
Verified OK
$ openssl dgst -engine pkcs11 -keyform engine -sign "pkcs11:manufacturer=piv_II;object=SIGN%20key;type=private" -sha384 -sigopt rsa_padding_mode:pss -sigopt rsa_pss_saltlen:-1 -out t256.dat.sig t256.dat
engine "pkcs11" set.
PKCS#11 token PIN:
$ openssl dgst -engine pkcs11 -keyform engine -sha384 -verify "pkcs11:manufacturer=piv_II;object=SIGN%20pubkey;type=public" -sigopt rsa_padding_mode:pss -sigopt rsa_pss_saltlen:-1 -signature t256.dat.sig t256.dat
engine "pkcs11" set.
Verified OK
$ echo $PKCS11_MODULE_PATH
/opt/local/lib/p11-kit-proxy.dylib
$
And without having to change p11-kit
. ;-)
@mouse07410 Would you mind testing the pkcs11 engine for this deadlock with my patch applied? https://github.com/OpenSC/libp11/pull/138
I tested it with softHSM2 on Windows 10 using patched pkcs11 engine, openssl 1.0.2j compiled by Visual Studio 2015 and command:
openssl dgst -engine pkcs11 -keyform engine -sign slot_0-id_1234567890ab -sha384 -sigopt rsa_padding_mode:pss -sigopt rsa_pss_saltlen:-1 -out c:\tmp\derive.26192.text.sig sign.dgst.bin
and it asks for PIN, then exits correctly.
Would you mind testing the pkcs11 engine for this deadlock with my patch applied?
@nased0 I just did, and your patch restores the deadlock, working with a real token.
Removing these lines
if (cpriv->handle) {
pkcs11_CTX_unload(ctx);
}
removed the deadlock.
OpenSSL 1.0.2j compiled by Xcode-8.1 on Mac OS X El Capitan 10.11.6.
I confirm that your patch works for SoftHSMv2 (which I care less about), but it does not work for physical tokens (which I care for a lot).
You mean YubiKey USB plug? That's weird. There must be something wrong with its CRYPTOKI user mode driver. I tested it with Enigma Encard and Unizeto Certum and it worked correctly. I'd like to ask you for two more tests: 1: Replace these lines by
if (cpriv->handle) { C_UnloadModule(cpriv->handle); }
2: Replace these lines by
if (cpriv->handle) { OPENSSL_free((struct sc_pkcs11_module {unsigned int _magic;void handle;} )cpriv->handle); }
There must be something wrong with its CRYPTOKI user mode driver.
I don't use CRYPTOKI. :-)
if (cpriv->handle) {
C_UnloadModule(cpriv->handle);
}
causes SEGV.
if (cpriv->handle) {
OPENSSL_free((struct sc_pkcs11_module {unsigned int _magic;void *handle;} *)cpriv->handle);
}
appears to work. I must add that the above line is an abomination, but it did not cause either crash or deadlock. :-)
Must also add that all of those attempts seemed to work fine with SoftHSMv2. So apparently SoftHSMv2 is not a very reliable test.
P.S. For what it's worth, tested it also with the current OpenSSL-1.1-stable, for the kicks. It worked.
@nased0 changing that weird line to OPENSSL_free(cpriv->handle);
appears to work, at least with OpenSSL-1.0.2j. Will try 1.1 tomorrow.
Update Yep, appears to work on the current 1.1.-stable. Congrats, no deadlock, no crash (so far :), and (hopefully :) no memory leak.
I think that there should be a configure parameter proper-ctx-unload that sets a preprocessor definition PROPER_CTX_UNLOAD. This way developers will be able to decide what is more important for them: proper module deinitialization and unloading it from memory or supporting devices with module that causes deadlock (default mode). I do not plan to support modules that cause deadlock, only a few selected PKCS#11 cards.
Suggested corrections in https://github.com/OpenSC/libp11/commit/689bd3cc852bcd3e3e1c15f42949a4a577e29dbb (in conjunction with my patch https://github.com/OpenSC/libp11/pull/139) int pkcs11_finish(ENGINE_CTX *ctx) ... if (ctx->pkcs11_ctx) { **#ifdef PROPER_CTX_UNLOAD PKCS11_CTX_unload(ctx->pkcs11_ctx);
PKCS11_CTX_free(ctx->pkcs11_ctx); ctx->pkcs11_ctx = NULL; } ...
I think that there should be a configure parameter no-ctx-unload that sets a preprocessor definition NO_CTX_UNLOAD. This way developers will be able to decide what is more important for them: proper module deinitialization and unloading it from memory (default mode) or supporting devices with module that causes deadlock
With all due respect, this does not make any sense. There is no "Yubikey module", unless you count the PIV driver from OpenSC. So your "troublesome devices" are all PIV-compliant cards.
And I thought your patch fixed memory leak? if not - what was the point of going through all that fun, trying different ways of freeing cpriv->handle
?
I do not plan to support PIV driver from OpenSC in my app. Besides, IMO it should be corrected to avoid deadlock, commenting PKCS11_CTX_unload is an ugly hack! So if you insist, I will change the previous post, to non-default mode for PKCS11_CTX_unload. Yes, my current patch fixed small memory leak in supporting data structure, but the entire module still resides in memory, unfinalized and not unloaded! It may be acceptable for your needs, but I'm developing a service application, that should not have any memory leaks or other problems that threaten its stability (like not finalizing and unloading a module)! It is not an academic problem - previously I used old version 0.2.8 of engine_pkcs11.dll that caused hangs after about 600 curl communication sessions.
@nased0 frankly I lost interest. As for me PIV is the main standard for cards, and it's the other devices (like Unizeto Certum) that in my view are "troublesome" (and these devices I don't care about).
@mtrojnar I don't know what's the actual cause of that deadlock. I seem to recall that it was related to both p11-kit
and SoftHSMv2 being present, and p11-kit
being configured to support SoftHSMv2 (that in turn seemed to use more of OpenSSL than the main opensc-pkcs11.so
, which seemed to lead to this deadlock). I'm happy with the existing solution and the existing state of libp11
code (for both 1.0.2+ and 1.1+). I certainly wouldn't welcome a change that would bring deadlock back.
@dengert do you have any opinion on this deadlock issue?
Do I have an option? I have not paid much attention to this issue, as it is one MacOS, and also uses p11-kit. I dont have MacOS. p11-kit introduces its own set of issues as it can load multiple PKCS#11 modules each of which may have their own issues and in some cases multiple PKCS#11 modules may try and support the the same smart card.
As for "There is no "Yubikey module"" well there are at least two and maybe more on MacOS. There is the opensc-pkcs11 that supports the Yubikey as PIV (or openPGP.) There is also the Yubico YKCS11 libypkcs11.so part of the yubico-piv-tool.
@nased0 when you say "I do not plan to support PIV driver from OpenSC in my app." What are you planing on supporting? Are you saying you will not use the opensc-pkcs11.so? What pkcs#11 libs are you going to use?
As for "There is no "Yubikey module"" well there are at least two and maybe more on MacOS.
Well, these are modules that can support Yubikey: :-)
opensc-pkcs11.so
that supports PIV. That automatically handles Yubikey, newer CAC, and other cards.pivtoken
- native Apple CTK-based tokend that supports PIV only, and happens to support Yubikey if it is correctly provisioned.libykcs11.so
that Yubico provided specifically for Yubikey (I think it's PIV only, but it doesn't really matter - the deadlock problem was traced to SoftHSMv2 libsofthsm2.so
). As it chose to name keys differently from opensc-pkcs11.so
, I don't really use it.As you see, only one of them is a specific Yubikey-dedicated module (and it's the least used, if at all). Others are PIV or PIV+ (PIV and other types).
@dengert Currently the Encard Enigma enigmap11.dll and Unizeto Certum dll's crypto3PKCS.dll (for common profile) and cryptoCertum3PKCS.dll (for safe profile) that implement CRYPTOKI functions are supported. https://docs.oracle.com/cd/E23823_01/html/816-4863/chapter2-10.html I plan on using pkcs11.dll and libp11-2.dll from this project. I think that I should have the option to build a proper engine without hacks, that are important only for a few users.
@nased0
So your and @mouse07410 are talking different platforms and different PKCS#11 modules.
Is your problem windows specific?
Do we need some #ifdefs for windows here?
Is this 32 or 64 bit?
I see a number of warning messages for 64 bit in latest windows 64 bit build. https://ci.appveyor.com/project/LudovicRousseau/libp11/build/0.4.0.207
Do we need some other windows testers to see if they can reproduce your problems using other PKCS#11 modules on Windows? (not me.)
My application is for 32-bit Windows (XP to 10) My problem is that line 277 in file src/eng_back.c in function pkcs11_finish(ENGINE_CTX *ctx) PKCS11_CTX_unload(ctx->pkcs11_ctx); has been commented by commit https://github.com/OpenSC/libp11/commit/689bd3cc852bcd3e3e1c15f42949a4a577e29dbb to avoid deadlock. Therefore the module is not finalized (by pkcs11_CTX_unload) and not unloaded from memory (by C_UnloadModule from libpkcs11.c), causing memory leak and potentially other problems with module stability. I don't have any problems with the presence of active line 277, so there is nothing to reproduce in this case.
@mtrojnar "workaround" is just a hack, that solved specific @mouse07410 problem. I think that this problem can be properly solved only at the source, that means opensc-pkcs11.so or other module involved in this deadlock.
I don't have any problems with the presence of active line 277,
Great for you. Unfortunately, others (including myself) do have a problem - the above-described deadlock.
so there is nothing to reproduce in this case
This whole issue/thread was about reproducing this problem. It was confirmed by others.
I must add that the small memory leak does not bother me and my applications at all - I don't tend to load/unload modules a lot.
Before submitting to opensc, The initial problem has p11-kit loading 4 PKCS#11 modules: p11-kit-trust, opensc, softhsm and ypkcs11.
@mouse07410 you will the need to show where the deadlock is. Eliminating these modules one be one would be a start. There is going to be a lot of finger pointing between libp11, p11-kit, opensc, softhsm and ypkcs11 and openssl if you don't.
I believe softhsm uses some openssl engine code, as well as libp11 The problems could between these two. A little finger pointing:-)
Since you are using a Yubico as a PIV, you could try and use just the ypkcs11 and eliminate opensc. A little more finger pointing :-)
The smaller the set of modules in a failing case the easier it will be to find your problem.
If you can reproduce your problem on a Linux system would help not point at MacOS.
@mouse07410 you will the need to show where the deadlock is. Eliminating these modules one be one would be a start. There is going to be a lot of finger pointing between libp11, p11-kit, opensc, softhsm and ypkcs11 and openssl if you don't.
It was really simple. p11-kit
includes SoftHSMv2 in its default config. When I removed SoftHSMv2 module from the p11-kit config, the deadlock disappeared. I think @mtrojnar confirmed that the deadlock was caused by SoftHSMv2's use of the openssl engine code (unlike what other PKCS#11 libraries do, which is why they didn't cause a deadlock among themselves).
Since you are using a Yubico as a PIV, you could try and use just the ypkcs11 and eliminate opensc.
No need to - it was shown that ykcs11 coexists fine with opensc-pkcs11.
If you can reproduce your problem on a Linux system would help not point at MacOS.
Understood. Unfortunately, I don't have a working Linux system any more. :-(
Not to mention that I don't know if p11-kit
exists there (it probably should, as it's a FreeBSD port, but again, I just don't know).
If p11-kit
port does exist on Linux - perhaps you could try it with opensc-pkcs11.so
and libsofthsm2.so
?
The short version is:
OpenSSL uses a global non-recursive lock (global_engine_lock
for OpenSSL 1.1.x, and CRYPTO_LOCK_ENGINE
for older versions) to protect the engine's "new"/"free" operations.
This causes a deadlock (at least on some platforms) when engine initialization is performed recursively. The PKCS#11 engine is affected, because some PKCS#11 modules use the engine interface internally. We implemented ugly workarounds for both "new" (we delay initialization until first use) and "free" (we never unload PKCS#11 modules).
IMHO the proper solution would be for the OpenSSL to only protect managing its internal data structures with the global lock, and not the actual engine invocation (e->init(e)
/e->destroy(e)
).
One simple thing we can do is to restore unloading the PKCS#11 module when OpenSSL is older than 1.1.0, and the OpenSSL's locking interface is uninitialized. A deadlock can only manifest if the application had initialized threads... What do you think? Would it help with your code?
Thanks for the clear explanation. IMO it would be best to repair this deadlock properly in OpenSSL itself. I wrote to Matt Caswell about my other problem with crashing atexit on Windows and he repaired it (https://github.com/curl/curl/issues/1055) so I think that he may help with this problem too. It is not pkcs11 engine that has deadlock after all, but the module's C_Finalize method.
One simple thing we can do is to restore unloading the PKCS#11 module when OpenSSL is older than 1.1.0, and the OpenSSL's locking interface is uninitialized
@mtrojnar I'm not so sure, because the deadlock I observed was with OpenSSL-1.0.2j - definitely older than 1.1.0.
A deadlock can only manifest if the application had initialized threads...
My simplest application was openssl
command-line tool performing digital signature using libp11
to access PKCS#11 token. Does openssl dgst
initialize threads in 1.0.2?
Does openssl dgst initialize threads in 1.0.2?
I thought it doesn't, but apparently I was wrong...
The following is more questions then solutions ...
We are dealing with Dynamically Loaded Objects/Modules. When I used to work with Solaris, I would run into issues similar to what we are seeing. Solaris had a way to link a module so that it was self contained and used the libraries it required. libtool was changed to add:
module_cmds="$archive_cmds \${wl}-B \${wl}direct \${wl}-z \${wl}defs"
This would allow each module to have its own version of a library and its own copy of the library static common data. I think with OpenSSL this would mean the "global lock" is global for the module only.
Other modules using OpenSSL would have their own "global Lock".
Does the engine code in OpenSSL make this impossible?
Other OSs have similar linker capabilities, with flags to dlopen to make sure the module was self contained.
pkcs11 modules have some of the issues as PAM modules, Apache modules or browser plugins. Are we missing something by not trying to use the linker to make modules self contained?
An example in Fedora for use with coolkey: http://www.redhat.com/archives/fedora-directory-commits/2007-August/msg00027.html
uses: module_cmds="\$CC \$allow_undefined_flag -o \$lib -bundle \$libobjs \$deplibs\$compiler_flags"
@dengert I know that MacOS includes libtool
, but the flags (and I suspect, capabilities) between Linux and MacOS linkers are not the same. :-(
$ libtool --help
error: libtool: unknown option character `-' in: --help
Usage: libtool -static [-] file [...] [-filelist listfile[,dirname]] [-arch_only arch] [-sacLT] [-no_warning_for_no_symbols]
Usage: libtool -dynamic [-] file [...] [-filelist listfile[,dirname]] [-arch_only arch] [-o output] [-install_name name] [-compatibility_version #] [-current_version #] [-seg1addr 0x#] [-segs_read_only_addr 0x#] [-segs_read_write_addr 0x#] [-seg_addr_table <filename>] [-seg_addr_table_filename <file_system_path>] [-all_load] [-noall_load]
$
Does not help. libtool is generated by autoconfig. Look at libtool script in build directory and see what it has for module_cmds= on your systems.
Does not help. libtool is generated by autoconfig.
Well, not quite. I mean - yes and no. :-)
$ type -all libtool
libtool is /opt/local/bin/libtool
libtool is /usr/bin/libtool
$
Just ran into this trying to merge in latest changes from libp11 (which I have tried previously and failed, but only now realized this was the reason). Man, this is really unfortunate.
Crux of it seems that if you're using the engine interface there is no way anymore to tickle libp11 to call C_Finalize for you. In my automated tests, I setup and teardown new softhsm instances per test run, and not being able to shut it down all the way to avoid state bleeding between them is no bueno.
I suppose I could just load the library myself and call C_Finalize on it. /shrug
Note: much of the reasoning about this change being okay seems to be that "a memory leak is not that bad". Fair enough about the module not being unloaded.
I just wanted to mention that the real downside here is that C_Finalize is not being called, and thus the entire pkcs11 module (e.g. softhsm2) is left open, which is not only itself a much bigger leak, but is also an ownership problem.
@matthauck Because of this problem I have to build pkcs11 engine for my purposes with this line uncommented: https://github.com/OpenSC/libp11/blob/master/src/eng_back.c#L358
Right. If I take in this change, then I will have to patch out that line too. I'm just not convinced this is the right solution for libp11 as a library.
@mtrojnar Are you okay with leaving this as-is going forward? Not being able to have C_Finalize get called seems unrelated to the engine issues you're referring to above. What if we changed this line to call C_Finalize instead of unloading the module as a middle ground until openssl locking is fixed on their end? (And is there even an open issue with openssl that we can track against? If not, I think you're the best person who understands this situation to open an issue)
Mac OS X. Weird situation. I suspect it is related to cleaning the appropriate PKCS11 module. Would like help proving it (and obviously, fixing it :).
Update: it hangs after the requested operation (signature in this case) has been completed, so the signature file is produced and written to disk (before the application freezes up).
It hangs up when invoked "normally":
When I invoke it with PKCSSPY, however, it completes correctly (
PKCS11SPY=/Library/OpenSC/lib/opensc-pkcs11.dylib
):The resulting signature is verifiable in both cases (when it completed normally via PKCS11SPY, and when I killed the hanging openssl process):
This is probably relevant: