Open ionut-arm opened 2 years ago
With RUST_LOG=trace after re-inserting a USB HSM module:
parsec-tool:
# RUST_LOG=trace parsec-tool -p 2 create-rsa-key -k anta-11-new
[DEBUG] Parsec BasicClient created with implicit provider "Mbed Crypto provider" and authentication data "UnixPeerCredentials"
[INFO ] Creating RSA encryption key...
[DEBUG] Running getuid
[ERROR] Subcommand failed: there was a communication failure inside the implementation (ParsecClientError(Service(PsaErrorCommunicationFailure)))
parsec:
[TRACE parsec_service::front::front_end] handle_request ingress
[INFO parsec_service::front::front_end] New request received from application name "0"
[TRACE parsec_service::back::dispatcher] dispatch_request ingress
[TRACE parsec_service::back::backend_handler] execute_request ingress
[TRACE parsec_service::providers::pkcs11] psa_generate_key ingress
[ERROR parsec_service::providers::pkcs11::utils] Error converted to PsaErrorCommunicationFailure; Error: Some horrible, unrecoverable error has occurred. In the worst case, it is possible that the function only partially succeeded, and that the computer and/or token is in an inconsistent state.
[TRACE parsec_service::back::dispatcher] execute_request egress
[TRACE parsec_service::front::front_end] dispatch_request egress
[INFO parsec_service::front::front_end] Response for application name "0" sent back
[TRACE parsec] handle_request egress
From the spec:
5.1.1 Universal Cryptoki function return values
Any Cryptoki function can return any of the following values:
· CKR_GENERAL_ERROR: Some horrible, unrecoverable error has occurred. In the worst case, it is possible that the function only partially succeeded, and that the computer and/or token is in an inconsistent state.
· CKR_HOST_MEMORY: The computer that the Cryptoki library is running on has insufficient memory to perform the requested function.
· CKR_FUNCTION_FAILED: The requested function could not be performed, but detailed information about why not is not available in this error return. If the failed function uses a session, it is possible that the CK_SESSION_INFO structure that can be obtained by calling C_GetSessionInfo will hold useful information about what happened in its ulDeviceError field. In any event, although the function call failed, the situation is not necessarily totally hopeless, as it is likely to be when CKR_GENERAL_ERROR is returned. Depending on what the root cause of the error actually was, it is possible that an attempt to make the exact same function call again would succeed.
· CKR_OK: The function executed successfully. Technically, CKR_OK is not quite a “universal” return value; in particular, the legacy functions C_GetFunctionStatus and C_CancelFunction (see Section 5.15) cannot return CKR_OK.
The relative priorities of these errors are in the order listed above, e.g., if either of CKR_GENERAL_ERROR or CKR_HOST_MEMORY would be an appropriate error return, then CKR_GENERAL_ERROR should be returned.
I think it's fair to say that if we get CKR_GENERAL_ERROR we should try to reset the connection, no matter the cause, or bomb out if we've already tried to reset once.
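A rough sketch of that "reset once, then give up" policy, to make the proposal concrete. Everything named here (`OpError`, `Backend`, `with_one_reset`) is hypothetical and does not come from the Parsec code base; `OpError::General` just stands in for CKR_GENERAL_ERROR, and `reset` stands in for whatever re-establishing the PKCS11 connection ends up involving.

```rust
/// Hypothetical error type, reduced to what this sketch needs.
#[derive(Debug, PartialEq)]
pub enum OpError {
    /// Stands in for CKR_GENERAL_ERROR coming back from the token.
    General,
    /// Any other failure; passed through untouched.
    Other(String),
}

/// Hypothetical abstraction over the connection to the token.
pub trait Backend {
    /// Tear down and re-establish the connection (assumed to be possible).
    fn reset(&mut self) -> Result<(), OpError>;
}

/// Run `op`; on a general error, reset the backend once and run `op` again.
/// A second general error, or a failed reset, is returned to the caller.
pub fn with_one_reset<B, T, F>(backend: &mut B, mut op: F) -> Result<T, OpError>
where
    B: Backend,
    F: FnMut(&mut B) -> Result<T, OpError>,
{
    match op(backend) {
        Err(OpError::General) => {
            // First general error: try to recover exactly once.
            backend.reset()?;
            op(backend)
        }
        // Success and all other errors are returned as they are.
        other => other,
    }
}
```

The retry count is fixed at one on purpose: if the token is really gone, the second attempt fails straight away and the client still gets an error instead of the service looping.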
The issue
Disconnecting and reconnecting a pluggable PKCS11 token leads to the PKCS11 provider being inaccessible. To reproduce the issue:
parsec-tool, you'll get:
This is as expected.
This error is NOT expected. The service should continue to operate correctly in this case.
Solution
Some information is still missing and will require more investigation. I'm hoping to find a way to reproduce this using SoftHSM2.
The ideal solution would be for us to simply re-establish a functional connection to the hardware token when we detect that the token has been unplugged and plugged back in. The actual solution will depend on how reliably we can tell whether this has happened and on what options we identify for re-establishing that connection in a clean way.
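One possible shape for the detection side, as a sketch only. The variants below are named after real PKCS#11 return values (CKR_GENERAL_ERROR, CKR_DEVICE_REMOVED, CKR_TOKEN_NOT_PRESENT, CKR_SESSION_HANDLE_INVALID and so on), but the `Rv` and `Action` types, the `action_for` function, and above all the mapping itself are assumptions: which value a given token actually reports after being re-inserted is one of the things still to be investigated. The log above only shows CKR_GENERAL_ERROR for one USB HSM.

```rust
/// Subset of PKCS#11 return values relevant here (hypothetical Rust type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Rv {
    Ok,
    GeneralError,
    HostMemory,
    FunctionFailed,
    DeviceRemoved,
    TokenNotPresent,
    SessionHandleInvalid,
}

/// What the provider could do next (hypothetical).
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    /// Nothing to do, the call succeeded.
    Proceed,
    /// Repeat the same call; per the spec text, CKR_FUNCTION_FAILED may be transient.
    Retry,
    /// Assume the token went away and try to re-establish the connection.
    Reconnect,
    /// Report the error to the client without attempting recovery.
    Fail,
}

/// Assumed mapping from return value to recovery action.
pub fn action_for(rv: Rv) -> Action {
    match rv {
        Rv::Ok => Action::Proceed,
        Rv::FunctionFailed => Action::Retry,
        // Values that plausibly indicate an unplugged or replugged token.
        Rv::GeneralError
        | Rv::DeviceRemoved
        | Rv::TokenNotPresent
        | Rv::SessionHandleInvalid => Action::Reconnect,
        // Out of host memory is not something a reconnect would fix.
        Rv::HostMemory => Action::Fail,
    }
}
```

Keeping the mapping in one function means it can be adjusted in a single place once we know how real tokens and SoftHSM2 behave.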
Outstanding questions
This is a variant of the more generic approach discussed in #607