Nitrokey / libnitrokey

Communicate with Nitrokey devices in a clean and easy manner
https://nitrokey.com/
GNU Lesser General Public License v3.0

Nitrokey hangs when switching between libnitrokey and GnuPG #137

Open robinkrahl opened 5 years ago

robinkrahl commented 5 years ago

When switching between libnitrokey and GnuPG, both my Nitrokey Pro and my Nitrokey Storage hang with the red LED staying on. Any GnuPG calls will hang or return an error message. Further libnitrokey calls fail too.

To reproduce the error, encrypt any file test.txt for the key stored on the Nitrokey. Connect the Nitrokey. Execute these commands:

$ gpg --decrypt test.txt.gpg
$ ./test
$ gpg --decrypt test.txt.gpg

The second gpg command fails. test is compiled from this C code:

#include <assert.h>
#include <libnitrokey/NK_C_API.h>

int main(void)
{
    assert(NK_login_auto() == 1);
    assert(NK_user_authenticate("123456", "temppw") == 0);
    assert(NK_get_totp_code_PIN(0, 0, 0, 0, "temppw") != 0);
    assert(NK_logout() == 0);
    return 0;
}

Error messages when calling GnuPG or libnitrokey again:

$ gpg --decrypt test.txt.gpg
gpg: encrypted with 2048-bit RSA key, ID 22F0BE10FDEE7057, created 2019-01-05
      "Demo Application <demo-application@ireas.org>"
gpg: public key decryption failed: Missing item in object
gpg: decryption failed: No secret key
$ ./test
[Sat Jan  5 19:09:27 2019][DEBUG]       -------------------
[Sat Jan  5 19:09:27 2019][DEBUG]       Outgoing HID packet:
[Sat Jan  5 19:09:27 2019][DEBUG]       Contents:
Command ID:     USER_AUTHENTICATE
CRC:    f1256af8
Payload:
 card_password: ***********
temporary_password:
74 65 6d 70 70 77 00 00 00 00 00 00 00 00 00 00   temppw..........
00 00 00 00 00 00 00 00 00 -- -- -- -- -- -- --   .........

[Sat Jan  5 19:09:27 2019][DEBUG_L1]    => USER_AUTHENTICATE
.
[Sat Jan  5 19:09:27 2019][DEBUG_L1]    <= USER_AUTHENTICATE 0 0
[Sat Jan  5 19:09:27 2019][DEBUG]       Incoming HID packet:
[Sat Jan  5 19:09:27 2019][DEBUG]       Device status:  0 OK
Command ID:     USER_AUTHENTICATE hex: e
Last command CRC:       f1256af8
Last command status:    4 STICK10::COMMAND_STATUS::WRONG_PASSWORD
CRC:    aeb1df9d
Payload:
Empty Payload.
[Sat Jan  5 19:09:27 2019][DEBUG_L1]    Throw: CommandFailedException
[Sat Jan  5 19:09:27 2019][DEBUG]       CommandFailedException, status: 4
test: test.c:8: main: Assertion `NK_user_authenticate("123456", "temppw") == 0' failed.
Aborted

The problem can be fixed by removing and reconnecting the Nitrokey device. Are you aware of this issue? Am I doing something wrong when calling libnitrokey? Can this problem be avoided?

d-e-s-o commented 5 years ago

Thanks for filing this issue, Robin. I've experienced that myself, but never produced a minimal example.

Am I doing something wrong when calling libnitrokey?

I honestly doubt that. Back in the day when nitrocli was using hidapi directly I already noticed this problem (but I gave the Nitrokey the benefit of the doubt and assumed it was my code). Would be good to get to the root of it, but I somehow suspect that root may be located somewhere in the firmware code.

szszszsz commented 5 years ago

Hi! I will have to check the firmware code yet, but I believe this is a result of the smart card lock via the CCID interface. Device cannot access the smart card locally (due to external usage lock), hence it returns the wrong password error (perhaps the error should be more meaningful). Smart card cannot be used in parallel by both routes, since it does not handle multiple contexts per design. Lock timeout is set to 60 seconds AFAIR. I plan to briefly execute your test case with this delay accounted, and see the result.

Edit: Perhaps there is an error somewhere in handling this lock, since further calls are failing as well.
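
To check the lock-timeout hypothesis, the reproduction steps from the first comment could be scripted with a configurable delay between the calls. This is only a sketch: the function name repro_with_delay and the GPG, NK_TEST and CIPHERTEXT variables are made up for illustration, ./test is the reproducer compiled above, and a gpg call that hangs instead of returning an error would block the script.

```shell
#!/bin/sh
# Hypothetical helper, not part of libnitrokey or GnuPG.
# The commands can be overridden through the environment.
GPG="${GPG:-gpg}"
NK_TEST="${NK_TEST:-./test}"
CIPHERTEXT="${CIPHERTEXT:-test.txt.gpg}"

# Run gpg, the libnitrokey reproducer, and gpg again, waiting "$1" seconds
# between the steps. The return code tells which step failed (1, 2 or 3).
repro_with_delay() {
    delay="$1"
    "$GPG" --decrypt "$CIPHERTEXT" >/dev/null ||
        { echo "delay=$delay: first gpg call failed"; return 1; }
    sleep "$delay"
    "$NK_TEST" ||
        { echo "delay=$delay: libnitrokey call failed"; return 2; }
    sleep "$delay"
    "$GPG" --decrypt "$CIPHERTEXT" >/dev/null ||
        { echo "delay=$delay: second gpg call failed"; return 3; }
    echo "delay=$delay: all three calls succeeded"
}
```

For example, `repro_with_delay 1` and `repro_with_delay 90` would cover a delay below and above the assumed 60-second timeout.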

szszszsz commented 5 years ago

cc @NKelias

robinkrahl commented 5 years ago

I tried executing the commands with at least 60 seconds delay, but the problem occurred nevertheless. By the way, the page Programming the Nitrokey only provides information for the Pro, Start and HSM. Are there any resources on debugging the Storage firmware? Would you just have to add an SD card to the Pro setup, or do you need a different development board?

szszszsz commented 5 years ago

I have confirmed the issue on Storage v0.53, with code from branch 137-stick_hangs. This might be related to (or the same as) https://github.com/Nitrokey/nitrokey-storage-firmware/issues/66. Environment: Fedora 29, GnuPG 2.2.11, Storage v0.53, OpenPGP card 2.1.

Edit: tested with delays: {1, 90}

szszszsz commented 5 years ago

@robinkrahl For debugging Nitrokey Storage you would need a development board, which we do not sell. Its hardware layout is available in the proper hardware repository. Please contact Jan in that matter. cc @jans23

szszszsz commented 5 years ago

It reproduces on Pro v0.7, OPC 2.1 (Fedora 29, GnuPG 2.2.11), in subsequent runs. On the Pro the failure occurs frequently, but not always (as it does on the Storage). Tested Pro v0.10, OPC 3.3 as well. pcscd is not executed.

Edit: add run log: 137-run.log

szszszsz commented 5 years ago

Crosslinked to https://github.com/Nitrokey/nitrokey-pro-firmware/issues/54

jans23 commented 5 years ago

The larger topic of parallel accesses is non-trivial. If you @robinkrahl would like to dig deeper and need a development board, please let me know.

robinkrahl commented 5 years ago

@jans23 Thanks! If I’m going to play with the firmware, I’ll start simple and have a look at the Pro. I was just curious because the homepage didn’t say anything about the Storage.

robinkrahl commented 5 years ago

Is there any news on this issue? It seems like every subsequent gnupg call fails after generating an OTP, no matter how long the delay, which is really annoying in daily usage.

szszszsz commented 5 years ago

@robinkrahl Sorry, but nothing new yet. @nkelias Could you take a brief look?

NKelias commented 5 years ago

It seems to me that the issue is on the GPG side of things, since it assumes a persistent session. This does not apply here, however, because the state has been altered in the background through the libnitrokey call.

As a workaround, the following works for me:

$ gpg --decrypt test.txt.gpg
$ ./test
$ gpgconf --kill gpg-agent
$ gpg --decrypt test.txt.gpg

This makes GPG drop its assumed current session state and start a new one. I'd have to dig deeper into the protocol on how to prevent the error state you're seeing.
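
If the workaround is needed regularly, it could be wrapped in a small shell function. This is a sketch under assumptions: with_gpg_reset and the GPGCONF override are names invented here, and the only GnuPG call used is the `gpgconf --kill gpg-agent` shown above.

```shell
#!/bin/sh
# Hypothetical wrapper, not part of libnitrokey or GnuPG.
# GPGCONF can be overridden; it defaults to the gpgconf binary.
GPGCONF="${GPGCONF:-gpgconf}"

# Run a libnitrokey-based command, then make GnuPG drop its cached session
# so that the next gpg call starts a new one.
with_gpg_reset() {
    status=0
    "$@" || status=$?
    # gpg-agent is started again on demand by the next gpg call.
    "$GPGCONF" --kill gpg-agent ||
        echo "warning: could not kill gpg-agent" >&2
    # Report the exit status of the wrapped command, not of gpgconf.
    return "$status"
}
```

With this in place, the sequence above would become `with_gpg_reset ./test` followed by `gpg --decrypt test.txt.gpg`.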

NKelias commented 5 years ago

I tried executing the commands with at least 60 seconds delay, but the problem occurred nevertheless. By the way, the page Programming the Nitrokey only provides information for the Pro, Start and HSM. Are there any resources on debugging the Storage firmware? Would you just have to add an SD card to the Pro setup, or do you need a different development board?

For some very rudimentary printout-debugging, you could also use this bit: [1]

#define FILEIO_DEBUG_FILE "debug.txt"

u8 WriteStrToDebugFile (u8 *String_pu8)
{
    SD_UncryptedFileIO_Init_u8 ();
    FileIO_AppendText_u8((u8 *) FILEIO_DEBUG_FILE, String_pu8);
    file_flush();
    return (TRUE);
}

This allows you to write text output from the firmware to a file on the SD card. Though in general as mentioned above, a DevBoard and a debugger are the recommended way of doing it.

[1] https://github.com/NKelias/nitrokey-storage-firmware/blob/d77891292322e02009e99918f4d2c9f0ccce91d0/src/USER_INTERFACE/file_io.c#L379-L388

d-e-s-o commented 4 years ago

It's been a year since the last update. Where are we with this issue?

It seems rather unfavorable for your product to have such a blatant incompatibility with what is arguably one of the most widely used pieces of privacy-related software in circulation.

szszszsz commented 4 years ago

@d-e-s-o Indeed, this is taking too long. Thanks for bumping this up. The problem lies within the parallel smart card access from both internal and external sources AFAIR. I will keep in mind to allocate some resources to that this month.

d-e-s-o commented 4 years ago

It seems like every subsequent gnupg call fails after generating an OTP, no matter how long the delay, which is really annoying in daily usage.

Is this issue understood? Has it been identified as indeed falling under this bug's umbrella or is that a separate problem?

d-e-s-o commented 3 years ago

My suspicion is that we may have to add support for Nitrokey devices to the application switching code in scdaemon. If I am understanding correctly, switching of that sort is covered by the ISO standard for smart cards. The Geldkarte certainly managed to do that. That, or perhaps there is some generic path (that I missed) that would do that already but the Nitrokey doesn't support the operation (I played around with sending an UNLOCK to it and, if I remember correctly, it didn't work properly, though I don't recall details anymore).

karthanistyr commented 3 years ago

@d-e-s-o @szszszsz Hi, maybe this is a naive comment, but Thunderbird 78 interacts with GnuPG as a fallback encryption backend, and does it by way of GPGME https://www.gnupg.org/software/gpgme/index.html . Maintainers of GnuPG seem to encourage applications to interact via GPGME.

Could this be a path forward? I have not delved much deeper than a short internet browsing, and maybe GPGME is too limited with the level of service that is expected from libnitrokey.

d-e-s-o commented 3 years ago

I doubt it. GPGME just interfaces with GnuPG from what I understand ("Currently it uses GnuPG's OpenPGP backend as the default, but the API isn't restricted to this engine."). Given that the issue at hand is an incompatibility (for lack of a better term) between libnitrokey's way of accessing a smart card and GnuPG's, I don't see anything gained. If you are suggesting that libnitrokey use GPGME, I think the kind of functionality needed is completely different.

richard-sr commented 2 years ago

Environment: Archlinux x64 with Linux Kernel 5.10.71-1-lts, GnuPG 2.2.29, Nitrokey-App 1.4.2 and Nitrokey Pro 2 Firmware 0.11.

I just got started with the Nitrokey and ran into this issue when working with password manager via Nitrokey-app and GPG decryption. Running "gpgconf --kill gpg-agent" before and after running nitrokey-app while trying not to trigger anything with GPG seems to avoid a lock-up of the system.

I think the provided error message "Device is locked/had timeout. Re-connect device..." makes users think that something unexpected happened although this is a systematic issue. The issue will immediately become active just after reconnecting if the user continues to use the app with GPG in parallel. Instead, the user should be properly instructed that the user cannot use GPG and any nitrokey-app at the same time and has to apply a good workaround that is required by their particular system. The re-connection is probably the most generic workaround, but it is rather inadequate. I think everyone who uses the Nitrokey as intended will run into this issue and given the age of the issue the information flow is quite unsatisfactory.

tlaurion commented 4 months ago

@szszszsz @sosthene-nitrokey ping? https://github.com/Nitrokey/nitrokey-pro-firmware/issues/54 https://github.com/Nitrokey/heads/issues/48

robin-nitrokey commented 4 months ago

AFAIS https://github.com/Nitrokey/heads/issues/48 is about the Nitrokey 3, but libnitrokey is only used with Nitrokey Pro or Storage. Can you add more details please?

tlaurion commented 4 months ago

I will do another round of tests on real hardware with Heads outputting more debug around the gpg/hotp-verification related calls.

The ping was based on the behavior observed in those issues, and directly on notes under the seal-hotp script workarounds pointing to those unresolved issues, whether firmware-, library- or gpg-based. If any of those tools expects exclusive access, the behavior will stay. This would mean that Heads has to kill the culprit expecting exclusive access, which is a problem, to say the least.

robin-nitrokey commented 4 months ago

The particular problem in this issue was caused by the fact that the Nitrokey Pro and Storage internally used the OpenPGP smartcard to implement the (H)OTP feature. As GnuPG caches some information about the state of the smartcard, this could lead to problems. This is no longer the case for the Nitrokey 3, so this kind of problem should no longer occur.

There still is the problem that GnuPG requires exclusive access to the CCID smartcard. But this should not cause any conflicts with libnitrokey as it does not use CCID. In case any other library or tool wants to access the CCID smartcard via pcsc, scdaemon probably needs to be stopped or restarted before that.
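
Stopping scdaemon before another tool touches the CCID interface could look roughly like this. It is a sketch, not a tested recipe: without_scdaemon and the GPGCONF override are invented names, and the wrapped command stands for whatever pcsc-based tool is meant. The only GnuPG call used is `gpgconf --kill scdaemon`.

```shell
#!/bin/sh
# Hypothetical wrapper, not part of GnuPG.
# GPGCONF can be overridden; it defaults to the gpgconf binary.
GPGCONF="${GPGCONF:-gpgconf}"

# Stop scdaemon so that it releases the CCID smartcard, then run the given
# command. The command is not run if scdaemon could not be stopped.
without_scdaemon() {
    "$GPGCONF" --kill scdaemon || return 1
    "$@"
}
```

GnuPG starts scdaemon again on its own the next time gpg needs the card, so no explicit restart step is included.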

tlaurion commented 4 months ago

AFAIS https://github.com/Nitrokey/heads/issues/48 is about the Nitrokey 3, but libnitrokey is only used with Nitrokey Pro or Storage. Can you add more details please?

@robin-nitrokey also, we are not talking specifically about the NK3 here; maybe the issue title is misleading. Heads abstracts away, as much as possible, whether the USB dongle is a Librem Key/Nitrokey Pro, Nitrokey Storage or NK3. If there are differences, Heads expects the firmware/hotp-verification provided tools to do the right thing here. After all, Heads is a downstream user of those tools and in any case, if those issues exist, it's a problem.

The user should not be expected to have to remove/reinsert the dongle between signing and HOTP interactions with the dongle. In my use case, signing commits and then launching qemu to test such code almost always resulted in the dongle not behaving properly on the HOTP check, and digging down and fixing timings under Heads driver loading didn't fix those other issues. Something is going wrong here; I'm not sure which tool is to blame, or whether the firmware needs to release some locks after said operation, issue a reset, or apply some cleaner solution.

Worst case scenario, Heads will need to ask the user to remove the USB dongle after an operation, which I hope won't be the solution here. This won't resolve development cycle operations.

As you may know, development cycles for Heads under QubesOS require passing the USB dongle from sys-usb to a testing qube so that qemu can take ownership of the device and do the HOTP and signing operations there. This always goes wrong after signing commits on the "host", that host being the testing qube that launches qemu after a signing operation. Exclusive access to the dongle by the host? When qemu launches the built ROM with signed code after a signing op, the device is always detected properly (USB ID detected, but talking to the device always results in kernel error -32). And here starts a dance of removing the dongle and passing it to the qube again; timing issues make the device almost always unavailable, requiring a qube restart, a sys-usb restart and sometimes a reboot of the development machine altogether, with a pretty low success rate.

Those open issues made me realize that maybe neither QubesOS, sys-usb, qubes-usb-proxy, qemu nor scdaemon was to blame here. If hotp/gpg expects exclusive access or the firmware is locked into doing one or the other type of operation, we still have a problem.

Should the Heads qemu launching code kill scdaemon and the whole gpg toolstack on the host prior to launching qemu? Preliminary tests show this doesn't fix the problem.

So the question is: where does the problem stem from? Those issues still being open (unfixed) seems to point to some shared root causes.

If needed, I can open another issue. Will test other hypothesis and add debug traces in both oem-factory-reset and around all gpg/HOTP related codepaths under Heads meanwhile.

The issue also shows up outside of development cycles. On real hardware, the dongle is not always in a state that permits interaction, which reduces reproduction complexity.

So I'll add debug code and report what is happening from debug traces collected on real hardware, so that neither QubesOS nor qemu can be blamed here.

tlaurion commented 4 months ago

The particular problem in this issue was caused by the fact that the Nitrokey Pro and Storage internally used the OpenPGP smartcard to implement the (H)OTP feature. As GnuPG caches some information about the state of the smartcard, this could lead to problems. This is no longer the case for the Nitrokey 3, so this kind of problem should no longer occur.

There still is the problem that GnuPG requires exclusive access to the CCID smartcard. But this should not cause any conflicts with libnitrokey as it does not use CCID. In case any other library or tool wants to access the CCID smartcard via pcsc, scdaemon probably needs to be stopped or restarted before that.

@robin-nitrokey Thanks for clarifying. As the devices shipped nowadays are Librem Keys (Nitrokey Pro 2 equivalent) and NK3, with the Nitrokey Storage no longer being sold and its firmware not being maintained, I'll leave troubleshooting and supporting legacy devices (and the ones in the field for further testing) to your discretion, and will decide whether this needs further fixing or more adaptation under Heads, if needed per user bug reports.

@JonathonHall-Purism tagging you here since Librem Keys are definitely still affected by this and should probably be tested more, to see whether this is still relevant after https://github.com/linuxboot/heads/pull/1638 was merged and whether that is enough.