Nitrokey / opcard-rs

OpenPGP card implementation
49 stars 1 forks source link

Factory Reset fails on old gpg (2.2.21) #144

Open daringer opened 1 year ago

daringer commented 1 year ago

reproducer:

docker run -it --privileged -v /dev:/dev -v /sys:/sys -v /proc:/proc debian:10 \
  bash -c "apt-get update && apt-get install -y gnupg scdaemon pcscd; \
  echo -e 'admin\\nfactory-reset\\ny\\nyes' | gpg --command-fd=0 --card-edit " 

error:

...
...
gpg: Note: This command destroys all keys stored on the card!

sending card command SELECT AID failed: Bad secret key

gpg 2.2.21 is used in debian 10 and for HEADS, so even if this is an old (outdated) version we should take a look...

sosthene-nitrokey commented 1 year ago

This does reproduce the issue on my end, however it does not crash the device, I don't need to power cycle it to get it to work again. However it indeed does not appear to be actually resetting the device. Newer version of gpg don't have this issue.

Looking at the logs coming from a dev device do not give any insight as to what could be the issue, and I don't seem to find references to such an error message elsewhere.

szszszsz commented 1 year ago

FYI: you can check GnuPG logs with this config: https://github.com/Nitrokey/gnupg-docker/blob/test-suite/scdaemon.conf

sosthene-nitrokey commented 1 year ago

From the point of view of the key it looks like it reads the keys, some more data and then stops, despite no error being returned.

Returning data for tag Tag(122)
Returning data for tag: {self:?}
Returning res: Ok(())
Received APDU Command { class: Class { cla: 0, range: Interindustry(First) }, instruction: Unknown(202), p1: 0, p2: 249, data: [], le: 256, extended: true }
Executing command GetData(Even, Tag(f9))
Returning data for tag Tag(249)
Returning data for tag: {self:?}
Returning res: Ok(())
Received APDU Command { class: Class { cla: 0, range: Interindustry(First) }, instruction: Unknown(202), p1: 1, p2: 1, data: [], le: 1024, extended: true }
Executing command GetData(Even, Tag(101))
Returning data for tag Tag(257)
Returning data for tag: {self:?}
Returning res: Ok(())
Received APDU Command { class: Class { cla: 0, range: Interindustry(First) }, instruction: Unknown(202), p1: 1, p2: 2, data: [], le: 1024, extended: true }
Executing command GetData(Even, Tag(102))
Returning data for tag Tag(258)
Returning data for tag: {self:?}
Returning res: Ok(())
Received APDU Command { class: Class { cla: 0, range: Interindustry(First) }, instruction: GenerateAsymmetricKeyPair, p1: 129, p2: 0, data: [182, 0], le: 256, extended: true }
Executing command GenerateAsymmetricKeyPair(ReadTemplate)
Returning res: Ok(())
Received APDU Command { class: Class { cla: 0, range: Interindustry(First) }, instruction: GenerateAsymmetricKeyPair, p1: 129, p2: 0, data: [184, 0], le: 256, extended: true }
Executing command GenerateAsymmetricKeyPair(ReadTemplate)
Returning res: Ok(())
Received APDU Command { class: Class { cla: 0, range: Interindustry(First) }, instruction: GenerateAsymmetricKeyPair, p1: 129, p2: 0, data: [164, 0], le: 256, extended: true }
Executing command GenerateAsymmetricKeyPair(ReadTemplate)
Returning res: Ok(())
Received APDU Command { class: Class { cla: 0, range: Interindustry(First) }, instruction: Unknown(202), p1: 0, p2: 196, data: [], le: 256, extended: true }
Executing command GetData(Even, Tag(c4))
Returning data for tag Tag(196)
Returning data for tag: {self:?}
Returning res: Ok(())
Received APDU Command { class: Class { cla: 0, range: Interindustry(First) }, instruction: Unknown(202), p1: 0, p2: 122, data: [], le: 256, extended: true }
Executing command GetData(Even, Tag(7a))
Returning data for tag Tag(122)
Returning data for tag: {self:?}
Returning res: Ok(())

The logs from scdaemon and gpg-agent are not really helpful:


2023-04-19 12:08:25 gpg-agent[576] gpg-agent (GnuPG) 2.2.12 started
2023-04-19 12:08:27 gpg-agent[576] card has S/N: D2760001240103040000A0200D160000
2023-04-19 12:08:27 gpg-agent[576] card has S/N: D2760001240103040000A0200D160000
2023-04-19 12:08:26 scdaemon[578] sending signal 12 to client 576
2023-04-19 12:08:27 scdaemon[578] sending signal 12 to client 576
2023-04-19 12:08:27 scdaemon[578] DBG: Removal of a card: 0
2023-04-19 12:08:28 scdaemon[578] apdu_send_direct failed: Bad secret key
2023-04-19 12:08:28 scdaemon[578] sending signal 12 to client 576

It works correctly on a yubikey.

szszszsz commented 1 year ago

GnuPG should show raw APDU sent - check log level. Edit: did you use the config I've linked?

sosthene-nitrokey commented 1 year ago

You're right I was missing the first line.

logs.log

sosthene-nitrokey commented 1 year ago

The only suspicious thing I see is the apdu_disconnect, but I don't really understand where it's coming from, it could come from the first SELECT call being run by the factory reset command?

robin-nitrokey commented 1 year ago
2023-04-19 12:08:28 scdaemon[578] apdu_send_direct failed: Bad secret key

I’ve been digging through the GnuPG sources trying to interpret this error:

This line originates from cmd_apdu. The string is produced by calling gpg_strerror, but apdu_send_direct returns an ISO 7816 status code that should be formatted with apdu_strerror (to do: bug report). gpg_strerror(error) uses error & 127 as the error code, and “Bad secret key” has the error code 7. So apdu_send_direct potentially returned “Bad LC” (0x6a87) or “Host busy” (0x10007).

As we don’t see a log entry from apdu_send_direct, the error must be returned before this log statement and can’t be returned here. This is the only return statement that could return a matching error code: SW_HOST_BUSY from trylock_slot.

Comparing this with the current master, one obvious difference is that apdu_send_direct calls lock_slot instead of trylock_slot, i. e. it waits until the lock can be obtained. This has be changed in f808012ac with the comment: “With trylock_slot, it may return SW_HOST_BUSY. This may occur when apdu_get_status is called by scd_update_reader_status_file.”

From the logs, it is clear that this really occurs inside a apdu_get_status call, though I can’t tell whether it is triggered by scd_update_reader_status_file.

So the next question is why this happens with the Nitrokey 3 and not the Yubikey. Can you share a log from the Yubikey test case so that we can try to identify the differences?

sosthene-nitrokey commented 1 year ago

Successful log with a yubikey: scdaemon.log

robin-nitrokey commented 1 year ago

Ah, I think I was misled by the commit comment. The problematic call is not triggered by apdu_get_status, just while apdu_get_status is called. That’s probably also why the mutex is locked. I think the relevant log snippets are:

scdaemon.log (Yubikey):

2023-04-26 07:42:51 scdaemon[577] DBG: chan_7 <- SERIALNO undefined
2023-04-26 07:42:51 scdaemon[577] DBG: apdu_open_reader: BAI=30502
2023-04-26 07:42:51 scdaemon[577] DBG: apdu_open_reader: new device=30502
2023-04-26 07:42:51 scdaemon[577] DBG: ccid-driver: using CCID reader 0 (ID=1050:0407:X:0)
...
2023-04-26 07:42:51 scdaemon[577] reader slot 0: using ccid driver
2023-04-26 07:42:51 scdaemon[577] slot 0: ATR=3B FD 13 00 00 81 31 FE 15 80 73 C0 21 C0 57 59 75 62 69 4B 65 79 40
2023-04-26 07:42:51 scdaemon[577] DBG: enter: apdu_connect: slot=0
2023-04-26 07:42:51 scdaemon[577] DBG: leave: apdu_connect => sw=0x0
2023-04-26 07:42:51 scdaemon[577] DBG: chan_7 -> S SERIALNO FF7F00
2023-04-26 07:42:51 scdaemon[577] DBG: chan_7 -> OK
2023-04-26 07:42:51 scdaemon[577] DBG: enter: apdu_get_status: slot=0 hang=0
2023-04-26 07:42:51 scdaemon[577] DBG: ccid-driver: CCID submit transfer (83): 0
2023-04-26 07:42:51 scdaemon[577] DBG: leave: apdu_get_status => sw=0x0 status=7
2023-04-26 07:42:51 scdaemon[577] sending signal 12 to client 575
2023-04-26 07:42:51 scdaemon[577] DBG: chan_7 <- APDU 00A4040006D27600012401
...
2023-04-26 07:42:51 scdaemon[577] DBG: chan_7 -> [ 44 20 90 00 ]
2023-04-26 07:42:51 scdaemon[577] DBG: chan_7 -> OK

logs.log (NK3):

2023-04-19 14:23:34 scdaemon[578] DBG: chan_7 <- SERIALNO undefined
2023-04-19 14:23:34 scdaemon[578] DBG: apdu_open_reader: BAI=31b00
2023-04-19 14:23:34 scdaemon[578] DBG: apdu_open_reader: new device=31b00
2023-04-19 14:23:34 scdaemon[578] DBG: ccid-driver: using CCID reader 0 (ID=20A0:42B2:X:0)
...
2023-04-19 14:23:34 scdaemon[578] reader slot 0: using ccid driver
2023-04-19 14:23:34 scdaemon[578] slot 0: ATR=3B 8F 01 80 5D 4E 69 74 72 6F 6B 65 79 00 00 00 00 00 6A
2023-04-19 14:23:34 scdaemon[578] DBG: enter: apdu_connect: slot=0
2023-04-19 14:23:34 scdaemon[578] DBG: leave: apdu_connect => sw=0x0
2023-04-19 14:23:34 scdaemon[578] DBG: chan_7 -> S SERIALNO FF7F00
2023-04-19 14:23:34 scdaemon[578] DBG: chan_7 -> OK
2023-04-19 14:23:34 scdaemon[578] DBG: enter: apdu_get_status: slot=0 hang=0
2023-04-19 14:23:34 scdaemon[578] DBG: chan_7 <- APDU 00A4040006D27600012401
2023-04-19 14:23:34 scdaemon[578] apdu_send_direct failed: Bad secret key
2023-04-19 14:23:34 scdaemon[578] DBG: chan_7 -> ERR 65543 Bad secret key <Unspecified source>
2023-04-19 14:23:34 scdaemon[578] DBG: leave: apdu_get_status => sw=0x0 status=7
2023-04-19 14:23:34 scdaemon[578] sending signal 12 to client 576

To me this looks like a timing issue. The SERIALNO command triggers a apdu_get_status call after it completes. For the Nitrokey 3, this call is not finished when the APDU command arrives, causing the error. Maybe we can fix this by calling gpg --card-status (and maybe adding a delay) before the reset?

sosthene-nitrokey commented 1 year ago

To me this looks like a timing issue. The SERIALNO command triggers a apdu_get_status call after it completes. For the Nitrokey 3, this call is not finished when the APDU command arrives, causing the error. Maybe we can fix this by calling gpg --card-status (and maybe adding a delay) before the reset?

Doesn't look like it helps. I tried

gpg --card-status;\
sleep 3
echo -e 'admin\\nfactory-reset\\ny\\nyes' | gpg --command-fd=0 --card-edit

and

echo -e 'list\\nadmin\\nfactory-reset\\ny\\nyes' | gpg --command-fd=0 --card-edit

without success

sosthene-nitrokey commented 1 year ago

Maybe making the request just fast enough could work? I think a lot of optimization could be done, though it's probably a bit complex.

I tried with a firmware compiled with opt-level = 3 and it still didn't work

robin-nitrokey commented 1 year ago

I’ve submitted a bug report for the misleading log message: https://dev.gnupg.org/T6476

robin-nitrokey commented 1 year ago

To sum up our recent discussion on this topic:

  1. This specific issue can be avoided by manually sending the reset APDUs to the card with gpg-connect-agent (or just upgrading to a newer GnuPG version).
  2. We should try to find out what slows down the communication and try to improve it. This is not a high-priority issue though.
daringer commented 1 year ago

This particular one-line patch fixes the issue: https://dev.gnupg.org/rG3c3765405de02b9a57fdc9a3cf901f6e3aca8586

saper commented 7 months ago

No such device after gpg2 factory reset thread on the forum suggests one user managed to kill their OpenPGP applet completely. Could this be related?

daringer commented 7 months ago

So far I understand "kill" means here that the two step procedure failed to be executed by gpg - this might be related, @sosthene-nitrokey any ideas?

sosthene-nitrokey commented 7 months ago

I responded on the forum post with information on how to fix the case where the reset procedure failed after the TERMINATE DF step.

daringer commented 7 months ago

Yes, saw this - but do you think this is a distinct issue we see there or is this related to gpg's 2.2 weird factory reset behavior ?

sosthene-nitrokey commented 7 months ago

Regarding whether this is what caused the bug, I don't think it is. From my memory this bug made the factory reset fail before doing the TERMINATE DF.