Add support for client based device reset

secworks commented 3 months ago

In order to be able to switch device applications, we need a mechanism that allows the client to perform a reset of the connected TKey device. The idea is to have the USB handler, the CH552 MCU be able to receive a command from the client and notify the FPGA trough a separate channel than the serial link (UART) that a reset is requested.

We have a couple of unused connections between the CH552 and the FPGA. The idea is to commandeer one of these for this signalling.

The firmware also needs a way of identifying when a client reset has been made. It needs this signal to know if it's going to wait for a device app from the client app or boot something else.

Open questions

Should a client reset request should always be followed, or if it should be possible to be ignored? One idea is to ignore reset requests in FW-mode.
How do we signal status or a response back to the client that we have done the reset?

secworks commented 3 months ago

Created a branch for development: https://github.com/tillitis/tillitis-key1/tree/199-add-support-for-host-based-device-reset

secworks commented 3 months ago

Allocating the interface_rts signal, since it has the same intended direction as the interface_tx UART signal. And 'r' can be interpreted as reset.

secworks commented 3 months ago

First attempt at implementing the feature has been pushed to the branch and is ready for testing. The FPGA builds and should be stable (if the CH552 keeps the interface_rts signal low). And if pulled high should trigger a FPGA reset.

secworks commented 3 months ago

I've committed an updated implementation. Basically instead of pulling the reset directly at a client reset request, we instad trigger the reset loop to again be active. This will cause the reset to be asserted as it is when the bitststream has been loaded.

mchack-work commented 3 months ago

Joachim and I discussed the API. We decided to create a first version that overrides the use of what the memory map calls TK1_SWITCH_APP, which will probably change name, into a status register with more added modes besides application mode/firmware mode:

Power cycle happened
Client reset
Internal reset

In this way firmware (And apps! We need to discuss if this is what we want.) will know what kind of reset has happened and how to proceed.

secworks commented 3 months ago

Commit https://github.com/tillitis/tillitis-key1/commit/92a419e73a3f9404d3370f19dfd7638b4d45b717 now updates the design by moving the reset handling logic to the tk1 core. It can now detect external request, and trigger the clk_reset module to perform a reset. The tk1 core can store the reset status (or better the reason why a reset was triggered). This status can be read out via the tk1 API.

Note that this commit changes the behavior of the ADDR_SWITCH_APP register. When written it will switch mode just as before. But when read, it does not return the fw_app-mode in all 32 bits. Instead the reset-status is returned in bit 17..16, and the fw_app-mode in bit 0 only. So one can not determine the fw_app mode by just checking if the API register is zero or not. In a future update we should change the name of the API register. For example to ADDR_SYS_CTRL_STATUS.

cobratbq commented 2 months ago

How can I be sure that such a (client) reset is not performed unexpectedly/maliciously?
The applications that I've seen, all check for the running firmware/application at start, then assume it is the same for the duration of the session.
I see benefits during development and when debugging, but not while being used. I can imagine an (desktop) application loading one of several programs depending on the needed function, but doesn't this assume single-request/short-running uses? (As opposed to loading a program, then performing a number of related actions, all of which assume the same program and previous data is loaded.)
What other use-cases did you have in mind?
Is this going to provide a DoS attack vector?

Given lack of persistence, sensitive data will likely be sent on every use. Now that a reset is possible, is there always the uncertainty of whether the loaded program has changed?

At the very least, it would be very valuable to allow the loaded program to disable this functionality. (Or vice-versa, enable it for development/debugging purposes.)

secworks commented 2 months ago

Great comments. The main use case for this is the ability to switch apps, sessions without having to physically pull out and reinsert the TKey. But we need to think through the implications for this. And under what situations it can be accepted.

In the PR, the tk1 core includes a reset control. The purpose is to decide if a reset request should be allowed or not. It also tracks if a reset was caused by power up the device, or if it was due to a reset request. We can extend this with allowing an application to cause a reset. And could also add ability to block, deny client reset requests.

We won't merge this functionality until we know that it provides utility, is safe and works as we want it to. You feedback is very valuable for this.

cobratbq commented 2 months ago

Here are some thoughts/questions, I still had in mind: the use-case involves loading sensitive data, then repeatedly using this data for several requests, which each provide some input of their own. The application counts on the program's continued execution and having the correct data in memory.

API-based resets (as part of TKey program-binary) are perfectly fine for me.
Client-side resets would be fine, if the program can control (disabling or explicitly enabling) client-side resets.

My thoughts are in the direction of avoiding unnecessary interference:

with the running program, or
having running program swapped, or
blocking access through repeated resets.
unpredictable faulty requests, (due to switching programs)

Probably not a convenient suggestion, possibly overkill: you could prefix response frames with an "identifier", some random (read-only) value assigned by firmware. Upon changing, one knows there was a reset (or some change of program). Granted, in most cases you would discover as soon as requests/commands no longer work, although not the root cause. (It's probably not practical. I'm just putting it out there.)

Question: would this reset mean that the connection with the USB device is lost? (USB device rediscovered? Changes to USB device?) E.g. does a desktop application maintain the (exclusive) open socket, or lose its connection? This would be important in case of switching applications with a controlled reset.

mchack-work commented 2 months ago

Some context is needed here, I think.

Consider a number of products in a future product portfolio from Tillitis:

Unchanged TKey.
A TKey with HID USB as well as the current CDC.
A TKey HID with a pre-loaded app, probably for FIDO2 authentication without client app support.

(There might be more products. I want us to publish more about our plans, to make our intentions more apparent, but we're moving carefully here.)

Number 3 here will probably start the pre-loaded app directly from flash when given power. It will be immediately usable by a user without installing any Tillitis apps on a client.

Now consider an advanced user who has TKey Number 3. They want to load the tkey-device-signer for use with our SSH agent on their TKey. How will they do it?

One solution is to do a client based device resest. When such a request comes in, probably by sending something to a special USB node, the USB microcontrolelr asks the FPGA to reset in a special way and the CPU, when starting, knows that it shouldn't load the pre-loaded app from flash. Instead it should wait for the client to send a device app, as per usual.

Makes sense?

This might be solved in some other way, perhaps, but how? And what is the best way?

Currently, for instance, we don't have any way of detecting, say, a long press on the touch sensor and somehow hijacking that from hardware to do a reset. Not as far as I know, anyway. Possible to add?

The applications that I've seen, all check for the running firmware/application at start, then assume it is the same for the duration of the session.

At least the official applications using tkey-device-signer (tkey-ssh-agent, tkey-sign) use very short sessions. They:

Connect to the TKey over USB CDC.
Check for firmware.
Check for the correct device app (not very secure, I know, more below.)
(Load device app if it's firmware.)
Do their operation.
Disconnect.

Number 2 here is problematic in many ways.

The client app uses device app specific protocol calls to find what app it is. This would probably be better if standardized across apps. We've been glancing at Protobufs and the nanopb lib for this, but still haven't even done any experiments.
The response is not secure in any way. It's just a pair of name strings and a version.

If you're only worried about speaking to the same instance of the app as you were talking to before this could probably be solved with some form of cookie sent from the client at first start. I mean: add a call to the device protocol to accept some form of secret and another call to sign something using that secret on demand. This would prove to the client that we're still the same client. Might be a good solution?

The problem on how to identify to the client that it's the correct application on first start is harder, but is internally solved by the measured CDI and thus at least not leaking key material between device apps.

mchack-work commented 2 months ago

would this reset mean that the connection with the USB device is lost? E.g. does a desktop application that has an (exclusive) open socket, lose its connection? This would be important in case of switching applications with such a controlled reset.

Probably, yes, but you bring up a good point. We probably have to do even more stuff in the MCU firmware when a write comes to the special reset endpoint.

We haven't even started on the MCU firmware. Including not just another USB endpoint for reset but also HID as well as the current CDC will be... challenging.

cobratbq commented 2 months ago

Below my view, with some additional information.

Makes sense? [..] This might be solved in some other way, perhaps, but how? And what is the best way?

Sure. I see where you're going. Taking a very high-level perspective, you have: client-side + internal resets, internal resets (from within program), or no resets.

If you want to support a use-case where you cannot have physical access (to disconnect and reconnect the device), you either:

offer client-side resets (if you want absolute control), or
offer a request-byte for asking the loaded program for the reset.

However, (2.) would be a risk, given that you don't have physical access. Assuming a sound program or physical access, then (2.) wouldn't be a problem (IMO).

Currently, [..] we don't have any way of detecting, say, a long press [..] Possible to add?

I have little experience with hardware or embedded, but I can't think of anything. IIUC you cannot detect the touch itself, only that a touch was registered.

Check for the correct device app (not very secure, I know, more below.) [..] Number 2 here is problematic in many ways. [..]

The response is not secure in any way. It's just a pair of name strings and a version.

True. For what I need, I want to be sure of the triple (device, program-binary, USS), so I perform authentication. (I use this simple custom protocol, which seems to be enough for my case.) I use app-name and -version to check the API.

If you're only worried about speaking to the same instance [..]

Nah, it's not for that. My current prototype establishes a confidential session. I'm more worried that the random gibberish that it sends gets misinterpreted and the client won't know what's going on because of a sudden reset or app-swap. The problem with resets/app-swap is that it could potentially happen at any moment, and leaves the client guessing. That's why having control over the reset-capability would be very valuable. (Or at least, I assumed that there isn't an unmistakable sign. If the connection interrupts, then confusion isn't going to be a problem.)

The problem on how to identify to the client that it's the correct application on first start is harder, [..]

My current idea assumes physical access on first use, and then TOFU. (There's a bit more to it.)

secworks commented 2 months ago

I have little experience with hardware or embedded, but I can't think of anything. IIUC you cannot detect the touch itself, only that a touch was registered.

The current touch_sensor HW core only supports detection of an event. But the sensor is actually simply presenting a high signal when a touch is performed. It is the core itself that does the event-thing. That is detect a low->high transition ,set at status register for SW to read and acknowledge before being able to detect a new event. There is nothing stopping us from providing the (sampled) input to SW, which could detect a finger present for x time.

mchack-work commented 2 months ago

Changing the core is maybe a way forward. We didn't know that the touch sensor could handle this. This might mean we don't need client reset after all, but let's keep it here and think about it.

tillitis / tillitis-key1

Add support for client based device reset #199

Open questions