trezor / trezor-firmware

:lock: Trezor Firmware Monorepo
https://trezor.io
Other
1.35k stars 654 forks source link

Report availability of data connection #2340

Closed sime closed 2 years ago

sime commented 2 years ago

During the debugging of a Trezor that is not being recognised, users may connect a power only cable to the Trezor.

Is it technical feasible that the Trezor can distinguish if something is trying to 'talk' over the wire.

Potential use case: Trezor shows a 'no data connection' icon/label when a non-data friendly cable is connected.

Proof of concept

User Story

As a Trezor user I want to know when I have connected a non data cable to my Trezor So that I know my Trezor will not work with this cable

Accetpance criteria

Given I have a powerbank When I connect my Trezor to it Then I should see a 8x8 pixel blue (#0000FF) in the top left

Given I have power only USB data cable And it is connected to a computer When I connect my Trezor to it Then I should see a 8x8 pixel blue (#0000FF) in the top left

Given I have a USB-C hub And it is connected to a power source And no computer is connected to it When I connect Trezor to the hub Then I should see a 8x8 pixel blue (#0000FF) in the top left

Wireframe

broken_cable

matejcik commented 2 years ago

it would also be nice to detect "dying" cables somehow, but I don't know how exactly they behave. is it lost packets? unfinished data? ...etc

hiviah commented 2 years ago

USB packets have CRC and retransmission (except isochronous mode for real-time like audio), so data won't seem corrupted I'd think.

Few times I've seen something was that device looked fine at first, but when you started more power intensive operation like firmware update, it got stuck

matejcik commented 2 years ago

CRC and retransmission

could we get access to, e.g., a retransmission counter?

sime commented 2 years ago

Updated the issue description with acceptance criteria for proof of concept.

hiviah commented 2 years ago

I looked over complete map of USB-related registers in SVD definition, but there doesn't seem to be any for counting retransmits.

sime commented 2 years ago

@hiviah Could you point to a build where this can be tested?

cc: @Hannsek

hiviah commented 2 years ago

@sime This jobs has the FW built:

https://gitlab.com/satoshilabs/trezor/trezor-firmware/-/jobs/2933498706 https://gitlab.com/satoshilabs/trezor/trezor-firmware/-/jobs/2933498706/artifacts/download

(somehow the jobs were hidden before with review request)

Note that we decided to change the blue dot to a warning header so it is obvious to user.

matejcik commented 2 years ago

QA instructions:

matejcik commented 2 years ago

@hiviah do we have any way for a normal user to trigger the change in this parameter? i dunno, get a powered USB hub, leave it unplugged from PC, then plug into PC?

matejcik commented 2 years ago

(fixed via #2366)

hiviah commented 2 years ago

Yes, externally powered hub works to check for change.

Exactly as you describe, unplug USB cable from computer and "NO USB CONNECTION" will appear. Replug back into computer, the warning disappears.

QA note:

TychoVrahe commented 2 years ago

I get the warning even when USB is properly connected, until suite/trezorctl tries to interact with the device. Therefore i consider the warning misleading.

Also when drawing/removing the header warning, the whole homescreen is redrawed which causes display to blink, which is annoying.

hiviah commented 2 years ago

That shouldn't happen. The warning should go away when USB is in state "configured". Even with multiple chained USB hubs (like I have) that should be much less than a second. I tried it with various devices, with and without hubs.

See USB 2.0 specification state diagram in e.g. http://sdpha2.ucsd.edu/Lab_Equip_Manuals/usb_20.pdf page 240.

Can you capture USB packets via wireshark+usbmon to see when the device enters state from "attached" to "configured" on your machine? Or some other way like dmesg/oprofile or other kernel debug why it behaves this way? Because it most likely means that your USB controller does not adhere to specification.

Can you test it also on your phone or any other computers you have?

But I know that USB implementations are buggy a lot. Whole minefield. Does it happen when your machine is on battery or have other save-powering enabled?

Does lsusb remove the warning before using Suite or trezorctl? What distro/kernel are you using?

Also when drawing/removing the header warning, the whole homescreen is redrawed which causes display to blink

That is implementation-based, event causes redraw of homescreen that happens at most 1 per second I think. This part we can fix, fixing what USB controller thinks is another can of worms.

TychoVrahe commented 2 years ago

Turns out that the device enters 'configured' state, but after few seconds it enters 'suspended' state, if there is no communication - as per 9.1.1.6 of the referred document. So there is brief flash of the warning, than few seconds no warning, and then it show the warning again until communication starts from suite. Reading 7.1.7.6 though makes me feel that the suspension shouldn't happen.

tried to modify the configured condition like this (pdev->dev_state == USBD_STATE_CONFIGURED || (pdev->dev_old_state == USBD_STATE_CONFIGURED && pdev->dev_state == USBD_STATE_SUSPENDED))

which works, but fails to detect disconnection of the data line while staying powered on.

I am on Ubuntu 20.04.4 LTS, on battery or plugged in doesn't change anything, also using usb hub or not seems without effect.

Didn't try looking with wireshark etc yet, probably won't get to it until monday.

Hannsek commented 2 years ago

@hiviah It does not work for me as you described. When I unplug it from the computer but it is still connected to the powered usb hub –> the screen remains the same. Also when plugging it in to the computer, the screen goes crazy…

But when I plug it into the power brick directly, the NO USB CONNECTION sign is there.

hiviah commented 2 years ago

@TychoVrahe that is interesting, since I have exactly the same OS (Ubuntu 20.04 LTS), but probably different hardware, so we can guess the difference is in different USB controller, because I don't get device into suspended state.

@Hannsek that is really weird, what USB hub is it exactly? I looked if they can work as "NAT" (since USB is network, technically), technically possible, but also against spec. Could you make a short video where I can see the connection TT<->hub<->PC and how "TT screen goes crazy" looks?

Hannsek commented 2 years ago

@hiviah The usb hub is this one: Verbatim USB-C multiport hub USB 3.1 GEN 1/2xUSB 3.0/HDMI/SDHC/MicroSDHC/RJ45 (https://www.czc.cz/verbatim-usb-c-multiport-hub-usb-3-1-gen-1-2xusb-3-0-hdmi-sdhc-microsdhc-rj45/286850/produkt)

  1. When I plug the Trezor into the powered hub -> it won't turn on
  2. Plug the hub to the computer with the Trezor attached -> it turns on. After starting up circle, there is a flick of the screen. The flick is not consistent, once I can see NO USB CONNECTION, the second time, the home screen just flick.
  3. (When I now unplug the power chord from the usb hub but the Trezor is still connected to the computer via the hub, the Trezor restarts itself. Same behaviour is when I again plug the power chord into the hub.)
  4. When I unplug the powered hub with the Trezor attached from the computer, the screen sometimes change to NO USB CONNECTION, sometimes it will not change and there is still PIN NOT SET.

However, it looks like that the more I do it, the more the Trezor behaves as expected…

I took the videos with my mobile phone, everything looks nice there. But on the computer, it looks bad. I have no idea why. But I thing it is understandable.

https://user-images.githubusercontent.com/57008159/187017469-ec0f9043-1129-423f-b450-de714b54d8b5.mov

https://user-images.githubusercontent.com/57008159/187017490-2556bd64-f75f-46fa-932f-96b1857d12fc.mov

https://user-images.githubusercontent.com/57008159/187017548-cb1c584f-6cbf-4e40-b69e-8d253f0f4047.mov

hiviah commented 2 years ago

USB be wild. There exists a great tool for debugging this, unfortunately even if ordered today, we'd get it in December soonest - https://www.crowdsupply.com/great-scott-gadgets/luna (though we might buy it because USB debugging comes around periodically). I have GreatFET, but it doesn't support HighSpeed USB we use both in T1 and TT.

I think I might have the same (or at least very similar) hub like you mention stashed somewhere, I will try it with that.

The periodic refresh of screen shouldn't happen on "well-behaved" hub since it means it's changing device's USB state all the time.

bosomt commented 2 years ago

QA OK

if you remember any combination that was not tested and can/needs to be tested please ping me detail of test results: https://www.notion.so/satoshilabs/Report-availability-of-data-connection-2340-d22c407726f1426bb5b09780384ccbfc

Info:

TychoVrahe commented 2 years ago

On my PC, i found that the device enters suspended mode after 2s, which is if configured in this file: /sys/bus/usb/devices/3-4.1/power/autosuspend_delay_ms with value 2000. The 3-4.1 is some device id which can probably be different on other PCs. Setting this to -1 (as per https://www.kernel.org/doc/Documentation/usb/power-management.txt ) while the device is connected disables the suspension and the warning disappears. However after re-plugging the device it is again reset to 2000, which is some kind of default value. I don't know how ubuntu works with USB so i don't know where this settings originates and whether we (can) have some control over

hiviah commented 2 years ago

My notebook doesn't seem to support autosuspend even though supports_autosuspend shows 1.

Attempt to change autosuspend_delay_ms results in error.

Global setting for usbcore autosuspend is in /sys/module/usbcore/parameters/autosuspend and can be changed with options usbcore autosuspend=-1 in configuration file in /etc/modprobe.d, though docs says this might not work if usbcore is loaded from initramfs (requires rebuild of initramfs then after adding a param to grub kernel options).

A workaround for the initramfs complication is using rc.local to set the parameter by adding echo -1 >/sys/module/usbcore/parameters/autosuspend.

matejcik commented 2 years ago

of course this doesn't help us because the 2000ms autosuspend is Linux kernel default for systems that support it (which I expect to be a lot)

hiviah commented 2 years ago

Though it's weird, since I have relatively high end laptop (the highest Thinkpad L14 intel model) and it does not support USB autosuspend, although the hubs in sysfs say they do (?)

Not sure though what % of machines do support it then. Because there is also protocol to turn off some devices on USB hub, but that is also very rare to be implemented right. For example my hub instead of powering off seems to set it to suspend instead.

If I use uhubctl on TT, instead of power off it suspends it which is shown as the "NO USB CONNECTION".

Judging though based on number in search results of people asking how to turn off autosuspend it's probably common.

TychoVrahe commented 2 years ago

Given that:

I would say that the workaround with checking the previous state seems quite reasonable. We won't be able to detect broken cables that connect/disconnect randomly, but we will detect cables with no data connection, which is still quite useful.

hiviah commented 2 years ago

@TychoVrahe I agree, the workaround with checking previous state is the best option.

We generally want people to notice they plugged TT into charger or used cable without data pins right away which was original motivation.

I was thinking that we maybe could do periodic URB interrupt to prevent suspend (which I think TT does not really handle), but let's keep things simple, that would be overengineering.

bosomt commented 2 years ago

During current dev master testing we noticed that TT flashes multiple times after starting up. Please see video, looks like related to this change.

EDIT: definitely related to USB enabling/checking: it wont happen then device is PIN protected, display flickers only after you unlock it with PIN code.

https://user-images.githubusercontent.com/31506317/191268956-1d38518c-e539-4483-8e3f-8cd1f1b1af5f.mp4

Info:

hiviah commented 2 years ago

This happens if computer USB negotiation is slower than redraw of screen.

you may see the warning blink for a second after unlock/power on when connected to PC, because it takes a bit time for TT and computer to negotiate data USB connection

We could ask matejcik if it's possible to add extra timer so the check is delayed. Some reasonable value would be I'd say 3-5 seconds before the check is made.

TychoVrahe commented 2 years ago

i actually had something ready to mitigate this, made a PR: https://github.com/trezor/trezor-firmware/pull/2523

bosomt commented 2 years ago

QA OK

bosomt commented 2 years ago

Just little nitpick , i just noted that when you try to lock device by holding finger in middle of display but you will interrupt locking display is never redrawn and Hold to lock is visible below NO USB CONNECTION banner.

image

TychoVrahe commented 2 years ago

pushed a fix for that: https://github.com/trezor/trezor-firmware/pull/2523/commits/9cec8f057a28afb2f6d432cc90c0a586833b6670

bosomt commented 2 years ago

@TychoVrahe QA OK for fix in