qmk / qmk_firmware

Open-source keyboard firmware for Atmel AVR and Arm USB families
https://qmk.fm
GNU General Public License v2.0
17.97k stars 38.64k forks source link

[Bug] keyboard stops responding after power cycles and found where goes wrong #9962

Open LSChyi opened 4 years ago

LSChyi commented 4 years ago

I was playing with the STM32F401CCU6, which is the blackpill development board. I found that the keyboard stops responding if I use that keyboard to wake the host computer. After adding some LEDs event for debugging, I found which code section the QMK firmware sticks and make a dump hotfix to verify the observation.

Describe the Bug

After putting the host computer into sleep, if I press any key from the keyboard to wake the host up, the keyboard can not send any key event to the host computer.

I can observe the QMK firmware is running the loop in the tmk_core/protocol/chibios/main.c to wait for the USB driver to leave the USB_SUSPENDED. However, even the host computer is up, the USB driver state is still USB_SUSPENDED, which causes the keyboard sticks in this loop forever and thus not sending any new key event. I make a dumb hotfix by stoping the USB driver, then re-start the USB driver, and the USB driver becomes USB_ACTIVE again, and the keyboard is back to work.

System Information

Additional Context

The diff of my dumb hotfix:

diff --git a/tmk_core/protocol/chibios/main.c b/tmk_core/protocol/chibios/main.c
index 7d32c16ed..bdac09223 100644
--- a/tmk_core/protocol/chibios/main.c
+++ b/tmk_core/protocol/chibios/main.c
@@ -229,6 +229,7 @@ int main(void) {
                 /* Remote wakeup */
                 if (suspend_wakeup_condition()) {
                     usbWakeupHost(&USB_DRIVER);
+                    restart(&USB_DRIVER);
                 }
             }
             /* Woken up */
diff --git a/tmk_core/protocol/chibios/usb_main.c b/tmk_core/protocol/chibios/usb_main.c
index 65bd291be..d27db391a 100644
--- a/tmk_core/protocol/chibios/usb_main.c
+++ b/tmk_core/protocol/chibios/usb_main.c
@@ -559,6 +559,14 @@ void init_usb_driver(USBDriver *usbp) {
     chVTObjectInit(&keyboard_idle_timer);
 }

+void restart(USBDriver *usbp) {
+    usbStop(usbp);
+    usbDisconnectBus(usbp);
+    wait_ms(1500);
+    usbStart(usbp, &usbcfg);
+    usbConnectBus(usbp);
+}
+
 /* ---------------------------------------------------------
  *                  Keyboard functions
  * ---------------------------------------------------------
diff --git a/tmk_core/protocol/chibios/usb_main.h b/tmk_core/protocol/chibios/usb_main.h
index 94baf9b35..6083fa2bd 100644
--- a/tmk_core/protocol/chibios/usb_main.h
+++ b/tmk_core/protocol/chibios/usb_main.h
@@ -34,6 +34,7 @@

 /* Initialize the USB driver and bus */
 void init_usb_driver(USBDriver *usbp);
+void restart(USBDriver *usbp);

 /* ---------------
  * Keyboard header
drashna commented 4 years ago

@fredizzimo @tzarc @zvecr Any insights on this? I'm asking because ZSA has seen similar reports of this, and I've experienced it myself, in the past

LSChyi commented 4 years ago

From my understanding, this bug seems to be caused by the Chibios, or more preciously, is caused by usbWakeupHost call. If I wake the host with other devices, the keyboard is able to detect the USB state change and thus leave the infinite loop. Only if I wake the device with the keyboard, I can see the keyboard finishes the usbWakeupHost call, but the USB state remains the same.

So I think either this is a bug from Chibios, or it is caused by misconfiguration of the MCU. Anyhow, the workaround works good, I never had that problem for 2 days.

yulei commented 4 years ago

a dirty hack was to modify the file lib/chibios/os/hal/ports/STM32/LLD/OTGv1/hal_usb_lld.c line 546

-  if (sts & GINTSTS_WKUPINT) {
+  if (sts & GINTSTS_WKUPINT || ((usbp->state == USB_SUSPENDED) && ((sts&GINTSTS_OEPINT) || (sts&GINTSTS_IEPINT)))) {

I patched this for Matrix noah and Matrix 2.0 Additional keyboard, it works well under Windows. but I still got report that this does not work on MAC OS. From what I know, WKUPINT was send from HOST to keyboard while the HOST wakeup from some other events. Seems the chibios USB OTG driver did not correctly handling the wakeup from self (keyboard) situation.

drashna commented 4 years ago

@yulei wouldn't a better hack be to just call usb_lld_wakeup_host instead of usbWakeHost, since all that does is check the state and then wake it up?

yulei commented 4 years ago

Actually usbWakeHost was just a wrapper for usb_lld_wakeup_host. The problem was that there was no spec for the host's behavior while it resumed by a remote wakeup. Some discuss here: https://www.microchip.com/forums/m492730.aspx

What observed on windows was that the WIN10 does not RESET or RESUME the keyboard, it just send control command to endpoint 0 after it was resumed by the keyboard.

So maybe the USB driver need to quit the suspend state while it received any signals as RESET, RESUME or IN/OUT ? Maybe can dig the codes from LUFA to get an answer :) The workaround by @LSChyi was great.

LSChyi commented 4 years ago

If the workaround is ok, I can make a PR.

tzarc commented 4 years ago

I'll have a look with my mac -- I don't usually put anything to sleep so I haven't really struck anything like this so far.

drashna commented 4 years ago

Yeah, getting a number of reports of this with the Moonlander, and some with the planck ez.

Which is up to date with QMK now.

a dirty hack was to modify the file lib/chibios/os/hal/ports/STM32/LLD/OTGv1/hal_usb_lld.c

Is definitely a dirty hack. I think it's a bad idea to modify the qmk chibiOS files, as it adds additional maintenance. Any changes like this should be made upstream.

drashna commented 4 years ago

a dirty hack was to modify the file lib/chibios/os/hal/ports/STM32/LLD/OTGv1/hal_usb_lld.c line 546

-  if (sts & GINTSTS_WKUPINT) {
+  if (sts & GINTSTS_WKUPINT || ((usbp->state == USB_SUSPENDED) && ((sts&GINTSTS_OEPINT) || (sts&GINTSTS_IEPINT)))) {

I patched this for Matrix noah and Matrix 2.0 Additional keyboard, it works well under Windows. but I still got report that this does not work on MAC OS. From what I know, WKUPINT was send from HOST to keyboard while the HOST wakeup from some other events. Seems the chibios USB OTG driver did not correctly handling the wakeup from self (keyboard) situation.

Just a heads up, this doesn't help unless otg is enabled for the board, it looks like. Eg. for many of the STM32F303 boards, this won't do anything to fix the issue.

However, the solution that @LSChyi posts does seem to work for said boards.

LSChyi commented 4 years ago

Ok, I can make a PR.

tzarc commented 4 years ago

Not striking this issue with F411 at all -- how frequently is this actually happening?

LSChyi commented 4 years ago

Almost every time and is reproducible (that how I found it is caused by the USB state not changed).

tzarc commented 4 years ago

Puzzling, guess I retry with F401.

tzarc commented 4 years ago

10 attempts with F401 black pill, zero instances of lockup after sleep.

LSChyi commented 4 years ago

I test it on a mac mini machine, where when I set the computer into sleep, it takes some time to tell USB devices into a suspended state, then the problem can be reproduced. Could it be the keyboard is not yet entering a suspended state?

LSChyi commented 4 years ago

@tzarc I make a video for it, you can have a look: https://www.youtube.com/watch?v=oWEjBHfVcPw&feature=youtu.be

tzarc commented 4 years ago

Perfect -- you were on the right track with pointing out that the USB hadn't entered suspend. I modified the code to turn on the LED on C13, and only attempted wakeup after it turns on -- it's definitely locking up after that.

fauxpark commented 4 years ago

Not seeing this issue on F303 (Proton-C). I'm on macOS 10.15.6. I'm verifying that the chip is indeed getting the suspend signal with the backlight feature - although apparently the backlight_set(0) call on suspend is only implemented on AVR, so I had to add that in at the start of suspend_power_down(). Nevertheless, when I plug it in, hit the Sleep button and wait a short while, the backlight LED turns off. If I then hit a key, it lights up again, my MBP wakes up and I can immediately type.

This also doesn't seem to be an issue on Windows (10 at least) at all.

LSChyi commented 4 years ago

Weird, @drashna says that

Yeah, getting a number of reports of this with the Moonlander, and some with the planck ez.

Which is up to date with QMK now.

a dirty hack was to modify the file lib/chibios/os/hal/ports/STM32/LLD/OTGv1/hal_usb_lld.c

Is definitely a dirty hack. I think it's a bad idea to modify the qmk chibiOS files, as it adds additional maintenance. Any changes like this should be made upstream.

, as far as I know, the Planck is using F303, so it should also have that issue.

fauxpark commented 4 years ago

Video proof: https://youtu.be/jO9ceiKaKfA Unless I'm misunderstanding what the issue is?

LSChyi commented 4 years ago

@fauxpark you're right, I also tested the black pill on my MBP, and I still have that problem. Could it be the MCU misconfiguration?

fauxpark commented 4 years ago

Considering, as you point out, the Planck, Moonlander and EZ are also using the F303, they all are using the exact same *conf.h files as the Proton-C, so I'm not sure how that could be possible.

vinorodrigues commented 3 years ago

I can confirm this situation is also true on a KBDFans KBD67 Mk.II V3 with it's AVR (ATmega32U4) on macOS 10.15 (Catalina).

Returning form sleep is functioning normally if done relatively quickly after the sleep event, but if woken from sleep after a long time the keyboard is unresponsive.

kpot commented 11 months ago

Three years later, I'm having the same issue with my Cantor keyboard (based on STM32F401), connected to a PC. After a reboot or a shutdown, the keyboard would become unresponsive until I manually unplug and re-plug it back again. And I'm using the most recent master branch of the project. I tested it on two very different PCs actually and found no difference in behavior.

This patch from @yulei (thanks!) has mostly solved the issue. My keyboard is still dead after a complete shutdown, but now at least it remains functioning after a reboot of my PC.

a dirty hack was to modify the file lib/chibios/os/hal/ports/STM32/LLD/OTGv1/hal_usb_lld.c line 546

-  if (sts & GINTSTS_WKUPINT) {
+  if (sts & GINTSTS_WKUPINT || ((usbp->state == USB_SUSPENDED) && ((sts&GINTSTS_OEPINT) || (sts&GINTSTS_IEPINT)))) {
drashna commented 11 months ago

A lot of this has been resolved, but we're still working on it. (specifically @KarlK90 has been).

kbladewht commented 8 months ago

waiting your help for the fix , so far restart pc is still not working for me

KarlK90 commented 8 months ago

@kbladewht you can try my PR https://github.com/qmk/qmk_firmware/pull/21656 and see if this fixes you issues

kbladewht commented 8 months ago

@kbladewht you can try my PR #21656 and see if this fixes you issues

Sound likes doing root fix here, I am testing this , thanks your sharing dude !

kbladewht commented 8 months ago

the hibernate and wake up always workd, but restarting PC is still not working .

00> changed 5 ---> 2

00> reset flag 1 00> changed 2 ---> 5 00> changed 5 ---> 2 00> changed 2 ---> 5 00> changed 5 ---> 2 00> changed 2 ---> 3 00> changed 3 ---> 4

kbladewht commented 8 months ago

Above is the log I could see for 00> USB_DRIVER.state

kjkent commented 8 months ago

I'm running QMK with QMK's LUFA bootloader fork on a reflashed Yang mod HHKB (ATmega32U4). Primarily seeing this issue when booting the machine. The UEFI first complains about no keyboard detected. If I replug the kb at this point, it works in the UEFI & bootloader, but once linux loads the kb needs replugging again to be responsive. Beforehand, the initramfs kernel log on my desktop shows a few lines to do with "bad USB descriptor error 50-something" -- I can provide the actual logs and other debug info if it's of use to anyone here.

Edit:: here's the relevant dmesg output:

[    5.560254] usb 1-1.4: new full-speed USB device number 4 using xhci_hcd
[    5.703602] usb 1-1.4: device descriptor read/64, error -32
[    5.883602] usb 1-1.4: device descriptor read/64, error -32
[    6.063587] usb 1-1.4: new full-speed USB device number 5 using xhci_hcd
[    6.136934] usb 1-1.4: device descriptor read/64, error -32
[    6.334051] usb 1-1.4: device descriptor read/all, error -32
[    6.926919] usb 1-1.4: new full-speed USB device number 6 using xhci_hcd
[    6.926985] usb 1-1.4: Device not responding to setup address.
[    7.133652] usb 1-1.4: Device not responding to setup address.
[    7.340256] usb 1-1.4: device not accepting address 6, error -71
[    7.413584] usb 1-1.4: new full-speed USB device number 7 using xhci_hcd
[    7.413655] usb 1-1.4: Device not responding to setup address.
[    7.620318] usb 1-1.4: Device not responding to setup address.
[    7.826917] usb 1-1.4: device not accepting address 7, error -71

I tried @KarlK90 's PR, unfortunately it didn't make the kb work from boot. It seems a permutation of @LSChyi 's patch is now in master albeit with a variable for USB_SUSPEND_WAKEUP_DELAY. I set #define USB_SUSPEND_WAKEUP_DELAY 1500 in config.h to see if it would help, but no dice.

Edit 2: Solved (for me) -- I can now boot without having to unplug and reconnect my keyboard after adding NO_USB_STARTUP_CHECK = yes to rules.mk.

Edit 3: Unfortunately still stuck. NO_USB_STARTUP_CHECK = yes allowed the matrix to function, so I could reset the kb that way, but it still doesn't enumerate properly or send keypresses until this is done.