kiibohd / controller

Kiibohd Controller
GNU General Public License v3.0

Infinity firmware resets to flash mode on wake up from sleep (OSX) #17

Closed rossipedia closed 9 years ago

rossipedia commented 9 years ago

Specs:

Late 2013 Retina MacBook Pro 13" OSX Yosemite 10.10.2

Whenever my laptop goes to sleep, the Infinity keyboard will not wake the laptop, and I have to use the internal keyboard or trackpad to wake it up. Once the computer has woken up, the Infinity board is unresponsive and eventually the orange flash-mode LED lights up, and I have to unplug and re-plug the keyboard to get it working again.

Not sure what else I can provide to help, but I'll be happy to provide some debug output if necessary.

smasher816 commented 9 years ago

Can confirm. I plugged my Infinity into my macbook pro 2 w/ yosemite, closed the lid for a while, opened it up, and the keyboard would not type; the orange light came on shortly after.

However, when I tried this again by letting the screen time out I was still able to wake the computer with the infinity and type like normal.

Anyways, there is some issue that is causing the chip to hang, forcing it to reboot into the bootloader.

haata commented 9 years ago

Yes, if the microcontroller ever goes into a locked up state I've set it so it will automatically go to the bootloader. The most common case of this is loading a corrupted file into the flash.

This problem may take a bit longer to fix because it's likely to do with the USB suspend/sleep features. (Anything that touches USB is a bit of a pain, because there are likely OS dependent issues)

haata commented 9 years ago

Interesting data point. Apparently something similar affects some Mac branded keyboards as well... Still, this problem sounds completely fixable (just a bit of a pain to debug).

mcm commented 9 years ago

I can confirm this issue as well. Is there any information we can collect for you to help with debugging?

haata commented 9 years ago

So, this is really hard to debug. Basically, with the current infinity keyboard it's not possible. Probably the most terrible part is that Mac OSX doesn't have usb wireshark support... (maybe someone good with VMs can get a Mac OSX vm on a Linux system so wireshark can be used).

It requires getting a McHCK, flashing it with the usb/uart debug mux enabled, then listening on a uart port. Alternatively, those with Teensy 3s or 3.1s could likely do the same thing.

If anyone with any of these dev platforms is up for some compiling and looking at logs, I can provide some instructions.

rossipedia commented 9 years ago

I'm not sure I understand why it's impossible? None of my ErgoDoxes had this problem, and they were based on the Teensy 2.0. Surely if it's possible on the 2.0, it should be possible on the 3.0?

Would looking at the ErgoDox firmware be of any help?

This unfortunately makes the keyboard not viable for my main use case as a mech board for my laptop.

haata commented 9 years ago

The USB stack on arm is quite different than what's on the Teensy 2.0s (AVR vs. ARM architecture).

I've also had to fix quite a few bugs in the pjrc ARM usb implementation (so I may have inadvertently created another issue). Reproducing the issue isn't really the problem, it's getting the sort of debug info required to isolate a fix (you can't rely on usb because that's where the problem is).

(This will get fixed, but I can't commit to timelines :P)

tmk commented 9 years ago

From

Once the computer has woken up, the Infinity board is unresponsive and eventually the orange flash-mode LED lights up

and

closed the lid for a while, opened it up, keyboard would not type and the orange light came on shortly later.

it seems the firmware somehow fails to handle the resume event, and currently it does almost nothing about suspend and resume. At the least we have to ignore those events and reinitialize the MCU's USB module in a way the Mac doesn't mind. I think we can refer to the Freescale USB library or the mbed.org code.

if the microcontroller ever goes into a locked up state I've set it so it will automatically go to the bootloader.

I skimmed the code, but I don't see how the keyboard enters bootloader mode once it has locked up. It looks like it goes into this loop on unhandled interrupts or faults, but I couldn't find the code that kicks off the bootloader.

https://github.com/kiibohd/controller/blob/9e3d3aaca42c15edd3701334e6d6a3ffed41c463/Lib/mk20dx.c#L63-L74

haata commented 9 years ago

This is the code that checks for the jump to bootloader (both the bootloader and firmware use this piece of code): https://github.com/kiibohd/controller/blob/9e3d3aaca42c15edd3701334e6d6a3ffed41c463/Lib/mk20dx.c#L461-L475

I think what may be happening is that Mac OSX is sending a signal of some sort that it's going to re-initialize the keyboard. Or it tries to re-initialize the keyboard, but fails. Perhaps the keyboard just had its power cut (low power mode) and locks up; I haven't tested this code much yet.

Then, when the usb is fully re-initialized it was already in the locked up state, so it jumps to the bootloader. Jumping to the bootloader is what I want to happen when the keyboard gets into an odd state (for example, loading corrupted firmware).

I strongly suspect the offending code is in this file: https://github.com/kiibohd/controller/blob/9e3d3aaca42c15edd3701334e6d6a3ffed41c463/Output/pjrcUSB/arm/usb_dev.c

tmk commented 9 years ago

Ah, I didn't know about the 'lockup' state on ARM.

The processor enters a lockup state if a fault occurs when executing the NMI or HardFault handlers. When the processor is in lockup state it does not execute any instructions. The processor remains in lockup state until either:

  • it is reset
  • an NMI occurs
  • it is halted by a debugger.

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0553a/BGBFCHGC.html

I didn't find a description in the datasheet, though on Kinetis the Lockup state probably causes a reset. As you said, Lockup is the primary suspect. The problem now is how the Lockup state occurs.

To start the bootloader, one of the following conditions must be met. Conditions other than Lockup are unlikely.

    if ( RCM_SRS0 & 0x40 || RCM_SRS0 & 0x20 || RCM_SRS1 & 0x02 || _app_rom == 0xffffffff ||
      memcmp( (uint8_t*)&VBAT, sys_reset_to_loader_magic, sizeof(sys_reset_to_loader_magic) ) == 0 ) // Check for soft reload

https://github.com/kiibohd/controller/blob/9e3d3aaca42c15edd3701334e6d6a3ffed41c463/Lib/mk20dx.c#L471-L472

Maybe this is a workaround? It goes against the author's intention, though :D

This code ignores the 'Lockup' reset.

    if ( RCM_SRS0 & 0x40 || RCM_SRS0 & 0x20 || _app_rom == 0xffffffff ||
      memcmp( (uint8_t*)&VBAT, sys_reset_to_loader_magic, sizeof(sys_reset_to_loader_magic) ) == 0 ) // Check for soft reload

haata commented 9 years ago

:P

That code runs in the bootloader so it's a bit hard to update for 99% of users. It also prevents someone from bricking their keyboard (which was possible to do on the version I gave you tmk/hasu).

But yeah, that'd be interesting to see if it fixes the issue. I'm sure there are more elegant ways though.

smasher816 commented 9 years ago

In this situation the reset button is not being pressed, the watch dog should have already been disabled at the start of the firmware, the firmware isn't corrupted, and the magic flag hasn't been set via the command line - all that leaves is the lockup code.
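The elimination above can be written out as a small classifier. This is only a sketch: the masks and the comparison against `0xffffffff` are copied from the check quoted earlier in this thread, and the thread confirms that `RCM_SRS1 & 0x02` is the lockup flag. The names for the two `RCM_SRS0` bits (reset pin, watchdog) are my reading of the Kinetis reference manual and should be treated as an assumption. The enum and both function names are hypothetical and do not exist in the firmware; register values are passed in as plain arguments so the logic can be exercised off-target.

```c
#include <stdbool.h>
#include <stdint.h>

/* Masks taken from the bootloader check quoted in this thread. */
#define SRS0_PIN       0x40u       /* assumed: external reset pin (reset button) */
#define SRS0_WDOG      0x20u       /* assumed: watchdog timeout */
#define SRS1_LOCKUP    0x02u       /* core lockup, per the discussion above */
#define APP_ROM_ERASED 0xffffffffu /* first word of application flash is blank */

/* Hypothetical enum: why the bootloader would stay resident. */
typedef enum {
    LOADER_NONE = 0,    /* no condition met, jump to the firmware */
    LOADER_RESET_PIN,
    LOADER_WATCHDOG,
    LOADER_LOCKUP,
    LOADER_NO_FIRMWARE,
    LOADER_MAGIC        /* soft reload requested via the magic value */
} loader_reason_t;

/* Returns the first matching condition, in the same order as the
 * original if-statement. magic_set stands in for the memcmp() against
 * sys_reset_to_loader_magic. */
loader_reason_t loader_reason(uint8_t srs0, uint8_t srs1,
                              uint32_t app_rom, bool magic_set)
{
    if (srs0 & SRS0_PIN)           return LOADER_RESET_PIN;
    if (srs0 & SRS0_WDOG)          return LOADER_WATCHDOG;
    if (srs1 & SRS1_LOCKUP)        return LOADER_LOCKUP;
    if (app_rom == APP_ROM_ERASED) return LOADER_NO_FIRMWARE;
    if (magic_set)                 return LOADER_MAGIC;
    return LOADER_NONE;
}

/* Human-readable label, e.g. for a UART debug line. */
const char *loader_reason_name(loader_reason_t reason)
{
    switch (reason) {
    case LOADER_RESET_PIN:   return "reset pin";
    case LOADER_WATCHDOG:    return "watchdog";
    case LOADER_LOCKUP:      return "core lockup";
    case LOADER_NO_FIRMWARE: return "no firmware";
    case LOADER_MAGIC:       return "soft reload";
    default:                 return "none";
    }
}
```

With the reset pin and watchdog bits clear, valid firmware in flash, and no magic value set, only `srs1 & 0x02` can make this return something other than `LOADER_NONE`, which is the same conclusion reached by elimination.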

If I had a bus pirate on me I would try out Hasu's bootloader patch with my mac. It would at least confirm whether or not the lockup code is the culprit. It would also be a bandage for the issue, as the keyboard would quickly reboot again and be in a valid state instead of being stuck in bootloader mode.

Another question for me is whether most lockups are due to random firmware bugs that a reset can fix, or whether they also come from permanent issues. If they also occur from a bad flash like Haata mentions, then it should definitely keep its current action of going to the bootloader.


Regarding the cause of the lockup, I was also leaning towards a low power or suspend issue. If the screen turns off, the keyboard will still work for a little bit and wake the computer, as it probably hasn't throttled down into deep sleep yet. After waiting longer, the keyboard will no longer wake the computer. At this point keys on the keyboard can be mashed and nothing noticeable will happen. It is only once the laptop is woken that the orange led turns on, after a few seconds. This happens regardless of whether or not keys were pressed during the suspend. I believe the keyboard should keep working while suspended, and thus must already be hanging. When the computer wakes, the keyboard is already hung, which puts it into the bootloader. This puts the issue in the on->suspend transition, not the suspend->wake one.

At first when I was playing with my multimeter it seemed like the power dropped for a second as the screen turned off but I think that was just me being unstable with my probes as I can't seem to reproduce it consistently. Measuring in the state when the keyboard no longer responds, it is still being supplied the 5V. I am not sure if the current is reduced low enough that the chip can no longer function as I can't find a way to wire it in series with the usb cable. I am not sure if backlit keyboards stay on during sleep, but if they do then that is probably not the issue as they are still being supplied with current. If they do turn off then it might be due to a lack of current in which case we need to handle the reset more gracefully.


Finally, for debugging, it appears there are some tools for usb issues on macs detailed in their usb development QA (https://developer.apple.com/library/mac/qa/qa1370/_index.html). These include IOUSBFamily and Usb Prober in the mac dev tools (https://developer.apple.com/downloads/index.action?q=IOUSB), which could be useful for us as they display status messages. There doesn't appear to be a way to sniff the packets without external hardware, which can be pricey. The QA mentions that an analyzer is sometimes needed even with software debugging for certain protocol errors, although I am not sure if linux is better in that regard.

Has anyone tested whether Windows or Linux laptops show similar issues when suspending? It would help to determine whether this is a firmware issue in the usb protocol handling or just something funky specific to OSX. If it happens on other OS's then it might also be easier to debug, as Linux has more utilities in that regard. If worst comes to worst, we can try debugging the firmware on another chip with the uart lines broken out so that debug messages can be logged, as Haata mentioned.

haata commented 9 years ago

So, I got ambitious tonight and tried to see if I could reproduce on my Mac Mini (Yosemite). Yep, I got it to freeze (though no bootloader led, likely because the keyboard was still powered).

Using my McHCK and a uart terminal, this is the last USB control packet that arrives on the device before freezing (cli is frozen). bmRequestType:0x2, bRequest:0x1, wValue:0x0, wIndex:0x81, len:0x0

Interestingly enough, there is also an unhandled USB control packet on keyboard init. bmRequestType:0x0, bRequest:0x3, wValue:0x1, wIndex:0x0, len:0x0

For posterity, I compiled with the usbMuxUart Output module (CMakeLists.txt) and uncommented the #define UART_DEBUG_UNKNOWN 1 in Output/pjrcUSB/arm/usb_dev.c

Before I pass out tonight I'll try to hunt down what these control packets are.
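For reference, the two control packets logged above can be decoded against the standard request codes in chapter 9 of the USB 2.0 specification. The sketch below is illustrative only: `usb_setup_t`, `usb_describe` and the helper functions are hypothetical names, not code from `usb_dev.c`, and only a handful of standard requests are covered.

```c
#include <stdint.h>
#include <stdio.h>

/* Fields of a USB SETUP packet, named as in the log lines above. */
typedef struct {
    uint8_t  bmRequestType;
    uint8_t  bRequest;
    uint16_t wValue;
    uint16_t wIndex;
    uint16_t wLength;
} usb_setup_t;

/* Recipient is bits 4..0 of bmRequestType. */
const char *usb_recipient(uint8_t recipient)
{
    switch (recipient) {
    case 0:  return "device";
    case 1:  return "interface";
    case 2:  return "endpoint";
    default: return "other";
    }
}

/* A few standard request codes from USB 2.0, table 9-4. */
const char *usb_std_request(uint8_t bRequest)
{
    switch (bRequest) {
    case 0:  return "GET_STATUS";
    case 1:  return "CLEAR_FEATURE";
    case 3:  return "SET_FEATURE";
    case 5:  return "SET_ADDRESS";
    case 6:  return "GET_DESCRIPTOR";
    case 9:  return "SET_CONFIGURATION";
    default: return "UNKNOWN_REQUEST";
    }
}

/* Standard feature selectors from USB 2.0, table 9-6. */
const char *usb_feature(uint8_t recipient, uint16_t wValue)
{
    if (recipient == 2 && wValue == 0) return "ENDPOINT_HALT";
    if (recipient == 0 && wValue == 1) return "DEVICE_REMOTE_WAKEUP";
    if (recipient == 0 && wValue == 2) return "TEST_MODE";
    return "UNKNOWN_FEATURE";
}

/* Writes a one-line description into buf; returns snprintf's result. */
int usb_describe(const usb_setup_t *s, char *buf, size_t len)
{
    uint8_t recipient = s->bmRequestType & 0x1Fu;

    /* Bits 6..5 select standard (0), class or vendor requests. */
    if ((s->bmRequestType & 0x60u) != 0)
        return snprintf(buf, len, "non-standard request 0x%02x",
                        (unsigned)s->bRequest);

    const char *req = usb_std_request(s->bRequest);
    if (s->bRequest == 1 || s->bRequest == 3) {
        const char *feat = usb_feature(recipient, s->wValue);
        if (recipient == 2) /* wIndex: bit 7 = direction, bits 3..0 = number */
            return snprintf(buf, len, "%s(%s) on endpoint %u %s", req, feat,
                            (unsigned)(s->wIndex & 0x0Fu),
                            (s->wIndex & 0x80u) ? "IN" : "OUT");
        return snprintf(buf, len, "%s(%s) on %s", req, feat,
                        usb_recipient(recipient));
    }
    return snprintf(buf, len, "%s to %s", req, usb_recipient(recipient));
}
```

Fed the packet seen just before the freeze (`bmRequestType:0x2, bRequest:0x1, wValue:0x0, wIndex:0x81`), this describes a CLEAR_FEATURE(ENDPOINT_HALT) aimed at IN endpoint 1. The unhandled packet at keyboard init (`bmRequestType:0x0, bRequest:0x3, wValue:0x1`) comes out as SET_FEATURE(DEVICE_REMOTE_WAKEUP) on the device.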

haata commented 9 years ago

Fixed! b2dfd63d4b4ecbde73a7a33e0c0353d85ed50250

Please test (re-open if necessary). I don't have a MacBook, so behaviour may be different. The fix was not ideal, but I'm not really using the CLEAR_FEATURE functionality right now (used to indicate microcontroller powersaving modes).

rossipedia commented 9 years ago

Awesome! Giving this a shot right now

rossipedia commented 9 years ago

Seems to have done the trick, thanks a bunch!

haata commented 9 years ago

I've opened another bug for waking up a sleeping computer.