qmk / qmk_firmware

Open-source keyboard firmware for Atmel AVR and Arm USB families
https://qmk.fm
GNU General Public License v2.0

Half of ergodox infinity "disconnects" #1076

Closed rrix closed 6 years ago

rrix commented 7 years ago

I'm not even sure how to go about debugging this, but since moving from kiibohd to qmk, my non-dominant ergodox infinity (the half that is chained, not the half that is connected to my computer) will occasionally "disconnect" silently: the display is still lit, but none of the keys on that half respond until I reset the entire unit by unplugging the dominant half from my PC and reattaching it.

I am running my own layout, which was last rebased against 2e2b9962cdc20e9f46dd0194f25a68ffa05e7d36, but this has been the case as long as I've used qmk with my infinity.

r2d2rogers commented 7 years ago

I have noticed this before on my two infinities. It almost completely disappeared when I swapped which half was the master. The odd thing is the "preferred" master is different between the two pairs. The set I'm on now works best with the left half as master, but at home it's the right that keeps them both awake.

r2d2rogers commented 7 years ago

Do you have the LCD screens active in the keymap you are using? I just rebuilt my keymap with "VISUALIZER_ENABLE = no", swapped master to the other side, and have not had the slave side become unresponsive for over an hour. Usually this happens reliably within 30 minutes. I will enable the LCD and see if I can get the disconnect to recur.

rrix commented 7 years ago

I have the LCDs enabled, but I have opaque acrylic covering them. I will try disabling the visualizer; that seems like a good idea regardless.

r2d2rogers commented 7 years ago

I have still seen the behavior in one of my two ergodox infinities. I hope to have time to debug it and see if I can figure out what is happening.

chadharrington commented 7 years ago

I was seeing this problem also. Swapping which half was plugged into the computer cleared it up for me (at least so far).

fredizzimo commented 7 years ago

Could someone test whether the same problem occurs with my TMK fork as well, or whether it's a QMK-specific issue? I haven't been using the QMK version myself yet (I've been waiting for my effect system to be ready), so I haven't noticed anything.

r2d2rogers commented 7 years ago

I can test tomorrow @fredizzimo, I still have your TMK fork cloned.

Do you suggest testing with specific branches, or just master?

fredizzimo commented 7 years ago

Either the master or the led branch should be fine. But it's possible that it's related to the LED support.

r2d2rogers commented 7 years ago

I seem to be unable to use MASTER=right, but my current theory is cable quality and length. This was slowly typed with a reversed layout.

r2d2rogers commented 7 years ago

I have been testing with two separate infinities, and the variable that best predicts the slave becoming unresponsive seems to be power related. A longer, somewhat cheaper USB cable shows the issue almost immediately, compared with a better cable or plugging into a powered USB hub. I also sometimes notice the LCD displaying "Suspending..." before the half goes unresponsive.

fredizzimo commented 7 years ago

@r2d2rogers, did you do those tests with my fork or QMK?

The fact that it goes into suspend mode is also interesting. Unless there's some really strange bug in the code, that shouldn't happen if the computer doesn't send a suspend event, and the slave relies on the master to tell it about the suspend. So it probably means that the computer detects power requirements that are too high and suspends the port, maybe just briefly.

The question here is why the slave doesn't wake up along with the master, as happens during a normal suspend. Maybe there's some timing issue if suspend and resume happen too close to each other. Cable length and quality could also affect the timing, so it's not necessarily an excessive power requirement that causes the issue.

Actually, after writing the above I did some Google searches, and it's more likely related to selective suspend: https://blogs.msdn.microsoft.com/usbcoreblog/2011/05/10/demystifying-usb-selective-suspend/. If the issue is indeed selective suspend, you should be able to turn it off (https://www.groovypost.com/howto/usb-selective-suspend-windows-explained/), at least temporarily for testing.
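
For anyone testing on Linux rather than Windows, the rough counterpart of selective suspend is USB autosuspend, which the kernel exposes per device through the sysfs file power/control ("auto" means the device may be autosuspended, "on" keeps it awake). The following is only a minimal, read-only sketch for checking those settings; the function name usb_power_settings is made up for illustration and is not part of QMK or any library, and the default path assumes a standard sysfs mount.

```python
from pathlib import Path


def usb_power_settings(sysfs_root="/sys/bus/usb/devices"):
    """Return {device_name: power/control value} for entries under sysfs_root.

    Hypothetical helper for illustration only. "auto" means the kernel may
    autosuspend the device; "on" means it is kept awake.
    """
    settings = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return settings  # not Linux, or sysfs is not mounted here
    for dev in sorted(root.iterdir()):
        control = dev / "power" / "control"
        try:
            settings[dev.name] = control.read_text().strip()
        except OSError:
            continue  # skip entries without a readable power/control file
    return settings


if __name__ == "__main__":
    for name, mode in usb_power_settings().items():
        note = "may autosuspend" if mode == "auto" else "kept awake"
        print(f"{name}: {mode} ({note})")
```

If the keyboard's entry shows "auto", writing "on" to that same file as root should keep the device awake for the duration of a test, which would help confirm or rule out autosuspend as the trigger.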

r2d2rogers commented 7 years ago

I was testing using both qmk and the led branch of your fork, on Windows and Linux. The tmk fork seems more prone to it, or perhaps the LED cycling in the led branch causes the power issue that leads to selective suspend occurring more often. When I see the suspend message, it doesn't occur on the master at all, and the slave will try to come back up but quickly goes back to suspend in the cases I've seen.

fredizzimo commented 7 years ago

Hm. Then it could mean that somehow the unconnected USB port on the slave gets the suspend signal. I have to check the ChibiOS code to see when and if that can happen.

Another option is that we have some memory corruption somewhere.

But there's also a third option: the suspend message actually occurs on the master, but it's too short to be noticeable. I'm currently betting on this case, and on selective suspend.

r2d2rogers commented 7 years ago

@fredizzimo, have you tried using the right half as master on your set? I'm playing with one of your branches (logo and default visualizer) and I still get the slave (left in this case) suspending with the shorter cable and unloaded USB hub. I had not seen the suspend happening through the day yesterday with the left as master. Also, I see some warnings in the compile when using MASTER=right.

rbasoalto commented 7 years ago

Same here. Left as master completely stopped the disconnects. It was disconnecting anywhere from 1 to 5 times per day with right as master.

fredizzimo commented 7 years ago

Is this still an issue with all the latest updates?

r2d2rogers commented 7 years ago

@fredizzimo, I have not seen this behavior since the weekend changes, and I have a set plugged in with right half as master and what I thought were the most suspect USB cables, those that triggered the behavior almost instantly before the fixes.

@rrix, have you had a chance to pull the latest changes to check?

r2d2rogers commented 7 years ago

I may have been too hasty, as I have observed the issue again. However, I wonder if the issue isn't the cables after all. I think there's a short in one or both of them. They are also the least solidly connected in the sockets. I will be putting other cables to the test to see if I can get the behavior to happen without them.

Edit: This does still occur with the current code. I'm working on getting debugging functional so I can try to trace the issue.

fionaguoguolu commented 7 years ago

I've been having the same issue as rrix since January 2017. It was on and off for 6 months, and now it has completely given up. Neither side can function as a master now, and the behavior is consistent on both Mac and Linux. I tried disabling selective suspend, but it didn't work. Any idea what caused this?

r2d2rogers commented 7 years ago

Everything I've experienced seems to point at cable quality. Some of my cables are good for the interconnect, and others from the same source are not. I also question the quality of the female connectors on the boards. I have settled into two setups that work; in each, the master works better on one side than the other, but the better side is opposite between the two ergodox infinity sets I have.

drashna commented 6 years ago

Is this still an issue?

rbasoalto commented 6 years ago

I've since swapped cables and the issue disappeared completely. I'm running an up-to-date(ish) fork, with only keymap and visualizer changes, for 8-10h/day and it hasn't happened again. So I'd say this is no longer an issue.

fredizzimo commented 6 years ago

I'm closing this, then.