eruption-project / eruption

Realtime RGB LED Driver for Linux
https://eruption-project.org/
GNU General Public License v3.0

Keyboard's LED interface spams KEY_UNKNOWN after Eruption updates the keyboard LEDs #226

Open Phen-Ro opened 11 months ago

Phen-Ro commented 11 months ago

Hello. In an attempt to avoid backtracking and rambling (editor's note: I failed), I'm going to relate this story chronologically as I experienced it rather than just summarizing the end results.

Many weeks ago I was tinkering with my system and had need to work on the virtual console. I noticed the cursor blinking faster than usual, and I was unable to log in with my password. While the system was thinking about the bad password input, it would spray ^@ symbols onto the console. I believe these are the representation of the null character. Worried that something was broken with my Roccat Vulcan 120 keyboard, I unplugged it and plugged in a different keyboard. The fast blinking stopped and I was able to log in as usual.

Through additional experimentation, I learned that this only happened when Eruption was running. I have been using the develop branch. I assumed some bug within Eruption and went about my life. This was on Kubuntu 23.04. Some time later, I switched to openSUSE Tumbleweed (partially to check if the grass is greener (it's not)). The same problem exists there, too. openSUSE has OpenRGB in its repos, so I installed that and gave it a whirl. Imagine my surprise when the same thing happened with OpenRGB, too!

Actually, it wasn't exactly the same. With OpenRGB, it would spam the a key rather than ^@. Why that is remains a mystery to me. I found an open issue from 8 months ago describing the problem here. One recommendation there is to use Eruption, but no follow-up was given. I'll be replying to that thread soon.

So both Eruption and OpenRGB show the same problem. You know what program doesn't have this problem? roccat-vulcan. I pored over every byte sent through hidraw in the three codebases and could not find any meaningful difference. The initialization sequences are identical, and the LED map submission (for the OpenRGB mode of "direct") is also the same. So wtf.

The byte sequences / HID reports in the three programs are the same. So what the heck is going on? I modified the eruption-debug-tool and eruption-hwutil programs to include an interactive mode that only sends each hidraw report after a keypress. If I had done this earlier, I would have learned that the problem doesn't come from the initialization sequence at all. It occurs only after the first LED map report, and only once that report has been completely sent.

Initially I was using the showkey program to help me diagnose this. Eventually I learned how to use evtest, which is much more useful. evtest showed me that this problem is always associated with the last event device associated with the keyboard. When listening to it after that final LED map report is sent, it continuously spews output like this:

Event: time xxx.xxx, -------------- SYN_REPORT ------------
Event: time xxx.xxx, type 1 (EV_KEY), code 240 (KEY_UNKNOWN), value 2
Event: time xxx.xxx, -------------- SYN_REPORT ------------
Event: time xxx.xxx, type 1 (EV_KEY), code 240 (KEY_UNKNOWN), value 2
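
For anyone who wants to watch for this flood without evtest, here is a minimal Rust sketch that decodes raw records from an event device and counts the ones matching the output above (EV_KEY, code 240, value 2, where 2 is the autorepeat value). It is not code from Eruption or the PR. It assumes the 64-bit Linux layout of struct input_event (24 bytes: a 16-byte timestamp, then type, code and value), and all function names are hypothetical.

```rust
use std::fs::File;
use std::io::{self, Read};

// Values from linux/input-event-codes.h.
const EV_KEY: u16 = 0x01;
const KEY_UNKNOWN: u16 = 240;
const VALUE_REPEAT: i32 = 2; // 0 = release, 1 = press, 2 = autorepeat

// ASSUMPTION: 64-bit Linux, where struct input_event is 24 bytes.
const EVENT_SIZE: usize = 24;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct InputEvent {
    pub kind: u16,
    pub code: u16,
    pub value: i32,
}

/// Decode one raw record, skipping the 16-byte timestamp at the front.
pub fn parse_event(raw: &[u8; EVENT_SIZE]) -> InputEvent {
    InputEvent {
        kind: u16::from_ne_bytes([raw[16], raw[17]]),
        code: u16::from_ne_bytes([raw[18], raw[19]]),
        value: i32::from_ne_bytes([raw[20], raw[21], raw[22], raw[23]]),
    }
}

/// True for the event that gets spammed in the evtest output above.
pub fn is_unknown_repeat(ev: &InputEvent) -> bool {
    ev.kind == EV_KEY && ev.code == KEY_UNKNOWN && ev.value == VALUE_REPEAT
}

/// Read `max_events` records from an event device (hypothetical helper)
/// and count the KEY_UNKNOWN autorepeats. Blocks until events arrive.
pub fn count_unknown_repeats(path: &str, max_events: usize) -> io::Result<usize> {
    let mut dev = File::open(path)?;
    let mut raw = [0u8; EVENT_SIZE];
    let mut hits = 0;
    for _ in 0..max_events {
        dev.read_exact(&mut raw)?;
        if is_unknown_repeat(&parse_event(&raw)) {
            hits += 1;
        }
    }
    Ok(hits)
}
```

Calling something like count_unknown_repeats("/dev/input/eventX", 100) on the affected device should return a count close to half of max_events while the flood is active, since each key event is followed by a SYN_REPORT. Reading event devices usually requires root or membership in the input group.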

With Eruption, it reports KEY_UNKNOWN. With OpenRGB, it's KEY_A, but with roccat-vulcan, the event device isn't even present! This was the Big Clue. roccat-vulcan works while the others don't, even though it uses the same byte sequences, because it uses hidapi's libusb backend rather than hidraw directly. A comment in its source code reads, "For LED device, use hidapi-libusb, since we need to disconnect it from the default kernel driver." I don't fully understand the difference, but I know this is the important part. I also noticed that while eruption's development branch uses the linux-native backend of hidapi, the master branch uses libusb too. The commit for this mentions "the hidraw backend still triggers bugs" - is this one of them?

Incidentally, I tried upgrading the hidapi dependency from your fork at rev a842fd6 to the latest from ruabmbua, which is what's published at crates.io. While this didn't change the bug behavior, it also didn't break anything, as far as I could tell. So you may want to consider going back to the crates.io version of the library.

Okay, so the problem stems from the hidapi backend. Well, I tried swapping the backends around, and the code did not like that for reasons I didn't have the energy to investigate. Instead, I chose to hammer at what exactly that evdev device was doing.

That last device is tied to USB interface 3 of the keyboard, which sends and receives reports about the LEDs on the Vulcan 1xx keyboard. Under normal conditions, if you're listening to it using evtest, you'll see it print events when you press the capslock or numlock keys, for example, since these update the LEDs. So there's no reason it ought to be spamming actual keystrokes. I forget exactly how I got there, but somehow I managed to stumble into the sysfs view of the device, most easily found at /sys/class/input/eventX/device, where X is the number of the last entry for the keyboard in evtest. There I found the inhibited file. In a fit of desperation, I wrote a "1" into it, and the evtest output ceased.

Then I wrote a "0" into it, expecting the evtest output to continue. It did not. I could still hit the capslock key and see its event, but the flood of KEY_UNKNOWN had finally stopped.
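
The same on/off toggle can be sketched in Rust roughly as follows. This is only an illustration of the two sysfs writes described above, not the code in the PR. The function names are hypothetical, and it assumes the process is allowed to write the inhibited attribute.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Build the sysfs path of the `inhibited` attribute for an event node
/// such as "event7" (the node name is whatever evtest reports).
pub fn inhibited_path(event_node: &str) -> PathBuf {
    Path::new("/sys/class/input")
        .join(event_node)
        .join("device")
        .join("inhibited")
}

/// Write a single value into an existing attribute file.
fn write_attr(attr: &Path, value: &str) -> io::Result<()> {
    let mut file = OpenOptions::new().write(true).open(attr)?;
    file.write_all(value.as_bytes())
}

/// Inhibit the input device, then immediately un-inhibit it again.
pub fn toggle_inhibited(attr: &Path) -> io::Result<()> {
    write_attr(attr, "1")?;
    write_attr(attr, "0")
}
```

A caller would run something like toggle_inhibited(&inhibited_path("eventX")) with the real event number filled in, and handle the permission error it returns when the attribute is not writable.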

At face value, this makes no sense. Toggling the inhibited flag back and forth should send the device back to its original state. That it does not convinces me that this is a bug waaay upstream somewhere within the udev/evdev/hidraw/something stack. That's not a battle I wish to fight, so I decided to pursue toggling the inhibited flag within Eruption code as a viable workaround for this bug.

Udev time! I'll cut the story short and say that after a sufficient quantity of head-to-wall impacts, I came up with the code in eruption-debug-tool/src/util.rs. Just flip that inhibited attribute on and off, and everything works fine. But only when done after the first LED map report, not before it. And it only works when run by root, not the eruption user. With a great deal more impacts, I was surprised to find out that this udev rule here of simply making the inhibited file world-writeable would let me set that attribute. Everything "works" "fine" after that. Yeah, this is a serious pile of hacks. But it works. And using setfacl did not work, only chmod'ing it did.

So this is the current state of that PR. I would not resent you for not merging it, or only merging part of it. The workaround is only applied to the Vulcan 1xx device, since that's the only one I own that has the problem. I also own the Elo 7.1 Air headphones, but they do not suffer from this, as they are not a keyboard device.

TL;DR, here's a summary of the problem itself: When using the linux-native backend of hidapi, the LED interface device spams KEY_UNKNOWN key events as soon as the initial LED map report is submitted.

And a summary of the changes:

X3n0m0rph59 commented 10 months ago

@Phen-Ro Thank you very much for this brilliant analysis and the great write up!

This will enable us to finally drop support for the legacy libusb backend that hampered compatibility with 3rd party tools which require low-level access to USB devices.