atar-axis / xpadneo

Advanced Linux Driver for Xbox One Wireless Controller (shipped with Xbox One S)
https://atar-axis.github.io/xpadneo/
GNU General Public License v3.0

Xbox Elite Series 2 intermittently but consistently locks up #247

Closed solystm closed 3 years ago

solystm commented 3 years ago

Version of xpadneo

v0.8 (I also have xow 0.5-17-gaf5b9d4 installed, but the behavior is the same with xpadneo alone, xow alone, and xpadneo with xow)

Severity / Impact

It completely prevents the controller from being used over Bluetooth for longer than somewhere between 5 minutes and an hour, depending on factors I haven't identified. After that period it locks up and needs to be powered off and then reconnected. Games generally need to be restarted in order to re-detect it.

Describe the bug

The controller connects fine and works well: it reports battery and all buttons work, even the underpad triggers (though that may be xow's doing?). I start playing a game (tested primarily with Trials of Mana and Streets of Rage 4) and it's detected and works as expected in the game. However, after somewhere between 5 minutes (at the earliest) and about an hour (at the latest), the controller will lock up and stop working until it's powered off (hold the Xbox button) and then powered back on again.

By "lock up" I mean the controller's input 'sticks' on whatever it was doing when the bug occurred. So, for example, if I was holding right on the left analog, it would continue holding right even if I let go entirely. Other buttons cease to register at this time. For example, pressing ABXY gets no input in game or in a controller calibration utility, when both work fine normally. The controller calibration utility will report the last state (for example, the left analog being held to the right) and no longer update. The controller stays connected to Bluetooth from what I can see.

Steps to Reproduce

Connect up, start a game, and play. Eventually, it will drop the connection. With that said, it seems to be associated more with left analog stick inputs. It's never actually locked up while the left analog was neutral. That includes an extended session of Streets of Rage 4 only using the dpad, but when I switched to the left analog it locked up pretty much immediately. Given the wide range in the amount of time that passes it's difficult to say if it's related but it sure seems that way.

Expected behavior

It should continue to work normally without doing that :)

Screenshots/Gifs

I could take a video of me giving the pad inputs while nothing is registered by a game/controller calibration tool, but I'm not sure that would be terribly useful.

System information

# uname -a
Linux localhost.localdomain 5.8.15-201.fc32.x86_64 #1 SMP Thu Oct 15 15:56:44 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
(Also reproduced the same error with the same behavior from a Dell XPS13 on Fedora 32, was on the same kernel as my main computer when it was reproduced)
# xxd -c20 -g1 /sys/module/hid_xpadneo/drivers/hid:xpadneo/0005:045E:*/report_descriptor | tee >(cksum)
00000000: 05 01 09 05 a1 01 85 01 09 01 a1 00 09 30 09 31 15 00 27 ff  .............0.1..'.
00000014: ff 00 00 95 02 75 10 81 02 c0 09 01 a1 00 09 33 09 34 15 00  .....u.........3.4..
00000028: 27 ff ff 00 00 95 02 75 10 81 02 c0 05 01 09 32 15 00 26 ff  '......u.......2..&.
0000003c: 03 95 01 75 0a 81 02 15 00 25 00 75 06 95 01 81 03 05 01 09  ...u.....%.u........
00000050: 35 15 00 26 ff 03 95 01 75 0a 81 02 15 00 25 00 75 06 95 01  5..&....u.....%.u...
00000064: 81 03 05 01 09 39 15 01 25 08 35 00 46 3b 01 66 14 00 75 04  .....9..%.5.F;.f..u.
00000078: 95 01 81 42 75 04 95 01 15 00 25 00 35 00 45 00 65 00 81 03  ...Bu.....%.5.E.e...
0000008c: 05 09 19 01 29 0b 15 00 25 01 75 01 95 0b 81 02 15 00 25 00  ....)...%.u.......%.
000000a0: 75 01 95 05 81 03 05 0c 0a 24 02 15 00 25 01 95 01 75 01 81  u........$...%...u..
000000b4: 02 15 00 25 00 75 07 95 01 81 03 05 0c 09 01 a1 01 0a 81 00  ...%.u..............
000000c8: 15 00 26 ff 00 95 01 75 04 81 02 15 00 25 00 95 01 75 04 81  ..&....u.....%...u..
000000dc: 03 0a 84 00 15 00 26 ff 00 95 01 75 04 81 02 15 00 25 00 95  ......&....u.....%..
000000f0: 01 75 04 81 03 0a 85 00 15 00 26 ff 00 95 01 75 08 81 02 0a  .u........&....u....
00000104: 99 00 15 00 26 ff 00 95 01 75 04 81 02 15 00 25 00 95 01 75  ....&....u.....%...u
00000118: 04 81 03 0a 9e 00 15 00 26 ff 00 95 01 75 08 81 02 0a a1 00  ........&....u......
0000012c: 15 00 26 ff 00 95 01 75 08 81 02 0a a2 00 15 00 26 ff 00 95  ..&....u........&...
00000140: 01 75 08 81 02 0a a3 00 15 00 26 ff 00 95 01 75 08 81 02 0a  .u........&....u....
00000154: a4 00 15 00 26 ff 00 95 01 75 08 81 02 0a b9 00 15 00 26 ff  ....&....u........&.
00000168: 00 95 01 75 08 81 02 0a ba 00 15 00 26 ff 00 95 01 75 08 81  ...u........&....u..
0000017c: 02 0a bb 00 15 00 26 ff 00 95 01 75 08 81 02 0a be 00 15 00  ......&....u........
00000190: 26 ff 00 95 01 75 08 81 02 0a c0 00 15 00 26 ff 00 95 01 75  &....u........&....u
000001a4: 08 81 02 0a c1 00 15 00 26 ff 00 95 01 75 08 81 02 0a c2 00  ........&....u......
000001b8: 15 00 26 ff 00 95 01 75 08 81 02 0a c3 00 15 00 26 ff 00 95  ..&....u........&...
000001cc: 01 75 08 81 02 0a c4 00 15 00 26 ff 00 95 01 75 08 81 02 0a  .u........&....u....
000001e0: c5 00 15 00 26 ff 00 95 01 75 08 81 02 0a c6 00 15 00 26 ff  ....&....u........&.
000001f4: 00 95 01 75 08 81 02 0a c7 00 15 00 26 ff 00 95 01 75 08 81  ...u........&....u..
00000208: 02 0a c8 00 15 00 26 ff 00 95 01 75 08 81 02 c0 05 0c 09 01  ......&....u........
0000021c: 85 02 a1 01 05 0c 0a 23 02 15 00 25 01 95 01 75 01 81 02 15  .......#...%...u....
00000230: 00 25 00 75 07 95 01 81 03 c0 05 0f 09 21 85 03 a1 02 09 97  .%.u.........!......
00000244: 15 00 25 01 75 04 95 01 91 02 15 00 25 00 75 04 95 01 91 03  ..%.u.......%.u.....
00000258: 09 70 15 00 25 64 75 08 95 04 91 02 09 50 66 01 10 55 0e 15  .p..%du......Pf..U..
0000026c: 00 26 ff 00 75 08 95 01 91 02 09 a7 15 00 26 ff 00 75 08 95  .&..u.........&..u..
00000280: 01 91 02 65 00 55 00 09 7c 15 00 26 ff 00 75 08 95 01 91 02  ...e.U..|..&..u.....
00000294: c0 05 06 09 20 85 04 15 00 26 ff 00 75 08 95 01 81 02 06 00  .... ....&..u.......
000002a8: ff 09 01 a1 02 85 06 09 01 15 00 25 64 75 08 95 01 b1 02 09  ...........%du......
000002bc: 02 15 00 25 64 75 08 95 01 b1 02 09 03 15 00 26 ff 00 75 08  ...%du.........&..u.
000002d0: 95 01 b1 02 09 04 26 ff 00 75 08 95 3c b2 02 01 c0 06 00 ff  ......&..u..<.......
000002e4: 09 02 a1 02 85 07 09 05 15 00 25 64 75 08 95 01 b1 02 09 06  ..........%du.......
000002f8: 15 00 25 64 75 08 95 01 b1 02 09 07 15 00 25 64 75 08 95 01  ..%du.........%du...
0000030c: b1 02 c0 06 00 ff 09 03 a1 02 85 08 09 08 15 00 25 64 75 08  ................%du.
00000320: 95 01 b1 02 09 09 15 00 25 64 75 08 95 01 b1 02 09 0a 15 00  ........%du.........
00000334: 26 ff 00 75 08 95 01 b1 02 c0 06 00 ff 09 04 a1 01 85 09 09  &..u................
00000348: 0b 15 00 25 64 75 08 95 01 b1 02 09 0c 15 00 25 64 75 08 95  ...%du.........%du..
0000035c: 01 b1 02 09 0d 15 00 25 64 75 08 95 01 b1 02 09 0e 15 00 26  .......%du.........&
00000370: ff 00 75 08 95 01 b1 02 09 0f 26 ff 00 75 08 95 3c b2 02 01  ..u.......&..u..<...
00000384: c0 06 00 ff 09 05 a1 01 85 0a 09 10 15 00 27 ff ff ff 7f 75  ..............'....u
00000398: 20 95 01 81 02 09 11 15 00 27 ff ff ff 7f 75 20 95 01 81 02   ........'....u ....
000003ac: 09 12 15 00 26 ff 00 75 08 95 02 81 02 09 13 15 00 26 ff 00  ....&..u.........&..
000003c0: 75 08 95 01 81 02 c0 06 00 ff 09 06 a1 02 85 0b 09 14 15 00  u...................
000003d4: 25 64 75 08 95 01 b1 02 c0 c0 05 01 09 06 a1 01 85 05 05 07  %du.................
000003e8: 19 e0 29 e7 15 00 25 01 75 01 95 08 81 02 95 01 75 08 81 03  ..)...%.u.......u...
000003fc: 95 06 75 08 15 00 25 65 05 07 19 00 29 65 81 00 c0           ..u...%e....)e...
1914565075 4781

Controller and Bluetooth information

xpadneo-btmon.txt xpadneo-dmesg.txt xpadneo-lsusb.txt

Additional context

The issue was reproduced on a desktop using the built-in Bluetooth of an Asus X99 Deluxe, and also on a Dell XPS13. Both computers run Fedora 32 and are roughly up to date. I attempted to reproduce the issue on a Windows 10 laptop; it didn't happen there, but given the intermittent nature I can't be certain it wouldn't.

I saw some other strange behavior from the controller, including it not reporting the battery level at all, even in Windows, so I ended up exchanging it for a second controller, which works better (it reports battery, for instance) but exhibits this same behavior. I THINK it behaved the same way on both the out-of-the-box firmware and the most up-to-date firmware (as per the Xbox Accessories app on Windows), but it definitely happens on the up-to-date one.

This only occurs on Bluetooth. I haven't had it happen with a direct USB connection or the wireless adapter dongle. (I may not have given it enough time with the dongle, though.)

I'm not sure what the best option is here, but capturing data while the issue occurs might help. I can't make it happen on command, so I'd need to use the controller over a period of time while logging.

kakra commented 3 years ago

Welcome to our community! This is an excellent example of a bug report, well done.

This report has a few interesting aspects, but I'm not quite sure which of the observations we can attribute to xpadneo alone. The interesting part is that you found the left stick being involved. What about the right stick?

Thanks to a donation from the community, I have the XBE2 controller, too. The current belief is that turning off rumble works around the firmware lockup of the controller. But even back during development of XB1S controller support, this bug looked like the stick may be involved, because I was sometimes seeing only the sticks freezing on some value while the buttons continued to work for some seconds.

About your battery reporting issue: Is this with xpadneo? Because to me it looks like the battery is correctly identified as Play'n'Charge kit. Your dmesg also shows it working:

[482252.402942] xpadneo 0005:045E:0B05.000C: fixing up report size
[482252.402945] xpadneo 0005:045E:0B05.000C: fixing up Rx axis
[482252.402947] xpadneo 0005:045E:0B05.000C: fixing up Ry axis
[482252.402948] xpadneo 0005:045E:0B05.000C: fixing up Z axis
[482252.402949] xpadneo 0005:045E:0B05.000C: fixing up Rz axis
[482252.402951] xpadneo 0005:045E:0B05.000C: fixing up button mapping
[482252.403967] xpadneo 0005:045E:0B05.000C: battery detected

"battery detected" means that we received the first battery report from the controller and could identify the bits correctly.

[482273.058955] xpadneo 0005:045E:0B05.000C: pretending XB1S Windows wireless mode (changed PID from 0x0B05 to 0x02E0)
[482273.058958] xpadneo 0005:045E:0B05.000C: enabling compliance with Linux Gamepad Specification
[482273.059073] input: Xbox Elite Wireless Controller as /devices/pci0000:00/0000:00:14.0/usb3/3-6/3-6:1.0/bluetooth/hci0/hci0:11/0005:045E:0B05.000C/input/input43
[482273.059685] xpadneo 0005:045E:0B05.000C: input,hidraw0: BLUETOOTH HID v9.03 Gamepad [Xbox Elite Wireless Controller] on 28:c2:dd:de:82:63
[482273.059690] xpadneo 0005:045E:0B05.000C: controller quirks: 0x00000018
[482274.049945] xpadneo 0005:045E:0B05.000C: Xbox Elite Wireless Controller [98:7a:14:55:31:55] connected
[482282.909308] xpadneo 0005:045E:0B05.000C: battery registered

"battery registered" means we have identified the battery type and registered it with the power supply subsystem of Linux. From here on it's up to user space to make any use of it.

[483985.613673] xpadneo 0005:045E:0B05.000C: shutting down

"shutting down" is also a battery report and will be sent when the controller automatically turns off due to depleted battery or idle timeout (no input in 10-15 minutes), or because you forced it to shut down by holding the Xbox logo button. It will (and can) never be shown if Bluetooth disconnected because we wouldn't read a battery event then.

So if your controller input froze between the last two lines, it'll mean that Bluetooth is still functional. That would also explain why the Bluetooth stack doesn't disconnect because of transmit/receive timeout.
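The reasoning above can be checked against a saved dmesg log. The following is a minimal sketch, not part of xpadneo: it assumes the three message texts appear exactly as in the excerpts above, and the regular expression and function names are illustrative only.

```python
import re

# Message texts are taken from the dmesg excerpts in this thread;
# the regex and helper names are assumptions made for this sketch.
LINE_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+xpadneo\s+\S+:\s+"
    r"(?P<msg>battery detected|battery registered|shutting down)\s*$"
)


def battery_timeline(dmesg_text):
    """Return (timestamp, message) pairs for xpadneo battery lifecycle lines."""
    events = []
    for line in dmesg_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            events.append((float(match.group("ts")), match.group("msg")))
    return events


def reporting_window(events):
    """Return (start, end) from 'battery registered' to the next 'shutting down'.

    Returns None if no such pair exists in the log.
    """
    start = None
    for ts, msg in events:
        if msg == "battery registered":
            start = ts
        elif msg == "shutting down" and start is not None:
            return (start, ts)
    return None
```

If the input froze at a time inside the returned window, the controller was still delivering battery reports over Bluetooth at the end of that window, which matches the conclusion drawn above.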

I've already seen two different pictures:

  1. In the first case, the controller crashes/locks up, the Bluetooth connection breaks, after a timeout of 30s, the controller restarts, re-connects, and the controller becomes available in most games again (some games just don't correctly support hot plugging). It will be back working after 30-60s. In this case, the controller seems to know it crashed due to rumble and no longer rumbles but the connection will be stable now throughout the game.

  2. In the second case, the controller ceases sending any more input, and rumble may "freeze" in time. But Bluetooth doesn't timeout, so the Bluetooth handler in the controller firmware seems to be still responsive. But some internal state locked up and no longer generates input events. To get the controller back, you may need to power-cycle it, and games usually do not re-detect it afterwards. That's probably because the kernel doesn't unregister the old input device before initializing the new one, and then it seems like a different device to games. I'd expect the following observation here: You're turning the controller off and back on but the kernel doesn't notice that because the Bluetooth connection didn't break. Now, a new device will appear and initialize a new driver instance, and just the very same moment it will tear the old driver instance down - but it already has a new instance ID now.

I've also seen the following behavior: Run jstest and generate a continuous stream of input events by circling either one or the other stick smoothly. With some patience this also works with the triggers (tho, a lot more difficult). You may see that jstest suddenly stops seeing any updates to the reported values. Stop movement, and 1s later it will be back with reporting values. That can be easily repeated. If you're watching btmon at the same time, you'll see that the controller stops sending data to us. So this is a firmware problem, and I'm pretty sure it comes from the same origin as your problem. Rumble isn't involved at all here. Interestingly, you may no longer be able to reproduce this after rebooting the controller or re-pairing Bluetooth.
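The stall described above can also be measured instead of watched. This is a rough sketch, assuming the events come from the legacy Linux joystick interface (the one jstest reads), whose `struct js_event` is 8 bytes: a 32-bit timestamp in milliseconds, a signed 16-bit value, an 8-bit type and an 8-bit number. The 500 ms threshold is an arbitrary choice, and a gap only means something while a stick is being moved continuously.

```python
import struct

# struct js_event from <linux/joystick.h>: time (ms), value, type, number.
JS_EVENT = struct.Struct("=IhBB")
JS_EVENT_INIT = 0x80  # flag set on the synthetic events sent right after open()


def parse_js_events(raw):
    """Decode raw joystick bytes into (time_ms, type, number, value) tuples."""
    events = []
    usable = len(raw) - len(raw) % JS_EVENT.size
    for offset in range(0, usable, JS_EVENT.size):
        time_ms, value, ev_type, number = JS_EVENT.unpack_from(raw, offset)
        if ev_type & JS_EVENT_INIT:
            continue  # initial state dump, not real input
        events.append((time_ms, ev_type, number, value))
    return events


def find_stalls(events, min_gap_ms=500):
    """Return (last_time_before_gap, first_time_after_gap) for each long gap."""
    stalls = []
    for prev, cur in zip(events, events[1:]):
        if cur[0] - prev[0] >= min_gap_ms:
            stalls.append((prev[0], cur[0]))
    return stalls
```

To use it, read a few thousand events from the joystick node in binary mode while circling a stick (the node is often `/dev/input/js0`, but that path is an assumption) and pass the bytes to `parse_js_events`.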

To clear some things up:

If you're using xow, you're not using the Bluetooth connection. xpadneo won't be involved here, it cannot see the controller. The same is true vice-versa: xow doesn't handle Bluetooth but only the dongle connection.

If you're using USB, you will be using the xpad driver. Neither xow nor xpadneo handles the wired USB connection of the controller. xow could probably support it, because the wireless dongle protocol and the wired USB protocol are very similar, but the Bluetooth protocol is completely different: Bluetooth uses HID, while dongle and USB use very similar variants of GIP, if not identical protocols.

So first question would be: Did your games use rumble effects? If yes, could you disable rumble and see if behavior changes? You can use the module parameter rumble_attenuation=100 to effectively stop the driver from sending rumble commands to the controller at all.
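As a small illustration of the parameter mentioned above, here is a hedged sketch for reading the current setting and building a persistent module option line. The sysfs path is the one used later in this thread. Reading the second value as the trigger motors is an assumption based on the later comments here, and the helper names are made up for this example.

```python
# Path as used later in this thread; readable while the module is loaded.
PARAM_PATH = "/sys/module/hid_xpadneo/parameters/rumble_attenuation"


def parse_rumble_attenuation(text):
    """Parse '100,0' style content into (overall, triggers) integers.

    Assumption: a missing second value means 0 for the triggers.
    """
    parts = [int(p) for p in text.strip().split(",") if p != ""]
    overall = parts[0]
    triggers = parts[1] if len(parts) > 1 else 0
    return overall, triggers


def rumble_disabled(text):
    """True if the overall attenuation is 100, i.e. no rumble is sent."""
    overall, _ = parse_rumble_attenuation(text)
    return overall >= 100


def modprobe_options_line(overall, triggers=None):
    """Build an 'options' line suitable for a file under /etc/modprobe.d/."""
    value = str(overall) if triggers is None else f"{overall},{triggers}"
    return f"options hid_xpadneo rumble_attenuation={value}"
```

For example, `rumble_disabled(open(PARAM_PATH).read())` reports whether the running module currently suppresses rumble, and the line from `modprobe_options_line(100)` can be placed in a modprobe configuration file (the file name is up to you) so that the setting survives a reboot.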

After all, we are back again at some very weird timing bug or firmware race of the controller which only occurs in wireless mode (either one) but not in wired mode. For XB1S this was fixed by keeping intervals between rumble commands above 10ms. But that doesn't seem to be a work-around for XBE2: For me it usually crashes at the first rumble packet no matter which timing is used. I wonder what the problem is because I don't think it happens with Windows or Android. Both systems use a different kernel-side Bluetooth implementation. So the actual Bluetooth protocol implementation may matter here (which would still be a bug in the controller firmware: we should not be able to crash it by sending packets in some different order or timing). At least the Windows driver also keeps rumble command timings above intervals of 10ms, which by the way is the timing resolution of the rumble motor programming interface.
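To make the 10 ms rule concrete, here is a conceptual sketch of such a throttle. It is not xpadneo's implementation (the driver is kernel C code); the class and method names are hypothetical, and time is passed in explicitly so the behavior is easy to follow.

```python
class RumbleThrottle:
    """Keep at least `min_interval_ms` between commands that are sent out.

    Commands arriving too early are not dropped entirely: only the newest
    one is kept and can be sent once the interval has elapsed.
    """

    def __init__(self, min_interval_ms=10):
        self.min_interval_ms = min_interval_ms
        self._last_sent_ms = None
        self._pending = None

    def submit(self, now_ms, command):
        """Return the command if it may be sent now, otherwise defer it."""
        if (self._last_sent_ms is None
                or now_ms - self._last_sent_ms >= self.min_interval_ms):
            self._last_sent_ms = now_ms
            self._pending = None
            return command
        self._pending = command  # overwrite any older deferred command
        return None

    def flush(self, now_ms):
        """Return the deferred command once the interval has elapsed."""
        if (self._pending is not None
                and now_ms - self._last_sent_ms >= self.min_interval_ms):
            command, self._pending = self._pending, None
            self._last_sent_ms = now_ms
            return command
        return None
```

Keeping only the newest deferred command reflects the idea that an outdated rumble state is not worth sending late. As noted above, this kind of spacing helped for XB1S but does not appear to be sufficient for XBE2.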

Next question would be: Does xow show the same problem - tested with and without rumble?

I don't expect the USB connection to show any problems, so we probably don't need to test that. Also, it's a different implementation of the protocol and we would be comparing apples and oranges, so it doesn't even work as a reference for comparison. It would for xow, but not for xpadneo.

About the paddle buttons: As far as I found out, these are mapped to events in hardware. While we can see 4 bits that show the paddle state, we cannot tell them apart from A,B,X,Y because in default profile 0 (all profile LEDs are off), it'll be mapped to A,B,X,Y in hardware: The protocol shows P1-P4 bits set to 1 along with A,B,X,Y bits set to 1, so we cannot know the real state of A,B,X,Y when P1-P4 shows up. When switching to a different profile, we don't see any of those bits. So it seems like P1-P4 are only indicated by bits if they are programmed. The HID descriptor shows that there's a programming command to upload macros to the controller. I didn't yet decode that or record any programming from a Windows VM to reverse engineer it. But that's planned for v0.10 or v0.11. Currently, we will detect that the profile switch button was used and which profile number it activated. Current master branch has the proper commits in place for that. My next commit will be to detect the position of the pressure limiter for the triggers. So yes, the paddles will map to A,B,X,Y and that's implemented in hardware for profile 0, it's not a driver thing. If you switch to profile 1-3, the paddles will become invisible to xpadneo currently, not sure about xow because that is using GIP instead of HID.

solystm commented 3 years ago

Thank you for the kind words, I try to submit solid bug reports so I'm glad to hear it was helpful!

With that said, I may not have correctly captured the issue, looking at it. When it occurs, as far as I can tell, both the computer and the controller continue to believe they're connected until I manually disconnect the controller. I might be wrong about that, though, so I think I need to re-test. I'll do that and attach here, though it's difficult to get a reproduction due to the intermittent nature of the issue.

The issue only occurs while connected through Bluetooth. I believe xow does have some degree of Bluetooth support as I've connected through Bluetooth and reproduced while only xow is installed and not xpadneo. I could double check with xow and no xpadneo, though. I'm not seeing this behavior when using the wireless dongle or USB and I agree we probably don't need to test those. I'll uninstall xow from my system and go forward exclusively with xpadneo for now to reduce confusion.

I haven't tried with rumble disabled. The games do both have rumble effects, so I'll give disabling it a try. Keep in mind that may require over an hour of play time so the btmon output will probably be pretty large and may take me a while to collect. With that said, the rumble DOES work at least sometimes and doesn't always result in this behavior. I'll try to generate some rumble events, though.

So, the reproduction this time was done with the following steps:

  1. Removed xow, going forward with only xpadneo.

  2. Removed the pairing in Bluetooth.

  3. Ran btmon, outputting to file.

  4. Paired the controller through Bluetooth again.

  5. Ran jstest, outputting to file.

  6. Launched Trials of Mana and played.

I got some rumble events near the start of the actual play time and the controller continued working as expected.

Several minutes later the issue occurred and the controller stopped responding.

Side note: I checked and the controller may actually still be working on some level. The profile select button still works--I can press the button and the profile lights change as expected. However, the jstest does not change. The final line is where it remains regardless of how long I let it sit:

Axes: 0:-32767 1: 10283 2:-32767 3: 0 4: 0 5:-32767 6: 0 7: 0 8: 0 Buttons: 0:off 1:off 2:off 3:on 4:off 5:off 6:off 7:off 8:off 9:off
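For comparing such snapshots in a saved jstest log, a small parser can help. This is a sketch that assumes the line layout shown above (an `Axes:` section followed by a `Buttons:` section); the function names are illustrative.

```python
import re

AXIS_RE = re.compile(r"(\d+):\s*(-?\d+)")
BUTTON_RE = re.compile(r"(\d+):\s*(on|off)")


def parse_jstest_line(line):
    """Split one jstest status line into ({axis: value}, {button: pressed})."""
    axes_part, _, buttons_part = line.partition("Buttons:")
    axes = {int(i): int(v) for i, v in AXIS_RE.findall(axes_part)}
    buttons = {int(i): state == "on" for i, state in BUTTON_RE.findall(buttons_part)}
    return axes, buttons


def diff_snapshots(before, after):
    """Return (changed_axes, changed_buttons) between two parsed snapshots."""
    axes_a, buttons_a = before
    axes_b, buttons_b = after
    changed_axes = sorted(i for i in axes_b if axes_a.get(i) != axes_b[i])
    changed_buttons = sorted(i for i in buttons_b if buttons_a.get(i) != buttons_b[i])
    return changed_axes, changed_buttons
```

When the controller is frozen, the last snapshot stays the same no matter what is pressed, so diffing the final line against any later capture yields two empty lists.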

I manually powered off the device by pressing and holding the button on the controller, then waited for that to time out on my system. Note that where it ended up in the btmon.txt is where it was when I killed the monitoring. I gave it several minutes to see if it would gracefully disconnect or something, but even after I manually powered off the controller it stayed at this final line.

xpadneo-btmon.txt xpadneo-jstest.txt

I killed the original monitoring and restarted with a fresh file, then reconnected the controller by again pressing the xbox button on it. I gave it some inputs on the left analog, then powered the controller off again by holding the xbox button. That's this file:

xpadneo-btmon-reconnect.txt

So, next: despite the fact that rumble does work sometimes, it's possible a rumble event is connected to the issue anyway. I'm going to disable rumble. I could also try disconnecting Bluetooth from the computer side and seeing what happens. I'll try that this time, though I don't think it'll work because the controller usually tries to reconnect. Might get some interesting behavior, though.

solystm commented 3 years ago

Alright, trying with rumble disabled. Starting from where I ended the last comment.

Ran btmon and output to file. Hit xbox button on the controller to let it reconnect. Disabled rumble by modifying /sys/module/hid_xpadneo/parameters/rumble_attenuation to have:

cat /sys/module/hid_xpadneo/parameters/rumble_attenuation 
100,0

Ran jstest and output to file. Launch Trials of Mana and play.

And... I'm convinced it works! Rumble had actually worked at least a little, but turning it off keeps the controller working consistently, so that's a solid work-around.

I wonder if I could just turn off the trigger rumble or if it needs to be the whole thing. I mean, I'm actually fine without rumble for most games but now I'm curious.

I'd upload the files I created, but they're >200 MB combined, so that's probably not great. Regardless, thank you for the assistance. If you have something you'd like me to test further, please let me know.

(Edit: Nope, gotta be 100,0. Setting it to 0,100 still gets the freeze.)

kakra commented 3 years ago

I believe xow does have some degree of Bluetooth support as I've connected through Bluetooth and reproduced while only xow is installed and not xpadneo

No, it actually doesn't. If you don't use xpadneo, the built-in kernel modules will take over: both hid-microsoft and hid-generic support the controller via Bluetooth, hid-microsoft even supports rumble. xow isn't built into the kernel, it's purely a user-space driver: The service must be running for it to do anything, and the service only looks for the dongle and nothing else. So you can have both installed at the same time, there's no conflict. It only depends on whether you are using the dongle or Bluetooth for connection. But there's one caveat: The controller may change behavior a little when swapping it between both connection modes: Sometimes it shows a different packet format in xpadneo when it has been connected to the dongle previously, especially if you didn't explicitly re-pair it to Bluetooth. xpadneo has a work-around for it, otherwise button mappings would be completely messed up.
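Which kernel driver actually claimed the controller can be read from sysfs. The sketch below assumes the standard layout where each HID device directory under `/sys/bus/hid/devices` has a `driver` symlink pointing at the bound driver; the function name is made up for this example.

```python
import os


def bound_hid_drivers(sysfs_root="/sys/bus/hid/devices"):
    """Map each HID device name to the name of its bound driver (or None)."""
    result = {}
    if not os.path.isdir(sysfs_root):
        return result
    for name in sorted(os.listdir(sysfs_root)):
        link = os.path.join(sysfs_root, name, "driver")
        if os.path.islink(link):
            # The symlink target ends in the driver's name.
            result[name] = os.path.basename(os.readlink(link))
        else:
            result[name] = None  # no driver bound to this device
    return result
```

A Bluetooth-connected controller shows up under a name starting with `0005:045E:`, as in the dmesg lines in this thread. If its entry reports something other than `xpadneo`, one of the in-kernel drivers has taken over.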

The kernel built-in drivers hid-microsoft and hid-generic both fully support the controller but without some of the features and fixups that xpadneo brings, and with no rumble in the hid-generic case. Early in development, xpadneo was the only driver to properly fix the button and axis mapping to be compatible with all games. Meanwhile, SDL and Proton have fix-ups for it and you should usually see no difference, unless such an SDL/Proton fix-up gets accidentally applied when using the xpadneo driver: Then you'll see messed up mappings with xpadneo.

200+ MB logs are not very useful. In fact, btmon logs are only useful when looking at a very specific problem, and then it should capture only that part. I'm just asking for it in the report to see if there're obvious Bluetooth pairing problems, and to see which versions are in use.

If you're still seeing battery reporting problems, the output of upower -d would be a good start.

I'm still curious why you were seeing no problems when using the dpad alone: Did that game support rumble at all? But even if it did, I'm not sure how we could use that to work around the problem, because we have no control over when the controller sends data to us: we don't request axis and button states, the controller just sends them. The only thing I can throttle is sending rumble commands to the controller. We cannot even control what Bluetooth does because that's in a different layer, but I'm sure that if anything could be fixed, it would have to be fixed there. Luckily, the xow author is currently working on an in-kernel driver for the dongle. The plan is to migrate xpadneo over to using the dongle at some later time, probably merging the efforts of xow and xpadneo into one project. That's currently not possible because xow lives purely outside of the kernel.

kakra commented 3 years ago

Side note: I checked and the controller may actually still be working on some level. The profile select button still works--I can press the button and the profile lights change as expected. However, the jstest does not change. The final line is where it remains regardless of how long I let it sit

Yeah, that's somewhat expected: Then it's some firmware component in the controller that crashes. We already found that the firmware seems to consist of several autonomous parts running independently of each other, and one of them may crash without the controller always recovering by restarting itself. The profile LEDs are actually purely a hardware implementation: We do not control which LEDs light up. So unless you're seeing the profile announced at the same time in dmesg, some important part of the controller crashed.

solystm commented 3 years ago

I see, all very interesting to know.

As far as upower -d:

upower -d
[UPS device omitted]

Device: /org/freedesktop/UPower/devices/gaming_input_xpadneo_battery_0
  native-path:          xpadneo_battery_0
  model:                Xbox Elite Wireless Controller [98:7a:14:55:31:55] Play'n Charge Kit
  power supply:         no
  updated:              Wed 28 Oct 2020 10:03:40 AM EDT (103 seconds ago)
  has history:          yes
  has statistics:       yes
  gaming-input
    rechargeable:        yes
    warning-level:       none
    battery-level:       full
    percentage:          100% (should be ignored)
    icon-name:          'battery-full-charged-symbolic'

[UPS device omitted]

Daemon:
  daemon-version:  0.99.11
  on-battery:      no
  lid-is-closed:   no
  lid-is-present:  no
  critical-action: PowerOff

The game I was using the dpad on (Streets of Rage 4) does have rumble in it, but I'm not certain I did enough testing to trigger the bug to occur. Neither SoR4 nor Trials of Mana feature a significant amount of rumble in them.

(Edit: I can't do this right now because of other ongoing things, but I should probably just restart the computer. I've been messing around with stuff and something might be confused.)

There seems to be a lot of logic in this controller in particular (for example, the triggers reporting different values when the locks are set differently), so yeah, that makes sense that some parts would fail while other parts still work. That said, I don't appear to see profiles show up in dmesg at all? Maybe because I haven't set any and it's just the default. I'm not sure, I haven't been paying much attention to the profiles.

On another note, you've been super helpful. I can't donate a whole controller (but I guess you already have one so that's good!) but I'd throw a few bucks into the project if you had a donation link somewhere.

kakra commented 3 years ago

That said, I don't appear to see profiles show up in dmesg at all? Maybe because I haven't set any and it's just the default. I'm not sure, I haven't been paying much attention to the profiles.

It probably needs the work-in-progress v0.9 branch, I'm not sure if I merged those patches into v0.8. v0.9 development currently lives in the master branch. In my fork of xpadneo, there may be some proposed v0.9 patches not yet merged, but I think I've merged all of them by now.

I've seen the profile messages in your dmesg logs:

[239045.920015] xpadneo 0005:045E:0B05.0008: Switching profile to 1
[239047.198250] xpadneo 0005:045E:0B05.0008: Switching profile to 2
[239047.818256] xpadneo 0005:045E:0B05.0008: Switching profile to 3
[239048.278248] xpadneo 0005:045E:0B05.0008: Switching profile to 0

These lines show up when you press the center button below the Xbox logo button: It switches the LEDs, and thus the profile. But the driver is just receiving this event; the LEDs themselves are switched inside the controller firmware, and we cannot control that (at least, I don't know how, if we even could).

kakra commented 3 years ago

On another note, you've been super helpful. I can't donate a whole controller (but I guess you already have one so that's good!) but I'd throw a few bucks into the project if you had a donation link somewhere.

Yeah, I've already got this controller. There's a donation button somewhere but that currently goes to @atar-axis, who has kindly given me write access to this repository. He also suggested that he would share donations if you'd attribute them to some feature I'm implementing. Last time, I declined his sharing offer because the feature I was working on wasn't the one those donations were requested for. I've set up some projects in the projects tab which you may attribute donations to. I'm not sure, tho, how closely @atar-axis is currently watching this. I'm currently the main maintainer here.

kakra commented 3 years ago

May be a duplicate of #243

kakra commented 3 years ago

As far as upower -d

Looks fine, battery is detected properly and shows "fully charged". But I must admit that I didn't test the battery indicator yet because I wasn't using the controller much except for some tests, and put it back onto its charging pad otherwise.

But I've just tested it and something is wrong there: While I see "battery detected", it no longer showed "battery registered" and thus it doesn't show up in upower... Hmm, that's strange. It looks a bit fishy, like hid-generic may interfere here, but that makes no sense. I'll let you know about any new findings on my end, I'm lacking some time today.

solystm commented 3 years ago

Interesting that it shows sometimes in dmesg. When I fired it up and specifically hit the button to test, I didn't see it. Well, so it goes.

Regarding upower: I may be overestimating what it's doing. When I first hooked the controller up (fresh out of the box) it was reporting 70% battery life, then I charged it and it went up to 100% so I was like "oh neat, it's reporting the current battery". However, that may have just been incidental.

kakra commented 3 years ago

Yeah, the controller doesn't tell us the battery level while charging. It also doesn't tell us a percentage, it just has 4 levels. So there's a lot of guesswork.

maharifu commented 3 years ago

Hi, @kakra above you say this was solved for the XB1S controller, but I'm having the same issue with that controller (model 1708). I start playing a game with rumble on and after a few minutes the controller stops responding.

I already updated the controller to the latest firmware. Also, I'm running Arch Linux (Linux lc-desktop 5.9.1-arch1-1 #1 SMP PREEMPT Sat, 17 Oct 2020 13:30:37 +0000 x86_64 GNU/Linux) with the latest version of xpadneo-dkms-git from AUR, 0.8.r29.gd55e6d4-1.

Should I provide more debugging information? Or handle it as a separate issue? Or am I just doing something wrong? :)

kakra commented 3 years ago

@maharifu It is possible that you are using a current version of SDL2 which can bypass the driver for rumble when raw HID permissions are available. You may want to check the permissions (including ACLs) of /dev/hidraw*. Look at dmesg to see which device number was assigned:

[125545.334377] xpadneo 0005:045E:02E0.0016: input,hidraw15: BLUETOOTH HID v4.08 Gamepad [Xbox Wireless Controller] on 00:1a:7d:da:71:15
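
A minimal sketch of that check, assuming a POSIX shell and that `getfacl` (from the acl package) is installed. The function names `hidraw_from_dmesg_line` and `show_hidraw_access` are made up for this example: the first pulls the hidraw node name out of a log line like the one above, the second prints both the mode bits and any ACL entries for that node.

```shell
# Hypothetical helper: extract the hidraw node name (e.g. "hidraw15")
# from a single xpadneo log line.
hidraw_from_dmesg_line() {
    printf '%s\n' "$1" | grep -o 'hidraw[0-9][0-9]*' | head -n 1
}

# Hypothetical helper: show classic permissions and ACLs for that node.
show_hidraw_access() {
    ls -l "/dev/$1"
    getfacl "/dev/$1"
}

# Usage (not run here, dmesg may need root):
#   line=$(dmesg | grep 'xpadneo.*hidraw' | tail -n 1)
#   show_hidraw_access "$(hidraw_from_dmesg_line "$line")"
```

If your user shows up in the ACL output with read/write access, a program running as that user could open the raw device even though `ls -l` only lists `root root`.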

You could also try exporting SDL_JOYSTICK_HIDAPI=0 to your environment which will prevent SDL games from using the raw HID interface.
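
One way to scope that setting to a single program instead of the whole session, as a sketch. The wrapper name `run_without_sdl_hidapi` is invented here; `env` just sets the variable for the command it launches and for that command's children.

```shell
# Hypothetical wrapper: run any command with SDL's HIDAPI joystick
# backend turned off, without touching the rest of the session.
run_without_sdl_hidapi() {
    env SDL_JOYSTICK_HIDAPI=0 "$@"
}

# Usage (not run here):
#   run_without_sdl_hidapi steam
```

For a single Steam game, putting `SDL_JOYSTICK_HIDAPI=0 %command%` into the game's launch options should have the same effect.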

If this proves to fix the problem, we're going to need a patch for SDL2 which I'd be willing to create.

Otherwise, there's still a chance that your kernel is running NO_HZ_FULL or at a very low HZ frequency which can make the timer unstable (it should be 300 Hz at least). In the current master branch I've changed the workqueue for this to run on a dedicated high-priority queue. So you may also want to try that: https://github.com/atar-axis/xpadneo/commit/d55e6d42ecb53f3ebe91e7a43574c35e79146dfd
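
To check the tick rate, here is a sketch that reads the kernel config. It assumes the config is exposed at /proc/config.gz (this needs CONFIG_IKCONFIG_PROC); on many distributions /boot/config-$(uname -r) is an alternative. The helper names `kernel_hz_from_config` and `hz_is_sufficient` are made up for this example, and the 300 threshold is simply the value mentioned above.

```shell
# Hypothetical helper: read kernel config text on stdin and print the
# numeric CONFIG_HZ value (prints nothing if the option is absent).
kernel_hz_from_config() {
    sed -n 's/^CONFIG_HZ=\([0-9][0-9]*\)$/\1/p'
}

# Hypothetical helper: succeed only if the given HZ value is at least 300.
hz_is_sufficient() {
    [ -n "$1" ] && [ "$1" -ge 300 ]
}

# Usage (not run here):
#   hz=$(zcat /proc/config.gz | kernel_hz_from_config)
#   if hz_is_sufficient "$hz"; then echo "HZ=$hz looks fine"; else echo "HZ=$hz may be too low"; fi
#   zgrep NO_HZ_FULL /proc/config.gz
```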

I'm not sure how to check the model of my XB1S... But since XBE2 still has the rumble problem, there may be another issue not yet uncovered.

kakra commented 3 years ago

On another note, you've been super helpful...

@solystm I think your donation reached me. Thanks.

maharifu commented 3 years ago

Hi, thank you for your time. :)

I checked the model of the controller under the batteries, just included it to make sure it's actually an XB1S controller.

I double checked and I'm running the latest commit from the master branch already.

Checking the permissions:

$ ls -l /dev/hidraw*
crw------- 1 root root 236, 0 Oct 31 15:41 /dev/hidraw0
crw------- 1 root root 236, 1 Oct 31 15:41 /dev/hidraw1
crw------- 1 root root 236, 2 Oct 31 15:41 /dev/hidraw2
crw------- 1 root root 236, 3 Oct 31 15:41 /dev/hidraw3
crw------- 1 root root 236, 4 Oct 31 15:42 /dev/hidraw4

I'm not sure how they're supposed to be, so I also tested after adding /etc/udev/rules.d/99-hidraw-permissions.rules:

KERNEL=="hidraw*", SUBSYSTEM=="hidraw", MODE="0664", GROUP="plugdev"

which makes them:

$ ls -l /dev/hidraw*
crw-rw-r-- 1 root root 236, 0 Oct 31 15:41 /dev/hidraw0
crw-rw-r-- 1 root root 236, 1 Oct 31 15:41 /dev/hidraw1
crw-rw-r-- 1 root root 236, 2 Oct 31 15:41 /dev/hidraw2
crw-rw-r-- 1 root root 236, 3 Oct 31 15:41 /dev/hidraw3
crw-rw-r-- 1 root root 236, 4 Oct 31 15:42 /dev/hidraw4

Also:

$ xpadneo 0005:045E:02FD.0005: input,hidraw4: BLUETOOTH HID v9.03 Gamepad [Xbox Wireless Controller] on 5c:f3:70:85:81:80

My kernel is compiled with:

$ zgrep _HZ /proc/config.gz
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
CONFIG_NO_HZ=y
CONFIG_RCU_FAST_NO_HZ=y
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
CONFIG_HZ_300=y
# CONFIG_HZ_1000 is not set
CONFIG_HZ=300

so it should be set for 300Hz, right?

Finally, I tried setting SDL_JOYSTICK_HIDAPI=0 before running steam, but the problem remains.

kakra commented 3 years ago

After some research, I found that this is not a Linux specific bug, it also happens when used in Windows with a Bluetooth connection: https://answers.microsoft.com/en-us/xbox/forum/all/xbox-elite-series-2-disconnecting-from-pc/7e5a964a-16cc-4e83-8dec-aa8c858b875c

kakra commented 3 years ago

I found some more funny bugs while debugging an issue with the XBE2 profile switcher: MS fixed the HID reports with the latest firmware updates and we no longer see a lot of duplicated buttons and axes. Apparently that broke some assumptions we were making about the packet format. That firmware is a real mess. No wonder it took from Windows 95 until Windows 10 to work mostly reliably. ;-)

@solystm Originally, I intended to implement a delay pool for rumble. You may want to try #253 which also includes all the fixes I made on the way. It doesn't help on my computer but maybe it changes something for you. OTOH, I may have just made some stupid mistake and taken too easy a route. Please try and report back.

BTW: If the delay pool kicks in, it will log the incident to dmesg.

solystm commented 3 years ago

I tried #253 but it didn't actually work at all, for whatever reason. The controller was recognized (I could see it in jstest) but Trials of Mana, Streets of Rage 4, and Assetto Corsa Competizione all refused to recognize it even after a restart. I swapped back to the mainline version and that worked without issue, so I'm not clear what was going on there.

kakra commented 3 years ago

Do any of those games have a free demo version (preferably for Steam) so I could try?

solystm commented 3 years ago

Yeah, Trials of Mana does: https://store.steampowered.com/app/924980/Trials_of_Mana/

kakra commented 3 years ago

Without revisiting the complete report, it may be a duplicate of #272 if rumble is involved.

But even with ERTM enabled, I see strange problems with the XBE2 controller where it would sometimes stop sending data even without rumble involved. This may still be an issue in the Bluetooth stack and needs further investigation but there's nothing that xpadneo could do about it.

With ERTM disabled, the controller crashes when it receives rumble commands but streaming of input reports works flawlessly. So as long as you don't need rumble, you could disable rumble and disable ERTM and you should be fine.

With ERTM enabled, the controller rumbles just fine and as it should, but it eventually stops sending input reports if you constantly stream reports from it (like circling one of the thumb sticks smoothly and constantly around). It usually recovers as soon as you stop sending data, but that may take seconds, and input events may even lag behind.
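
To see which of the two cases applies on a given machine: the Bluetooth module exposes the setting as the `disable_ertm` parameter under /sys/module/bluetooth/parameters/. A sketch, with the helper name `ertm_state` and the modprobe.d file name both made up for this example:

```shell
# Hypothetical helper: translate the disable_ertm parameter value
# (the kernel prints Y or N for boolean parameters) into plain words.
ertm_state() {
    case "$1" in
        Y|y|1) echo "ERTM disabled" ;;
        N|n|0) echo "ERTM enabled" ;;
        *)     echo "unknown" ;;
    esac
}

# Usage (not run here):
#   ertm_state "$(cat /sys/module/bluetooth/parameters/disable_ertm)"
#
# Disable ERTM until the next reboot (as root):
#   echo 1 > /sys/module/bluetooth/parameters/disable_ertm
# Make it persistent (the file name is an assumption):
#   echo "options bluetooth disable_ertm=1" > /etc/modprobe.d/99-bluetooth-ertm.conf
```

After changing the value, the controller probably needs to be reconnected before the new setting applies to its connection.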

If this is purely caused by rumble, please close this report. It should be fixed by upcoming kernels (probably 5.12), provided ERTM is enabled then.

solystm commented 3 years ago

Interesting, good to know! I'll look out for 5.12.