larixer / hid-asus-dkms

ASUS HID FTE100* DKMS Driver
GNU General Public License v2.0
68 stars 10 forks source link

Driver submission to kernel tree #16

Closed larixer closed 7 years ago

larixer commented 7 years ago

i2c_hid i2c-FTE1001:00: failed to reset device i2c_hid_get_input: incomplete report (30/2141)

I suspect these errors are happening because of broken HID report descriptor. I'm sure HID report descriptor is not okay on the device, I guess we can mokeypatch it, because many hid kernel modules do so. I'm not 100% sure, though. But I believe we should somehow handle these errors one way or another so that they don't show up in dmesg logs, before including the source code of the driver into main kernel tree.

Victor

redmcg commented 7 years ago

The i2c_hid i2c-FTE1001:00: failed to reset device error message is logged by i2c-hid.c in the i2c_hid_hwreset function. This is called from two places:

1. i2c_hid_parse
 - hid_ll_driver->parse()
 - hid_add_device
 - i2c_hid_probe
 - i2c_driver->probe()

and

2. i2c_hid_resume
 - dev_pm_ops->resume()

So we don't really get an opportunity to intercept that. It certainly happens prior to the parsing of the HID report descriptor. Fortunately it doesn't appear to have any negative impact.

The i2c_hid_get_input: incomplete report (30/2141) occurs when the device sends a 0x5d report with a size greater than 30 (in this case 2,141 bytes).

I can see two options for suppressing it:

  1. Overwrite the HID descriptor with an arbitrarily large buffer size for the 0x5d report; or
  2. Talk to the device directly (i.e. via i2c and bypass the hid framework) and silently discard a report larger than 30 bytes.

I'm not really happy with either approach. The first feels like an ugly hack which results in an oversized buffer with the sole purpose of suppressing an error message. And the second ties our driver to the i2c bus (which as Benjamin states here - could soon find use on the Bluetooth bus: https://github.com/vlasenko/hid-asus-fte-dkms/issues/13#issuecomment-261910462).

So I think these are hardware issues that we might have to live with. There could be other options I'm not seeing. Let me know if you can think of any others. This might also be a good one for Benjamin to weigh in on.

larixer commented 7 years ago

HW reset error happens only on boot. I cannot reproduce it when system booted and I'm reinserting asus_hid_fte. So my guess is that it rather kernel issue, not device issue, I'm not sure whether it should be fixed inside asus_hid_fte, but rather in other parts of the kernel.

I think incomplete report can be suppressed by removing this report from HID report descriptor at all. This is vendor-specific report and kernel has no means to interpret it, so no need to ask for it at all.

larixer commented 7 years ago

Actually, after looking more into this, the incomplete report happens because i2c_hid tries to load hid_generic first on boot and then hid_asus_fte. If I blacklist hid_generic - which is not a solution, of course, the incomplete error doesn't happen, when I reinsmod hid_asus_fte and i2c_hid incomplete error doesn't happen too. Error resetting the device still happens on boot with hid_generic blacklisted.

So when the code will be merged with mainline kernel incomplete report error will be fixed automatically, I assume.

redmcg commented 7 years ago

I think there's two different errors. With hid_generic - you'll get: error in i2c_hid_init_report size:633 / ret_size:0

This happens when hid_generic tries to read all the feature reports. I added the HID_QUIRK_NO_INIT_REPORTS quirk to our driver to stop any attempt to initialise the feature reports.

But even with the hid_asus_fte driver - the device (for reasons unknown) tries to send report 0x5d with more bytes than the expected 30. So you end up with this error: i2c_hid i2c-FTE1001:00: i2c_hid_get_input: incomplete report (30/32544)

redmcg commented 7 years ago

The i2c_hid module has a handy module parameter which allows you to turn on debug:

$ modinfo i2c_hid
filename:       /lib/modules/4.4.0-47-generic/kernel/drivers/hid/i2c-hid/i2c-hid.ko
license:        GPL
author:         Benjamin Tissoires <benjamin.tissoires@gmail.com>
description:    HID over I2C core driver
srcversion:     CF70CB7E241FF729A5F0F2E
alias:          acpi*:PNP0C50:*
alias:          acpi*:ACPI0C50:*
alias:          i2c:hid
depends:        hid
intree:         Y
vermagic:       4.4.0-47-generic SMP mod_unload modversions 
parm:           debug:print a lot of debug information (bool)

I just created the file /etc/modprobe.d/i2c-debug.conf and added:

options i2c_hid debug=1

and then rebooted. Edit: Actually - I had to do an update-initramfs -u first (as i2c_hid is in the initrd).

With this on I was able to see the following:

[    1.004164] hidraw: raw HID events driver (C) Jiri Kosina
[    2.241196] usbcore: registered new interface driver usbhid
[    2.241218] usbhid: USB HID core driver
[   11.961452] hid-multitouch 0003:0457:11AF.0001: input,hiddev0,hidraw0: USB HID v1.11 Device [USBest Technology SiS HID Touch Controller] on usb-0000:00:14.0-7/input0
[   12.157312] i2c_hid i2c-FTE1001:00: Fetching the HID descriptor
[   12.157314] i2c_hid i2c-FTE1001:00: __i2c_hid_command: cmd=01 00
[   12.158183] i2c_hid i2c-FTE1001:00: HID Descriptor: 1e 00 00 01 85 00 02 00 03 00 1e 00 04 00 42 00 05 00 06 00 05 0b 01 01 07 00 00 00 00 00
[   12.158216] i2c_hid i2c-FTE1001:00: entering i2c_hid_parse
[   12.158218] i2c_hid i2c-FTE1001:00: i2c_hid_hwreset
[   12.158219] i2c_hid i2c-FTE1001:00: i2c_hid_set_power
[   12.158220] i2c_hid i2c-FTE1001:00: __i2c_hid_command: cmd=05 00 00 08
[   12.158378] i2c_hid i2c-FTE1001:00: resetting...
[   12.158379] i2c_hid i2c-FTE1001:00: __i2c_hid_command: cmd=05 00 00 01
[   12.158533] i2c_hid i2c-FTE1001:00: __i2c_hid_command: waiting...
[   17.156120] i2c_hid i2c-FTE1001:00: __i2c_hid_command: finished.
[   17.156123] i2c_hid i2c-FTE1001:00: failed to reset device.
[   17.156125] i2c_hid i2c-FTE1001:00: i2c_hid_set_power
[   17.156126] i2c_hid i2c-FTE1001:00: __i2c_hid_command: cmd=05 00 01 08
[   18.159472] i2c_hid i2c-FTE1001:00: i2c_hid_hwreset
[   18.159475] i2c_hid i2c-FTE1001:00: i2c_hid_set_power
[   18.159476] i2c_hid i2c-FTE1001:00: __i2c_hid_command: cmd=05 00 00 08
[   18.159641] i2c_hid i2c-FTE1001:00: resetting...
[   18.159644] i2c_hid i2c-FTE1001:00: __i2c_hid_command: cmd=05 00 00 01
[   18.159799] i2c_hid i2c-FTE1001:00: __i2c_hid_command: waiting...
[   23.159647] i2c_hid i2c-FTE1001:00: __i2c_hid_command: finished.
[   23.159662] i2c_hid i2c-FTE1001:00: failed to reset device.
[   23.159673] i2c_hid i2c-FTE1001:00: i2c_hid_set_power
[   23.159682] i2c_hid i2c-FTE1001:00: __i2c_hid_command: cmd=05 00 01 08
[   24.163630] i2c_hid i2c-FTE1001:00: i2c_hid_hwreset
[   24.163633] i2c_hid i2c-FTE1001:00: i2c_hid_set_power
[   24.163635] i2c_hid i2c-FTE1001:00: __i2c_hid_command: cmd=05 00 00 08
[   24.163793] i2c_hid i2c-FTE1001:00: resetting...
[   24.163794] i2c_hid i2c-FTE1001:00: __i2c_hid_command: cmd=05 00 00 01
[   24.163955] i2c_hid i2c-FTE1001:00: __i2c_hid_command: waiting...
[   24.164787] i2c_hid i2c-FTE1001:00: __i2c_hid_command: finished.
[   24.164790] i2c_hid i2c-FTE1001:00: asking HID report descriptor

You can see the driver attempts to reset the device. In the above log it tried three times.

Section 7.2.1.2 of the hid over i2c protocol spec says:

At the end of the reset, the DEVICE must also write a 2 Byte value to the Input Register with the sentinel value of 0x0000 (2 Bytes containing 0) and must assert the Interrupt to indicate that it has been initialized

I can see this is what the driver is waiting for the device to do. It is waiting 5 seconds before retrying. So it looks like the issue is with the device not asserting the interrupt to indicate that is has been initialized.

larixer commented 7 years ago

Yes, I'm aware of i2c debug, using it too. But if you reload the driver after system has booted, no matter how many times, you don't get reset errors.

larixer commented 7 years ago

@redmcg Brendan, I have reworked scripts and DKMS conf to make module binding to device on boot and during development more reliable. Please see commit: https://github.com/vlasenko/hid-asus-fte-dkms/commit/34a021dfbd0fc7fe78115395b814eb4045012112

With MODULE_ALIAS inside the driver the kernel loads hid_asus_fte and binds device to it on boot for me instead of hid_generic. Without MODULE_ALIAS it loads hid_generic first and binds device to hid_generic.

I believe MODULE_ALIAS line should be present in Linux kernel tree for this driver, too. I will push this change to your repo. Please correct me if I'm wrong.

P.S. Lets maybe hold off until the end of this week with pushing changes to Linux kernel, since we are still making changes?

larixer commented 7 years ago

@redmcg I have commited MODULE_ALIAS to your Linux repo and squashed it to your commit: https://github.com/redmcg/linux/commit/b7a2f4db5cf3a493aadacb096e2c686069404437

I've also renamed in-memory driver name from 'asus' to 'hid-asus', because 'asus' name is not specific enough. On my laptop I have a bunch of asus driver and it's not evident which is which.

larixer commented 7 years ago

@redmcg I've also merged back your changes from Linux kernel tree to DKMS driver and renamed everywhere hid-asus-fte to hid-asus, including the change in the name of this GitHub repository

redmcg commented 7 years ago

With MODULE_ALIAS inside the driver the kernel loads hid_asus_fte and binds device to it on boot for me instead of hid_generic.

Excellent work! I'm not familiar with how that works but I'll take a look at the code to understand. If there was a reference you used please let me know!

My preference would be for the kernel to pick the hid_asus driver first (if it is present) and then fallback to hid_generic - which is what you currently have. But if I add this device to the hid_have_special_driver array then it won't be able to fallback to hid_generic. I'm kind of tempted to remove it from the hid_have_special_driver - but I'm worried the acpi alias might not work in all scenarios (for example: when the BIOS doesn't have ACPI ). So I'm a bit torn about what the best approach is:

  1. Keep it in hid_have_special_driver so it can only bind to hid_asus; or
  2. Remove it and risk scenarios where hid_generic is picked first.

It's probably option 1. It just means you must have the hid_asus module in your kernel build to use this device.

I've also renamed in-memory driver name from 'asus' to 'hid-asus', because 'asus' name is not specific enough.

I agree. I didn't want to change what was already in the kernel - but I actually think this might have been overlooked. All the other hid drivers seem to have the 'hid' prefix.

I've also merged back your changes from Linux kernel tree to DKMS driver and renamed everywhere hid-asus-fte to hid-asus, including the change in the name of this GitHub repository

That's good. We can make sure I didn't break anything in the merge.

Lets maybe hold off until the end of this week with pushing changes to Linux kernel, since we are still making changes?

Agree. I made a submission to the kernel a few years back so I'm having to re-familiarise myself with the process anyway. When I do make the submission - are you happy for me to include you in a Signed-off-by: tag?

larixer commented 7 years ago

Yes, after giving it more thought, I think it's option 1. Though it's certainly not the best way to do the things, it is the way the kernel do the things now and we either need to propose completely new way or stick with current one. I have removed MODULE_ALIAS from kernel driver code, it won't help much for kernel. But for DKMS driver it is still helpful, to reduce the number of cases when device is intercepted by hid-generic first.

Yes, sure, please include me in the "Signed-off-by:" tag. @ain101 We should include you in the "Signed-off-by:" tag. Please confirm that you agree to do it.

ain101 commented 7 years ago

You can include me in the "Signed-off-by:" tag.

redmcg commented 7 years ago

I stumbled across this document: https://msdn.microsoft.com/en-us/windows/hardware/drivers/hid/plug-and-play-support-and-power-management

And I noticed it said that Windows 8 may issue a RESET and retrieve the report descriptor in parallel.

So I tried changing i2c-hid to do the same and it seems to have fixed the failed to reset device issue. I've pushed the change to my fork (as it requires i2c-hid to be in dkms as well - so I didn't want to put it here).

larixer commented 7 years ago

@redmcg Cool. This fixes "failed to reset device" for me too. Could you split commit in your repo into two. With first commit adding i2c_hid source code and the second commit with your modifications to it?

Quite weird behavior that getting report descriptor is mandatory during reset for this device, if this is really true...

redmcg commented 7 years ago

I've split the commits.

Yeah - it is weird. But I was thinking that they wouldn't have let the device go out like that on Windows so when I saw this Microsoft document I thought maybe that could be the difference.

But I just took another look at Frederik's captures and couldn't see any evidence of the RESET and retrieval of the report descriptor happening in parallel.

It's odd.

redmcg commented 7 years ago

@vlasenko 0001-HID-asus-Add-i2c-touchpad-support.patch.txt

Attached is the proposed patch for submission. I'll send it to the following recipients:

And CC yourself and Frederik.

Let me know if you want me to change/add/remove anything and when/if you're happy for me to proceed.

Hopefully it will make it in to Kernel v4.10 and therefore Ubuntu 17.04.

larixer commented 7 years ago

@redmcg Yes, looks good. Please proceed, Brendan!

redmcg commented 7 years ago

It's done! And been accepted!

Thanks once again Victor and Frederick for your all work on this one.

larixer commented 7 years ago

Congratulations guys! Thank you for all the hard work!

aaamourao commented 7 years ago

@redmcg Do you know which version of the Kernel which contains your i2c-hid patch? I'm wondering if I should use your custom i2c-hid or wait for a fedora's kernel upgrade.

I'm using hid-asus-dkms from @vlasenko with a few modifications on the shell scripts.

redmcg commented 7 years ago

I've not submitted an i2c-hid patch. I've just made a DKMS module which can be found in my fork of this repo.

However - it's only required if you're having problems with the TouchPad hardware reset (usually during boot). If that is the case let's continue the conversation over on issue #24 - which is related to that problem. If I do make a submission to the kernel it'll be good to have numbers related to:

  1. who is affected; and
  2. for whom this modification fixes the problem.
larixer commented 7 years ago

@redmcg I think TouchPad hardware reset affects all the users, but the problem occurs very rarely, when reset fails for 3 times and kernel discards handling device. This has happened to me a couple of times. Also, seems Incomplete input reports somehow connected to device reset. I don't see Incomplete input report errors if I use driver from your fork. But I'm still mainly using driver from this repo and most of the time it's okay. But very very rarely when Incomplete report errors generated TouchPad starts behaving crazy, like generating each kind of event, tap, double tap, triple tap, despite me tapping at all, just moving the cursor, this continues for some amount of time like 1 minute or 2 minutes and then everything restores to normal and TouchPad continues to work as usual.

I am not sure we should provide sources of i2c-hid driver in this repo, I'm afraid this can produce another wave of support-kind issues, because people use different Linux kernel versions, I'm not sure patched i2c-hid will work fine with all of them. But if you feel we should try this - let's do it.

redmcg commented 7 years ago

I think it is worth adding to this repo if we plan to make it a submission to the kernel (mainly because this repo gets more traffic than mine - thus it'll be better tested). And I would probably change the code so it acts as a quirk (rather than affect every device using the i2c-hid module).

And if it is affecting all users (and it can lead to the device not working/acting strangely) then maybe it is worth submitting to the kernel.

larixer commented 7 years ago

@redmcg Yes, sounds good! Thank you Brendan!