linux-surface / intel-precise-touch

Linux kernel driver for Intel Precise Touch & Stylus
GNU General Public License v2.0
45 stars 9 forks source link

IPTS on Surface Gen7 #4

Open StollD opened 3 years ago

StollD commented 3 years ago

On the surface gen7 devices, IPTS has undergone several changes:

The binary firmware that was used for previous models does not exist anymore. Due to the changes in the GuC firmware it seems like Intel / Microsoft are now forwarding the data to userspace(?) using HID reports, and parsing it using a daemon (or another driver?). The HID descriptor is also not loaded from a binary anymore and needs to be dumped in Windows to analyze it.

The SET_MODE command will not accept any parameter except for IPTS_SENSOR_MODE_MULTITOUCH. Other parameters will result in the status IPTS_ME_STATUS_INVALID_PARAMS being returned.

Even though we can only initialize the device in "multitouch" mode, once the initialization was finished, the only that that is being sent are HID reports: the ones for singletouch, as well as a new report (with report ID 07). Other than that the device behaves exactly expected for multitouch mode (doorbell is incremented instead of signaling new data via. MEI responses).

This issue will be for tracking the progress on full IPTS support for gen7 surface devices.

We suspect that the new IPTS needs HID feature reports to enable proper multitouch mode. The old driver from intel has some hints towards sending HID reports using a special type of feedback, but some quick tests didn't appear to be working. It also has no proper documentation so it is mostly just guessing how the different parts behave.

Another thing that has no been tested is simply not sending the SET_MODE event and directly sending SET_MEM_WINDOW. Maybe there was a change so that multitouch mode is the default, and sending anything will change it to singletouch mode? (i.e redefining SET_MODE as sth. like SET_SINGLETOUCH_MODE)

StollD commented 3 years ago

Example for the new 07 reports:

ipts 0000:00:16.4-3e8d0870-271a-4208-8eb5-9acb9402ae04: Buffer 2
07 b6 46 16 00 00 00 00 00 00 0b 00 00 00 00 ff 00 74 00 04 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ipts 0000:00:16.4-3e8d0870-271a-4208-8eb5-9acb9402ae04: Buffer 3
07 4c 5f 16 00 00 00 00 00 00 0b 00 00 00 00 ff 00 74 00 04 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ipts 0000:00:16.4-3e8d0870-271a-4208-8eb5-9acb9402ae04: Buffer 4
07 e2 77 16 00 00 00 00 00 00 0b 00 00 00 00 ff 00 74 00 04 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
StollD commented 3 years ago

Patch for not sending SET_MODE (untested):

From af13ea8a753cbe3f4e366a8a927c5f1892360843 Mon Sep 17 00:00:00 2001
From: Dorian Stoll <dorian.stoll@tmsp.io>
Date: Wed, 3 Jun 2020 15:41:15 +0200
Subject: [PATCH] no set-mode

---
 receiver.c | 25 ++-----------------------
 1 file changed, 2 insertions(+), 23 deletions(-)

diff --git a/receiver.c b/receiver.c
index ab28399..9884750 100644
--- a/receiver.c
+++ b/receiver.c
@@ -51,7 +51,8 @@ static void ipts_receiver_handle_get_device_info(struct ipts_context *ipts,
 static void ipts_receiver_handle_clear_mem_window(struct ipts_context *ipts,
        struct ipts_response *msg, int *cmd_status, int *ret)
 {
-   struct ipts_set_mode_cmd sensor_mode_cmd;
+   int i;
+   struct ipts_set_mem_window_cmd cmd;

    if (msg->status != IPTS_ME_STATUS_TIMEOUT &&
            msg->status != IPTS_ME_STATUS_SUCCESS) {
@@ -68,25 +69,6 @@ static void ipts_receiver_handle_clear_mem_window(struct ipts_context *ipts,

    ipts->status = IPTS_HOST_STATUS_RESOURCE_READY;

-   memset(&sensor_mode_cmd, 0, sizeof(struct ipts_set_mode_cmd));
-   sensor_mode_cmd.sensor_mode = ipts->mode;
-
-   *cmd_status = ipts_control_send(ipts, IPTS_CMD(SET_MODE),
-           &sensor_mode_cmd, sizeof(struct ipts_set_mode_cmd));
-}
-
-static void ipts_receiver_handle_set_mode(struct ipts_context *ipts,
-       struct ipts_response *msg, int *cmd_status)
-{
-   int i;
-   struct ipts_set_mem_window_cmd cmd;
-
-   if (msg->status != IPTS_ME_STATUS_SUCCESS) {
-       dev_err(ipts->dev, "0x%08x failed - status = %d\n",
-               msg->code, msg->status);
-       return;
-   }
-
    memset(&cmd, 0, sizeof(struct ipts_set_mem_window_cmd));

    for (i = 0; i < 16; i++) {
@@ -206,9 +188,6 @@ static int ipts_receiver_handle_response(struct ipts_context *ipts,
        ipts_receiver_handle_clear_mem_window(ipts, msg,
                &cmd_status, &ret);
        break;
-   case IPTS_RSP(SET_MODE):
-       ipts_receiver_handle_set_mode(ipts, msg, &cmd_status);
-       break;
    case IPTS_RSP(SET_MEM_WINDOW):
        ipts_receiver_handle_set_mem_window(ipts, msg, &cmd_status);
        break;
-- 
2.26.2
StollD commented 3 years ago

My earlier experiments with sending HID GET_FEATURE reports using host2me feedback:

https://gist.github.com/StollD/89f1c8d094c41e6ab5e3bbbe0dd0335b

Raw HID requests using host2me (aka. hid2me) feedback from the old Intel driver:

https://github.com/ipts-linux-org/ipts-linux-new/blob/master/drivers/misc/ipts/ipts-hid.c#L221

xanf commented 3 years ago

Unfortunately it does not work without SET_MODE on Surface Book 3:

[  171.925322] ipts 0000:00:16.4-3e8d0870-271a-4208-8eb5-9acb9402ae04: Probing IPTS
[  171.925324] ipts 0000:00:16.4-3e8d0870-271a-4208-8eb5-9acb9402ae04: IPTS using DMA_BIT_MASK(64)
[  171.928265] ipts 0000:00:16.4-3e8d0870-271a-4208-8eb5-9acb9402ae04: Device 045E:09B1 found
[  171.930352] ipts 0000:00:16.4-3e8d0870-271a-4208-8eb5-9acb9402ae04: 0x80000003 failed: 1
xanf commented 3 years ago

@StollD I've set up qemu windows 10 and forwarded IPTS device to windows 10

I can see data flowing when I touch screen, vfio_region_read / vfio_region_write calls. Could you probably give me a hint what should I look. raw HID protocol in this writes?

StollD commented 3 years ago

Nice!

We want to figure out the initialization sequence that the new IPTS uses, so we need to look for that. It will be encapsulated by an unknown amount of wrapper data, because IPTS doesn't talk to the PCI device that you stubbed directly, it talks to the virtual MEI bus. There is a IPTS specific UUID, so we might be able to look for that.

Can you upload an example of the calls that are logged? Never messed with this myself, would be useful to know how that looks like.

xanf commented 3 years ago

Sure. This one is unprocessed, it should contain both initialization and several pen/finger touches

trace.gz

StollD commented 3 years ago

Alright, there are some interesting things.


https://gist.github.com/StollD/e2fa122521bef3241782ef827285d998#file-trace-001-L3049-L3053

This is the command that sets the mode of the IPTS hardware. The parameters are all zeros, which, on the old IPTS, meant singletouch mode. If the first zero would be a one, it would be multitouch mode.

If windows initializes the hardware in singletouch mode and gets multitouch events, and we initialize it in multitouch mode and get singletouch, it might be that Intel swapped the meaning of these two values. However, the device otherwise seems to behave exactly like the old singletouch mode: It doesn't signal a new report by increasing the number in the doorbell buffer but uses the READY_FOR_DATA event.

For example:

One issue with that theory though is that we already tested it: When the SL3 was just released, @archseer and I tried exactly that (swapping the meaning of single- and multitouch). The hardware returned an invalid-params error. Though it is possible that that was changed again in the SB3, or in a firmware revision. I also don't know what @archseer exactly changed back then, so it probably won't hurt to try it again.


https://gist.github.com/StollD/e2fa122521bef3241782ef827285d998#file-trace-001-L2874-L2878

IPTS sends a 0x9 command as part of the initialization sequence. According to the old driver from intel, 0x9 means SET_POLICIES. It's parameters doesn't seem to have anything to do with singletouch vs. multitouch, but there are some reserved fields, so maybe it was expanded to cover this.


https://gist.github.com/StollD/e2fa122521bef3241782ef827285d998#file-trace-001-L3001-L3007

There is a new 0xf command that is not documented in the old intel driver. Could be related, although I doubt it. It seems to carry a memory address as a parameter.


Feedback sending is...weird. It only seems to be sent for buffer 0xA and 0x2, and there are two instances of "special" feedback (with buffer ID 0x10).

That special feedback could be used to send HID feature reports to the hardware, but there is no way for us to know, because these are stored in a seperate buffer and don't go over the PCI bus. But we also don't know if that special feedback is new, because we have no trace from a gen4-6 device to compare against.

xanf commented 3 years ago

@StollD I can get the same trace from gen6 (I believe) device (surface book 2) if it helps

Also if it will help in research I could record session with "pen only", "multitouch" and "touch"

StollD commented 3 years ago

@StollD I can get the same trace from gen6 (I believe) device (surface book 2) if it helps

That would be great actually! I was planning to research how to setup the passthrough myself, but if you already know how it works, it will probably be faster.

Also if it will help in research I could record session with "pen only", "multitouch" and "touch"

Not really. What I call multitouch covers all three of these. It is simply how we (and intel) call the mode that IPTS runs in. And windows doesn't allow changing it AFAIK.


Regarding my multitouch vs. singletouch theory: It should actually be enough if you loaded the master branch of this repo with the singletouch option set (which is also compiled into our builds already).

$ sudo modprobe -r ipts
$ sudo modprobe ipts singletouch debug

Then simply try to use the pen (or touch the screen), and watch dmesg.

You actually need to use the master version though, the UAPI version for iptsd won't work, because it doesn't support changing the mode.

xanf commented 3 years ago

@StollD unfortunately no reaction on pen / multitouch with singletouch option (i see logs on single touches so it was loaded correctly)

StollD commented 3 years ago

Out of curiosity could you post the dmesg? Maybe I can find some clue in there.

If simply changing the parameter for SET_MODE doesn't work, I think the next step would be to look at that SET_POLICY command a bit more. I can write a patch so that the driver will send exactly the command you captured during initialization.

xanf commented 3 years ago

Sure! dmesg.log

I've also tried to record another very short session trace on qemu, touching device only with pen, that 0x10 are still there, exactly 2 :( trace.log.gz

I will be able to get logs from Surface Book 2 somewhere this week

xanf commented 3 years ago

Hmm. I'm actually very confused of size for trace.log.gz - it was a very short session of qemu (less than a minute). Why is it so big :/

StollD commented 3 years ago

I've also tried to record another very short session trace on qemu, touching device only with pen, that 0x10 are still there, exactly 2 :(

Well, its part of the device initialization. That doesn't care about what tool you want to use on the device. ;) The actual device data is passed using memory buffers, and not through the PCI interface, so you won't see a difference if you change the tool (stylus / finger).

Hmm. I'm actually very confused of size for trace.log.gz - it was a very short session of qemu (less than a minute). Why is it so big :/

The IPTS interface is just one function of the PCI device you are tracing. It could provide other interfaces that are used by windows that we don't really know about.

But I am confused about a thing too: Why is the linux driver logging singletouch events, if you added the singletouch option? Changing the mode should either produce an error (like when @archseer tried), or change the behaviour. IPTS shouldn't just ignore it.

xanf commented 3 years ago

Double checked again, master branch built, installed via dkms, specified in modprobe.d/ipts.conf

options ipts singletouch=1 debug=1

I see logs, I see reaction to single touches, no pen, no multitouch

StollD commented 3 years ago

Ok. Maybe the parameter has actually lost all its meaning.

Btw, I guess I know why the feedback that was sent looks weird to me. On linux we often had issues, especially on older devices, where sending feedback too often would cause the device to crash. Maybe on windows they only send feedback for every eight input (every time you read from buffer 2 or 10) to mitigate these issues and prevent bad firmware from crashing the device.

That would actually be a pretty smart thing to do, so we might want to replicate that behaviour under linux, for all devices.

StollD commented 3 years ago

I wrote a patch that sends the same SET_POLICIES command that is being sent in your trace. It seems to work fine on SB2. I would be curious what would happen if you apply this to the master branch, and load the module with the debug parameter set?

diff --git a/control.c b/control.c
index 9179eca..1bc00a0 100644
--- a/control.c
+++ b/control.c
@@ -59,6 +59,7 @@ int ipts_control_send_feedback(struct ipts_context *ipts,

 int ipts_control_start(struct ipts_context *ipts)
 {
+   u8 data[16];
    ipts->status = IPTS_HOST_STATUS_INIT;

    if (ipts_params.singletouch)
@@ -66,7 +67,12 @@ int ipts_control_start(struct ipts_context *ipts)
    else
        ipts->mode = IPTS_SENSOR_MODE_MULTITOUCH;

-   return ipts_control_send(ipts, IPTS_CMD(NOTIFY_DEV_READY), NULL, 0);
+   memset(data, 0, 16);
+   data[0] = 0x12;
+   data[2] = 0x64;
+   data[4] = 0x1E;
+
+   return ipts_control_send(ipts, IPTS_CMD(SET_POLICIES), data, 16);
 }

 void ipts_control_stop(struct ipts_context *ipts)
diff --git a/protocol/events.h b/protocol/events.h
index f8b771f..b50d399 100644
--- a/protocol/events.h
+++ b/protocol/events.h
@@ -24,6 +24,7 @@ enum ipts_evt_code {
    IPTS_EVT_FEEDBACK,
    IPTS_EVT_CLEAR_MEM_WINDOW,
    IPTS_EVT_NOTIFY_DEV_READY,
+   IPTS_EVT_SET_POLICIES,
 };

 #endif /* _IPTS_PROTOCOL_EVENTS_H_ */
diff --git a/receiver.c b/receiver.c
index ab28399..c0f60c2 100644
--- a/receiver.c
+++ b/receiver.c
@@ -10,6 +10,20 @@
 #include "protocol/responses.h"
 #include "resources.h"

+static void ipts_receiver_handle_set_policies(struct ipts_context *ipts,
+       struct ipts_response *msg, int *cmd_status)
+{
+   if (msg->status != IPTS_ME_STATUS_SENSOR_FAIL_NONFATAL &&
+           msg->status != IPTS_ME_STATUS_SUCCESS) {
+       dev_err(ipts->dev, "0x%08x failed - status = %d\n",
+               msg->code, msg->status);
+       return;
+   }
+
+   *cmd_status = ipts_control_send(ipts,
+           IPTS_CMD(NOTIFY_DEV_READY), NULL, 0);
+}
+
 static void ipts_receiver_handle_notify_dev_ready(struct ipts_context *ipts,
        struct ipts_response *msg, int *cmd_status)
 {
@@ -196,6 +210,9 @@ static int ipts_receiver_handle_response(struct ipts_context *ipts,
    int ret = 0;

    switch (msg->code) {
+   case IPTS_RSP(SET_POLICIES):
+       ipts_receiver_handle_set_policies(ipts, msg, &cmd_status);
+       break;
    case IPTS_RSP(NOTIFY_DEV_READY):
        ipts_receiver_handle_notify_dev_ready(ipts, msg, &cmd_status);
        break;
xanf commented 3 years ago

No effect unfortunately. Still modprobes, still reports single touches only :cry: (with or without singletouch option)

StollD commented 3 years ago

Damm. I had really hoped that would work. The other new command (the 0xf one) seems to contain a memory address, so sending that ourselves might not be that simple.

It's either that, or changing the modes really happens through special feedback, which will be pretty hard for us to replicate.

xanf commented 3 years ago

Sad. I'm trying to get myself familiar with dtrace for windows introduced by Microsoft not so long ago - it works extremely low level so theoretically it should be possible to trace special feedback function calls in windows

xanf commented 3 years ago

Additionally, I've put iaPreciseTouch.sys in my IDAPro64 and confused - I was expecting to see number of HID functons, so theoretically it will be possible to debug them with kernel debugger, but the only imported id function is HidNotifyPresence :/ I see IOCTL_HID_WRITE_REPORT invocation, so it definitely uses that one

xanf commented 3 years ago

Sorry for "blogging". It seems that initial assumption about additional DLL for processing is correct. Surface is installing TouchPenProcessor.dll which is after quick look in disassembly imports non-typical for driver math functions (like log, pow, etc.) and it is named like "HEAT-compliant touchscreen" and has functions like CreateHeatProcessor

Unfortunately this DLL also includes just HidP_GetCaps and HidP_GetValueCaps calls

xanf commented 3 years ago

Some new dumps from surface book 2

The first one is pretty funny. I've just copied qemu image with my surface book 3 driver to surface book 2. It initialized, but reported only single touches. Also I've noticed that same updates arrive when we're in (for example) boot sequence, so it seems like it's just a default mode for touch screen trace-sb2-with-sb3-driver.gz

Second one is with sb2 driver trace-clean.gz

StollD commented 3 years ago

Thank you!

Your trace-clean log confuses me a bit. Were you able to use the pen with that? Because the commands in the log are the ones that are used for singletouch.

StollD commented 3 years ago

I have played with it a bit myself, and I don't think it will be possible to enable multitouch mode on gen4-6 within the VM, at least not with a custom build of QEMU.

Multitouch mode on the old IPTS depends on binary firmwares that are run on the GPU. The firmware to use is selected by querying the name of the \_SB.TSML ACPI device, which does not exist on the VM. We would need to hack it in there by modifying qemu.

But even then it will probably not work, because we don't pass through the GPU, so the driver won't be able to process touch data there.

Both, the TSML ACPI device, and the touch processing on the GPU are not used on gen7 though. So it should be less of an issue there.

Just to be sure, @xanf when you tried it on gen7, was the hardware actually in multitouch mode? If it is in multitouch mode, either the pen works, or nothing works (because no touch processing driver is installed). If it detects single touches, and nothing else, it is in singletouch mode, which would match the results from our earlier experiments on linux.

StollD commented 3 years ago

Also, I would be curious about how your VM was set up. I just made a generic windows install, and tried to install the surface driver package. But that refused to run, because it didn't detect an actual SB2. I resorted to unpacking the installer and installing the drivers by hand, but there are so many, and we have no idea what is really needed. Also, at least on SB2, the installer seems to do some more setup in the registry, so we loose that too.

I am wondering if we might get better results if we create a disk image of an actual windows install, where all these drivers are installed properly, and use that for the VM. I've seen that microsoft has recovery images for the surfaces, maybe that can also be of use.

xanf commented 3 years ago

On surface book 3 system was reacting with reads / writes in log to pen, I've checked this

I've also resorted to manual installation of unpacked driver, for surface book 3 it seems two devices are sufficient - intel precise touch driver (and installing this alone enables touchscreen to "receive" pen events and something ( i don't remember exact name, does not have sb3 right now with me) called surface touch processor - it was exactly the device which had non-typical log, sin, cos functions in disassembly.

I've figured required driver by disabling them one by one in windows and checking when touch stops working

qzed commented 3 years ago

I think on the SB3, access to the GPU shouldn't be required as it looks like touch processing happens in the TouchPenProcessor.dll (the one with the log/sin/cos functions). On the SB2, the iGPU is required but I think (with some amount of work) it should be possible to pass it through via IOMMU, at least on the units with a dGPU. Probably requires you to run either no desktop at all or the desktop via the dGPU on the host (not sure if it's even possible without a dGPU, this seems to indicate it is).

Regarding ACPI, its possible to add/load additional ACPI tables in qemu (should be -acpitable parameter), so you don't have to modify qemu directly. However, you might not be able to directly copy the ACPI table and have to just write the relevant parts of it yourself instead (in case of dependencies etc.).

StollD commented 3 years ago

On surface book 3 system was reacting with reads / writes in log to pen, I've checked this

@xanf Sorry if I sound like a broken record. But did the pen work (inputs getting registered inside the VM)? Alternatively, did nothing work, not even single touches?

Basically, I want to figure out if the driver in the VM runs actually runs in multitouch mode, or in singletouch mode. Because if it is in singletouch mode, it wouldn't be suprising that no command that we find in the trace logs helps with getting into multitouch mode.

Sadly just looking at the trace doesn't really help in figuring this out. The IPTS commands are just a very small part of the trace log. And even then, they could be anything. You need to check how the driver behaves. And using the pen is a good test for that, because these events will only arrive in multitouch mode. So if the pen works, and the VM picks up the inputs, its in multitouch mode.

Same for "no input at all": This would mean that the driver sets the device to multitouch mode, but then some other part of the data processing stack for multitouch is missing, so that the inputs cannot be processed. Since we only care about the PCI trace, this would be good enough too.

But if only single touches are registered, the device is in singletouch mode, and the trace logs won't help us with getting the device into multitouch mode. In that case we would have to install additional drivers, until something changes.

xanf commented 3 years ago

@StollD sorry for long delay Yes, I can confirm that when touching with just pen I have logs in trace trace.zip

StollD commented 3 years ago

Yes, I can confirm that when touching with just pen I have logs in trace

Can you draw with the pen in the VM? i.e. open OneNote and draw some random stuff. Or even just try to open start menu with it. (Ignore the trace completely please! I only need to know how the VM reacts)

Sorry again, I know I am annoying. Maybe I am just dumb and misunderstand you. But I really want to figure out what exactly I am looking at. Because what I see in the traces looks like singletouch mode to me. And I really want to make sure that is not the case, because otherwise we are just wasting our time (because singletouch logs will contain nothing that helps us getting multitouch mode to work).

Also, what events are you tracing in qemu? Maybe there are some missing events we could add to see what gets written into the feedback buffers.

xanf commented 3 years ago

@StollD no need to excuse :) I'm pretty lame at such kind of things :)

Well, at least now nothing is working. I can see that "trace" lines jumping even when I move my pen in proximity of screen, but I can't make any touch reported to windows (which is weird :/) - neither single touch or multitouch. Devices in device manager are reported as green so very confusing

StollD commented 3 years ago

Hmm... To me that seems like some other driver might be missing, either because it is not installed, or because it isn't probed (i.e. it is attached to some ACPI device that is not present in the VM). We know that there are other drivers involved for multitouch processing. If these are not present, the driver should fall back to singletouch mode (because, well, its a fallback mode).

Since the driver seems to be sending feedback according to your trace log, maybe it also needs some of these drivers for singletouch processing? Or it initialized into a broken multitouch mode, ignoring the available fallback. But I really hope that it went into singletouch mode. Because if these traces show multitouch mode, our only chance would be figuring out whats being sent in the feedback buffers. And I dont know if qemu can trace that. Maybe if we spam the iommu module with debug logging...

I am starting to run out of ideas to be honest. What we could try is to install windows on bare metal, install all the driver through the official MSI installer, and then image that install and use it in qemu. That would at least ensure that all required drivers are installed (maybe the recovery images that Microsoft has for surfaces could also help with that). But it's also a ton of work.

xanf commented 3 years ago

@StollD So... I've followed your way... kind of


Disclaimer! NEVER do this to your laptop. No one on this planet could predict consequences of this one to your Windows installation

Since I have windows in dual-boot I just ran windows in qemu with -drive file=/dev/nvme0n1,format=raw,media=disk using my ACTUAL windows installation to boot.

It was able to boot and I still see IPTS device in device tree, it reports "pen and touch support with 10 points", I still see that "reads and writes" in logs, but no pen or touch input at all. All devices report as healthy in device manager :cry:

Disabling IPTS device in windows stop that "reads & writes" but still not even a single touch

xanf commented 3 years ago

Another wild try for me will be to play with approach similar to hacky patch https://github.com/jcs/qemu/commit/7c1f83b6d700fc2733e0964a1a35383e71ecb838 (see https://jcs.org/2018/11/12/vfio) for background - basically when specific read/write happens - do cpu_physical_memory_read to dump... HID report I believe? Of course the question is which memory should we dump (you've mentioned 0xf probably carries a memory address - I'm going to hunt somewhere there)

StollD commented 3 years ago

Booting your physical install in qemu is a pretty good idea. Do you happen to have the trace log for that? I am not really expecting any differences, but I am grasping for straws at this point, lol.

Regarding patching qemu: It would be indeed interesting to see what the driver writes into the feedback buffers (via DMA). Since these addresses are not static, and will change every time you run the driver, I would just dump everything. If its too hard to look up the address in the log afterwards, and then grep for it, we can always try something else.

StollD commented 3 years ago

I also had another idea: What if the hardware is trying to outsmart the driver here?

On older gens, IPTS uses GuC submission for touch processing. It bridges the ME and the GuC with a special buffer, the doorbell. That buffer simply contains a 32bit integer that gets incremented every time a touch buffer is filled. Because its shared between ME and GuC, the driver doesn't have to act as a bridge, its all done in firmware.

Now, on gen7, the GuC processing was removed, and touch processing now happens in userspace. But it is still possible to assign a doorbell buffer through the IPTS command interface. And on linux we do assign it, simply because there is no special code for gen7.

But what if that tells the hardware that the driver is not able to handle it properly? Gen7 has no chance of doing GuC submission, because the firmware interface doesn't exist anymore. The firmware might do something like this:

Driver requests singletouch mode: Everything is fine, give them singletouch mode.

Driver requests multitouch mode, without sending a doorbell: Okay, the driver is updated, give them multitouch data, but with the control flow for singletouch (which is command / response based and doesnt use the doorbell).

Driver requests multitouch mode, and sends a doorbell buffer: Oh damm, they must be running an older driver that tries to setup GuC submission. This can't work, better give them singletouch data, because one finger is better than no touch at all.

@xanf I wrote a pretty basic patch that stops the driver from setting the doorbell, and adds the bare minimum thats required for the other control flow. Could you give this a try? You need the latest release / master branch of both, the kernel module and iptsd.

diff --git a/receiver.c b/receiver.c
index 3660a1d..90a12ae 100644
--- a/receiver.c
+++ b/receiver.c
@@ -56,11 +56,11 @@ static int ipts_receiver_handle_set_mode(struct ipts_context *ipts)
            upper_32_bits(ipts->feedback[i].dma_address);
    }

-   cmd.workqueue_addr_lower = lower_32_bits(ipts->workqueue.dma_address);
-   cmd.workqueue_addr_upper = upper_32_bits(ipts->workqueue.dma_address);
+   cmd.workqueue_addr_lower = 0;
+   cmd.workqueue_addr_upper = 0;

-   cmd.doorbell_addr_lower = lower_32_bits(ipts->doorbell.dma_address);
-   cmd.doorbell_addr_upper = upper_32_bits(ipts->doorbell.dma_address);
+   cmd.doorbell_addr_lower = 0;
+   cmd.doorbell_addr_upper = 0;

    cmd.host2me_addr_lower = lower_32_bits(ipts->host2me.dma_address);
    cmd.host2me_addr_upper = upper_32_bits(ipts->host2me.dma_address);
@@ -82,6 +82,19 @@ static int ipts_receiver_handle_set_mem_window(struct ipts_context *ipts)
    return ipts_control_send(ipts, IPTS_CMD_READY_FOR_DATA, NULL, 0);
 }

+static int ipts_receiver_handle_ready_for_data(struct ipts_context *ipts)
+{
+   u32 *doorbell = (u32 *)ipts->doorbell.address;
+
+   *doorbell = *doorbell + 1;
+   return 0;
+}
+
+static int ipts_receiver_handle_feedback(struct ipts_context *ipts)
+{
+   return ipts_control_send(ipts, IPTS_CMD_READY_FOR_DATA, NULL, 0);
+}
+
 static int ipts_receiver_handle_clear_mem_window(struct ipts_context *ipts)
 {
    if (ipts->restart)
@@ -146,6 +159,12 @@ static void ipts_receiver_handle_response(struct ipts_context *ipts,
    case IPTS_RSP_SET_MEM_WINDOW:
        ret = ipts_receiver_handle_set_mem_window(ipts);
        break;
+   case IPTS_RSP_READY_FOR_DATA:
+       ret = ipts_receiver_handle_ready_for_data(ipts);
+       break;
+   case IPTS_RSP_FEEDBACK:
+       ret = ipts_receiver_handle_feedback(ipts);
+       break;
    case IPTS_RSP_CLEAR_MEM_WINDOW:
        ret = ipts_receiver_handle_clear_mem_window(ipts);
        break;
danielzgtg commented 3 years ago

I'm not xanf, but I tried your patch on 0a4a44c and https://github.com/linux-surface/iptsd/commit/61828c0198d7da799e7eb1a8e7319ad4d9423cca out-of-tree with 5.9.1-surface and it didn't work.

I did the following and touch stopped working:

make all
service iptsd stop
rmmod ipts
insmod ipts.ko
service iptsd start

Touch started working again after reloading the original driver:

service iptsd stop
rmmod ipts
modprobe ipts
service iptsd start
`dmesg` ``` [18233.943931] ipts 0000:00:16.4-3e8d0870-271a-4208-8eb5-9acb9402ae04: Stopping IPTS [18238.844280] ipts: loading out-of-tree module taints kernel. [18238.844383] ipts: module verification failed: signature and/or required key missing - tainting kernel [18238.858080] ipts 0000:00:16.4-3e8d0870-271a-4208-8eb5-9acb9402ae04: Starting IPTS [18238.861869] ipts 0000:00:16.4-3e8d0870-271a-4208-8eb5-9acb9402ae04: Command 0x80000003 failed: 1 [18265.184030] ipts 0000:00:16.4-3e8d0870-271a-4208-8eb5-9acb9402ae04: Stopping IPTS [18295.319906] ipts 0000:00:16.4-3e8d0870-271a-4208-8eb5-9acb9402ae04: Starting IPTS [18295.324595] ipts 0000:00:16.4-3e8d0870-271a-4208-8eb5-9acb9402ae04: Device 045E:099F ready [18300.210372] input: IPTS Touch as /devices/virtual/input/input26 [18300.210450] input: IPTS Stylus as /devices/virtual/input/input27 ```
`sudo ./iptsd` ``` no such device main.ioctl /home/home/Downloads/iptsd/ioctl.go:13 main.(*IptsControl).SendFeedbackFile /home/home/Downloads/iptsd/control.go:156 main.(*IptsControl).Flush /home/home/Downloads/iptsd/control.go:181 main.(*IptsControl).Start /home/home/Downloads/iptsd/control.go:57 main.main /home/home/Downloads/iptsd/main.go:46 runtime.main /usr/lib/go-1.13/src/runtime/proc.go:203 runtime.goexit /usr/lib/go-1.13/src/runtime/asm_amd64.s:1357 panic: runtime error: invalid memory address or nil pointer dereference [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x4ed499] goroutine 1 [running]: main.(*IptsDevices).Destroy(0xc00008c2d0, 0xc0000d2000, 0x0) /home/home/Downloads/iptsd/devices.go:191 +0x29 main.Shutdown(0xc00008c2a0) /home/home/Downloads/iptsd/main.go:22 +0x2f main.HandleError(0xc00008c2a0, 0x55ebe0, 0xc0000d0020) /home/home/Downloads/iptsd/main.go:29 +0xa4 main.main() /home/home/Downloads/iptsd/main.go:48 +0x7c1 ```

Edit: Interestingly, I get the following after building the module without the patch, then loading and unloading it (touch works in this case):

[18766.779214] ipts 0000:00:16.4-3e8d0870-271a-4208-8eb5-9acb9402ae04: Starting IPTS
[18766.784082] ipts 0000:00:16.4-3e8d0870-271a-4208-8eb5-9acb9402ae04: Device 045E:099F ready
[18782.558115] input: IPTS Touch as /devices/virtual/input/input28
[18782.558244] input: IPTS Stylus as /devices/virtual/input/input29
[18796.622848] ipts 0000:00:16.4-3e8d0870-271a-4208-8eb5-9acb9402ae04: Stopping IPTS
[18796.874269] ipts 0000:00:16.4-3e8d0870-271a-4208-8eb5-9acb9402ae04: Command 0x80000007 failed: 14
StollD commented 3 years ago

Ok, so the command that sets the doorbell returns an invalid params error when the doorbell is missing. Back to the drawing board it is then.

The other error seems to be a timeout error from the command that shuts down the IPTS hardware. That seems weird, but not too worrying (unless it happens always, then we should look into it at some point).

danielzgtg commented 3 years ago

It was able to boot and I still see IPTS device in device tree, it reports "pen and touch support with 10 points", I still see that "reads and writes" in logs, but no pen or touch input at all. All devices report as healthy in device manager cry

Even Intel ME? I would have thought that it would be difficult to virtualize properly, and it is critical for IPTS

I am starting to run out of ideas to be honest

I am thinking of pretending to be Windows in other ways. I know about and tried adding acpi_osi="Windows 2020" to the kernel cmdline but that didn't have any effect

qzed commented 3 years ago

I know about and tried adding acpi_osi="Windows 2020" to the kernel cmdline but that didn't have any effect

That should only influence ACPI behavior. Also I'd think MS is (hopefully) over those dirty tricks by now.

StollD commented 3 years ago

Another month, another idea!

I finally bit into the sour apple and spent a bit of time with the windows drivers for SP7 in Ghidra. And I managed to find the function that prepares and sends the SET_MODE command. And inside I found the structure that gets sent as the payload for the SET_MODE command.

I also decompiled the same function from the SB2 driver for a reference. Both looked absolutely identical. But then I noticed something...

The payload struct is reversed??

Like, this is how the struct gets decompiled (names added by me):

undefined4 reserved3;
undefined4 reserved2;
undefined4 reserved1;
undefined4 mode;
undefined4 command;

Where the driver for SB2 does a write to mode, the driver from SP7 writes to reserved3. Where the SB2 one writes to reserved1 the other writes to reserved2. And so on. I decompiled two other drivers (from SP4 and SL3) to be absolutely sure, and they looked the same.

So, theory: Someone at intel thought it would be a good idea to do this:

diff --git a/protocol.h b/protocol.h
index 2e179cb..be2409d 100644
--- a/protocol.h
+++ b/protocol.h
@@ -210,8 +210,8 @@
  * requires a different control flow that is not implemented.
  */
 struct ipts_set_mode_cmd {
-   u32 mode;
    u8 reserved[12];
+   u32 mode;
 } __packed;

 /**

@xanf, @danielzgtg and really anyone who is curious: Could you try to apply the patch above to the current master branch and see what happens? If the driver loads without errors thats probably a good step in the right direction already.

buchwasa commented 3 years ago

I applied your patch and it loads without errors.

StollD commented 3 years ago

@buchwasa Thanks for trying! That it loads without errors seems promising.

I pushed the code for a debugging tool to the iptsd repository. Could you clone the repo, build it, and then run the tool?

$ git clone https://github.com/linux-surface/iptsd
$ cd iptsd
$ meson build
$ ninja -C build
$ sudo ./build/ipts-dbg > ipts.log

After starting the app please touch the screen a few times (and / or touch it with the stylus if you have one), kill the tool and upload the ipts.log file that was created. That way we can check if we receive any data from ipts, and if yes what that data looks like.

buchwasa commented 3 years ago

Sure thing, there is no other data in the log other than the following:

Vendor:       045E
Product:      099F
Version:      19088656
Buffer Size:  16384
Max Contacts: 10
StollD commented 3 years ago

Ok, interesting. So we are not actually getting any data from the hardware. Can you apply this patch to the module and try it again? And then post your dmesg instead of the ipts.log

diff --git a/receiver.c b/receiver.c
index 3660a1d..f696175 100644
--- a/receiver.c
+++ b/receiver.c
@@ -133,6 +133,8 @@ static void ipts_receiver_handle_response(struct ipts_context *ipts,
 {
    int ret;

+   dev_info(ipts->dev, "RSP: %d\n", rsp->code);
+
    if (ipts_receiver_handle_error(ipts, rsp))
        return;
buchwasa commented 3 years ago

Here ya go:

[    4.821949] ipts 0000:00:16.4-3e8d0870-271a-4208-8eb5-9acb9402ae04: Starting IPTS
[    4.877308] ipts 0000:00:16.4-3e8d0870-271a-4208-8eb5-9acb9402ae04: RSP: -2147483647
[    4.877711] ipts 0000:00:16.4-3e8d0870-271a-4208-8eb5-9acb9402ae04: RSP: -2147483646
[    4.878561] ipts 0000:00:16.4-3e8d0870-271a-4208-8eb5-9acb9402ae04: RSP: -2147483645
[    4.878563] ipts 0000:00:16.4-3e8d0870-271a-4208-8eb5-9acb9402ae04: Device 045E:099F ready
[    5.390749] ipts 0000:00:16.4-3e8d0870-271a-4208-8eb5-9acb9402ae04: RSP: -2147483643
StollD commented 3 years ago

So that last message shows that the READY_FOR_DATA command returned (I messed up the formatting, should have been 80000005), which indicates that a command is used to notify about new data, instead of incrementing the doorbell. Luckily we can relatively easily translate between both, the event and the doorbell.

diff --git a/receiver.c b/receiver.c
index 3660a1d..58e5bf8 100644
--- a/receiver.c
+++ b/receiver.c
@@ -128,6 +128,14 @@ static bool ipts_receiver_handle_error(struct ipts_context *ipts,
    return true;
 }

+static int ipts_receiver_handle_ready_for_data(struct ipts_context *ipts)
+{
+   u32 *doorbell = (u32 *)ipts->doorbell.address;
+   *doorbell = *doorbell + 1;
+
+   return ipts_control_send(ipts, IPTS_CMD_READY_FOR_DATA, NULL, 0);
+}
+
 static void ipts_receiver_handle_response(struct ipts_context *ipts,
        struct ipts_response *rsp)
 {
@@ -146,6 +154,9 @@ static void ipts_receiver_handle_response(struct ipts_context *ipts,
    case IPTS_RSP_SET_MEM_WINDOW:
        ret = ipts_receiver_handle_set_mem_window(ipts);
        break;
+   case IPTS_RSP_READY_FOR_DATA:
+       ret = ipts_receiver_handle_ready_for_data(ipts);
+       break;
    case IPTS_RSP_CLEAR_MEM_WINDOW:
        ret = ipts_receiver_handle_clear_mem_window(ipts);
        break;

The patch is a bit crude, but it should work fine for taking a look at the actual data. Could you try that and see if you get any outputs from ipts-dbg? I also pushed a smaller fix for the debugging tool to the iptsd repository, would be great if you could pull that.

buchwasa commented 3 years ago

Here ya go, there's some data in here now. ipts.log