linux-surface / intel-precise-touch

Linux kernel driver for Intel Precise Touch & Stylus
GNU General Public License v2.0

Reverse-Engineering Gen7+ Pen Data Format #14

Open qzed opened 3 years ago

qzed commented 3 years ago

This is a technical discussion for the gen7+ pen data format. Please don't post any issues/problems in this thread and keep this a discussion about technical aspects/the protocol.

Some information from other threads:

More info about the protocol can also be found in https://github.com/linux-surface/intel-precise-touch/issues/4.

haymanpf commented 3 years ago

I'll just contribute the little detective work I've been able to do in my spare time playing with my SP7 and Renaisser pen.

I've found three HID report types that consistently show up when the pen is in use, pressed to the screen or just hovering, doesn't matter: 0x0D, 0x1A, and 0x1C. (There are also two others that show up with or without the pen, 0x07 and 0x08, which I think are some sort of noise? They have an ipts_data size of just 64 bytes).

The first and third are very similar: they show up almost the same number of times and have similar ipts_data sizes, 1500 and 2000, respectively. The second report ID shows up about half as often but carries a lot more data, with an ipts_data size of 7488. As an example of frequency, one data set I generated had 330 instances of 0D, 329 instances of 1C, and only 145 instances of 1A. They appear to show up in the order 1C, 0D, 1A (when the last one is there, and assuming I understand the buffer entry correctly).

Two quick guesses: 1) the report types don't seem to line up with any of the IDs posted before, so I'm guessing the other report IDs might be in these data structures somewhere? 2) the 1C and 0D structures are similar in size and always show up next to each other, so maybe these are the RX and TX data from those papers?

I'll attach a couple of binary dumps for each of the report types (data1 is 0D, data2 is 1A and data3 is 1C).

data1-4.bin.txt data1-14.bin.txt data2-4.bin.txt data2-14.bin.txt data3-4.bin.txt data3-14.bin.txt

qzed commented 3 years ago

1) the report types don't seem to line up with any of the IDs posted before, so I'm guessing the other report IDs might be in these data structures somewhere?

That's correct. The HID report IDs are mostly useless: HID reports have fixed size, but the (digitizer) "heatmap" data containing all information can be of variable length. So what they did is simply specify a couple of HID reports all containing HID digitizer heatmap data, each with a different length. That means that the controller sending us the data just chooses whatever report ID best fits the data, i.e. whatever report ID has the next larger size, and pads the rest with zeros.

The IDs linked above are sub-IDs inside the heatmap data. Here's a parser for that data (see usage e.g. here) and here are some partially annotated data dumps of the heatmap data. Note that the naming is a bit different: what I've called "reports" in the parser and the issue are called p-frames in the data dumps. Also, the report IDs are two bytes as far as I can tell, but in the reverse-engineered DLL there's only one byte, so the first byte (usually 0x04) may be some bit-field indicating some status or may be a parent category or something. For example, 0x0425 is what we currently parse as actual touch heatmap data (not to be confused with digitizer heatmap data, which is everything inside the HID report), so 0x04 would maybe be the status/parent thing and 0x25 the actual ID. From the parser notebook I've linked above, the first byte is either 0x00, 0x04, or 0x08 and both instances seem to have the same payload, so my bet is on some bitflag stuff.
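To make the two-byte reading concrete, here is a minimal sketch of the split described above. It assumes the 16-bit value is written as in the text (0x0425), with the high byte being the suspected flag/parent field and the low byte the ID seen in the DLL. The names `ipts_report_id` and `split_report_id` are made up for illustration and are not taken from the parser, and the byte order on the wire is not covered here.

```c
#include <stdint.h>

/* Hypothetical split of a 16-bit report ID such as 0x0425. */
struct ipts_report_id {
    uint8_t flags; /* high byte: 0x00, 0x04 or 0x08 observed, meaning unknown */
    uint8_t id;    /* low byte: the one-byte ID seen in the DLL, e.g. 0x25 */
};

static struct ipts_report_id split_report_id(uint16_t raw)
{
    struct ipts_report_id r;

    r.flags = (uint8_t)(raw >> 8);
    r.id = (uint8_t)(raw & 0xff);
    return r;
}
```

If the bitflag guess is right, two reports whose `id` matches but whose `flags` differ should be parseable by the same payload code.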

2) the 1C and 0D structures are similar in size and always show up next to each other, so maybe these are the RX and TX data from those papers?

Possible, but as I've said the HID report IDs don't really mean anything. There may still be some correlation. Note that it's also possible that some of it is multitouch data. I've seen the pen produce some interference (albeit small) with regards to touch. So I think it's quite possible that you also get some multitouch data mixed in.

(There are also two others that show up with or without the pen, 0x07 and 0x08, which I think are some sort of noise? They have an ipts_data size of just 64 bytes).

You could look at the HID report descriptor to figure that out. Using the one from https://github.com/linux-surface/intel-precise-touch/issues/4#issuecomment-784546961 (I think that should fit given the other IDs), reports 0x07 and 0x08 are also heatmap data. 0x07 with 61 bytes (1 byte report-id + 2 bytes scan-time + 61 bytes payload = 64 bytes) and 0x08 with 209 bytes (1 byte report-id + 2 bytes scan-time + 209 bytes payload = 212 bytes).
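The "next larger report, zero-padded" behaviour and the size arithmetic above can be sketched as follows. Only the two payload sizes stated here (61 and 209 bytes) are in the table; a real table would list every heatmap report from the descriptor. The function names are hypothetical, and the selection rule is just my reading of the description, not something confirmed from firmware.

```c
#include <stddef.h>
#include <stdint.h>

struct report_size {
    uint8_t id;
    size_t payload; /* payload capacity in bytes */
};

/* Incomplete on purpose: only the sizes quoted in this thread. */
static const struct report_size report_sizes[] = {
    { 0x07, 61 },
    { 0x08, 209 },
};

/* Return the ID of the smallest listed report whose payload can hold
 * len bytes, or 0 (not a valid HID report ID) if none is large enough. */
static uint8_t pick_report_id(size_t len)
{
    uint8_t best_id = 0;
    size_t best = 0;

    for (size_t i = 0; i < sizeof(report_sizes) / sizeof(report_sizes[0]); i++) {
        size_t cap = report_sizes[i].payload;

        if (cap >= len && (best_id == 0 || cap < best)) {
            best_id = report_sizes[i].id;
            best = cap;
        }
    }
    return best_id;
}

/* Total report length: 1 byte report ID + 2 bytes scan time + payload. */
static size_t report_total_size(size_t payload)
{
    return 1 + 2 + payload;
}
```

With this table, anything up to 61 bytes lands in report 0x07 (64 bytes total) and anything from 62 to 209 bytes in report 0x08 (212 bytes total).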

haymanpf commented 3 years ago

Ah shoot. Thanks for the tools though!

qzed commented 3 years ago

Feel free to poke around at the data with them, let me know if something doesn't work.

StollD commented 2 years ago

Should probably post this here too:

I've spent some time figuring out how the reports for the pen data are formatted. Together with the report names that @qzed already figured out, maybe someone has ideas about the data in the reports.

https://gist.github.com/StollD/ae901f46bf693a0c5355cc048cb21073

quo commented 2 years ago

I had a quick look at the pen data on a SP7+. The most interesting data seems to be in the DFT window packets with length 1548. I did not really look at the other packets yet.

Typically, there are three of these DFT packets in a row, which contain slightly different data. Let's call them P, Q, and R.

The structure of the packets is roughly as follows:

struct pen_dft {
    u32 timestamp; // counting at approx 8MHz
    u8 a; // always 16
    u8 b; // usually P=6, Q=7, R=9, very rarely P=3, Q=4, R=5
    u8 c; // usually 1, can be 0 if there are simultaneous touch events
    u8 d; // usually 1, can be 0 if there are simultaneous touch events
    u8 e; // usually 1, but can be higher (2,3,4) for the first few packets of a pen interaction
    u8 f; // data type? P=10, Q=10, R=11
    i16 g; // always -1

    struct pen_dft_row {
        u32 h; // fixed value, P/Q x and y use the same set of 16 values, R x and y use another set of 16 values
        u32 i; // value related to magnitude of dft components, possibly some kind of amplitude
        i16 real[9], imag[9]; // DFT components
        u32 j; // related to pen position: high for top/left, low for bottom/right
    } x[16], y[16]; // x rows encode x-coordinate data, y rows encode y-coordinate data
};

There are some additional patterns in the row data:

Edit: It looks like there are actually 64 DFT bins in the X direction and 44 bins in the Y direction, and we only get a 9 bin window from that. The j value is actually 3 signed bytes + a nul byte. The signed bytes are the indices of the first/last/middle bins in the window (since there are 9 bins, they are always related by: first + 4 == last - 4 == middle). When the pen is at the edge of the screen, the first index can be negative or the last index can be larger than 63 or 43, and the corresponding components are zero.
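A small sketch of how the window bytes could be decoded and sanity-checked. The byte order (first, last, middle, zero) is an assumption, and the helper names are hypothetical. It reads individual bytes rather than a u32 so that host endianness does not matter.

```c
#include <stdint.h>

#define NUM_DFT 9

/* Window indices taken from the last 4 bytes of a row. */
struct dft_window {
    int8_t first, last, mid;
};

/* Assumed byte order: first, last, middle, zero. */
static struct dft_window decode_window(const uint8_t j[4])
{
    struct dft_window w;

    w.first = (int8_t)j[0];
    w.last = (int8_t)j[1];
    w.mid = (int8_t)j[2];
    return w;
}

/* Check the relation first + 4 == last - 4 == middle. */
static int window_is_consistent(struct dft_window w)
{
    return w.first + 4 == w.mid && w.last - 4 == w.mid;
}

/* Return 1 if window slot i (0..NUM_DFT-1) maps to a physical bin,
 * i.e. 0 <= first + i < num_bins (64 for X, 44 for Y). Slots outside
 * that range are the ones expected to hold zero components. */
static int bin_in_range(struct dft_window w, int i, int num_bins)
{
    int idx = w.first + i;

    if (i < 0 || i >= NUM_DFT)
        return 0;
    return idx >= 0 && idx < num_bins;
}
```

For example, a window with first = -2 near the left edge has two leading slots that fall outside the panel.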

quo commented 2 years ago

The smaller DFT window packets have the same format, but contain fewer rows. The "a" value is the number of rows per coordinate.
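As a quick consistency check, the packet length follows directly from the row count. This sketch assumes the layout given earlier (12-byte header, 48-byte rows, no padding); the macro and function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_DFT 9
#define DFT_HEADER_SIZE 12                         /* u32 timestamp + 6 * u8 + i16 */
#define DFT_ROW_SIZE (4 + 4 + 2 * NUM_DFT * 2 + 4) /* 48 bytes per row */

/* Expected packet length for a given "a" value (rows per coordinate).
 * There are rows_per_axis x rows followed by rows_per_axis y rows. */
static size_t dft_packet_size(uint8_t rows_per_axis)
{
    return DFT_HEADER_SIZE + 2u * rows_per_axis * DFT_ROW_SIZE;
}
```

For a = 16 this gives 12 + 32 * 48 = 1548 bytes, which matches the large packets.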

When the pen is touching the screen, the position can be determined fairly simply from the R1 rows by finding the bin index with the largest absolute value, then interpolating with its neighbors. E.g.:

#define NUM_DFT 9
struct pen_dft_row {
    u32 unknown;
    u32 magnitude;
    s16 real[NUM_DFT], imag[NUM_DFT];
    s8 first, last, mid, zero;
};

#define SCALE 0x10000
static int pen_get_pos(struct pen_dft_row *r) {
    int maxampsq = 0, maxidx = 0;
    for (int i = 0; i < NUM_DFT; i++) {
        int ampsq = r->real[i] * r->real[i] + r->imag[i] * r->imag[i];
        if (ampsq > maxampsq) {
            maxampsq = ampsq;
            maxidx = i;
        }
    }
    // interpolate using Eric Jacobsen's modified quadratic estimator
    int i = maxidx < 1 ? 1 : maxidx > NUM_DFT-2 ? NUM_DFT-2 : maxidx;
    // use 64-bit intermediates: the products below can overflow a 32-bit int
    s64 ra = r->real[i+1] - r->real[i-1], rb = 2*r->real[i] - r->real[i-1] - r->real[i+1];
    s64 ia = r->imag[i+1] - r->imag[i-1], ib = 2*r->imag[i] - r->imag[i-1] - r->imag[i+1];
    s64 div = rb*rb + ib*ib;
    s64 d = div ? SCALE * (ra*rb + ia*ib) / div : 0;
    // clamp the sub-bin offset to half a bin either side of the peak
    return (r->first + i) * SCALE + (int)(d > SCALE/2 ? SCALE/2 : d < -SCALE/2 ? -SCALE/2 : d);
}

The pen pressure can also be determined from R1. When pressure is high, the first row will have the largest DFT components and magnitude (i) value. When pressure is low, later rows will have the largest values. So pressure can be calculated by finding the index of the row with the largest magnitude value, then interpolating with neighboring rows.
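A rough sketch of that pressure estimate, assuming the caller has already copied the magnitude value of each R1 row into an array. The parabolic interpolation and the linear mapping to [0, 1] (1 = peak in the first row) are my own choices for illustration; the real pressure curve is unknown, and the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* mag[i] is the magnitude value of row i, n is the number of rows.
 * Returns 0 if there is no signal, otherwise a value in [0, 1]. */
static double pen_pressure_estimate(const uint32_t *mag, size_t n)
{
    size_t peak = 0;
    double pos;

    if (n < 2)
        return 0.0;

    /* find the row with the largest magnitude */
    for (size_t i = 1; i < n; i++)
        if (mag[i] > mag[peak])
            peak = i;
    if (mag[peak] == 0)
        return 0.0;

    /* refine the peak position with a parabola through its neighbours */
    pos = (double)peak;
    if (peak > 0 && peak + 1 < n) {
        double l = mag[peak - 1], c = mag[peak], r = mag[peak + 1];
        double denom = l - 2.0 * c + r;

        if (denom != 0.0)
            pos += 0.5 * (l - r) / denom;
    }

    /* assumed mapping: first row = full pressure, last row = none */
    return 1.0 - pos / (double)(n - 1);
}
```

The output would still need calibration against real pen data before it could be reported as a pressure value.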

Of course this is only using a fraction of the data the device sends, so presumably you would be able to get much better accuracy by using the rest of the data. Especially when the pen is not touching the screen, the R1 data is very noisy.

qzed commented 2 years ago

Awesome! And kinda interesting... so you're saying the DFT bins relate to position instead of some frequency? Or maybe a better interpretation is the DFT bin for some single frequency at some position (could fit the "antenna" name)... somehow this starts to make sense to me.

quo commented 2 years ago

I was mostly just looking at the data without thinking too much about what's going on at the hardware level.

From skimming the papers, it looks like they have horizontal and vertical wires, and are transmitting different frequencies on one set, while receiving on the other set. The pen creates a coupling between the wires, so running a DFT on the received data lets you figure out which transmissions are most strongly coupled, i.e. which wires the pen is nearest to. The pen can modulate the coupling or transmit its own frequencies to transmit data about pressure, etc.

After thinking about it a bit more, I'm not entirely sure the 9 values we get are actually bins from a single DFT. It could also be that they are all bins calculated at the exact same frequency, but from different RX wires (calculating a single bin is computationally very simple, you'd just multiply every sample with a sin and cos value). The unknown value at the start of each row could represent the bin frequency for the entire row.
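To illustrate how cheap a single bin is, here is a sketch of evaluating one DFT bin by multiplying every sample with a cos and sin value and accumulating. The lookup tables are assumed to be precomputed for the one frequency of interest; the function name and the fixed-point types are assumptions for illustration, not something known about the actual firmware.

```c
#include <stddef.h>
#include <stdint.h>

/* Evaluate one DFT bin: X = sum over i of x[i] * (cos[i] - j * sin[i]).
 * cos_tab[i] and sin_tab[i] hold cos(2*pi*k*i/N) and sin(2*pi*k*i/N) for
 * the chosen bin k, scaled to whatever fixed-point format the caller uses. */
static void single_bin_dft(const int16_t *x, const int16_t *cos_tab,
                           const int16_t *sin_tab, size_t n,
                           int64_t *re, int64_t *im)
{
    int64_t acc_re = 0, acc_im = 0;

    for (size_t i = 0; i < n; i++) {
        acc_re += (int64_t)x[i] * cos_tab[i];
        acc_im -= (int64_t)x[i] * sin_tab[i];
    }
    *re = acc_re;
    *im = acc_im;
}
```

Running this once per RX wire with the same tables would give one complex value per wire, which is the kind of row the packets seem to contain.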

Also, in the R1 data, the center bin almost always seems to have the largest value, so in the code I gave the for loop is unnecessary and you can just do int i = NUM_DFT/2 instead.

I should mention that I'm testing with an old SP4 pen, which according to MS doesn't support more advanced stuff like tilt. The PQ packet data just looks like noise to me, it's possible that's supposed to be something like tilt data which my pen just isn't sending.

qzed commented 2 years ago

After thinking about it a bit more, I'm not entirely sure the 9 values we get are actually bins from a single DFT. It could also be that they are all bins calculated at the exact same frequency, but from different RX wires (calculating a single bin is computationally very simple, you'd just multiply every sample with a sin and cos value). The unknown value at the start of each row could represent the bin frequency for the entire row.

Yes, that's what I meant with "bins relate to position [i.e. wire] instead of some frequency". I haven't had a detailed look at the papers yet, but I think that this would make sense from a logical perspective: Assuming the pen sends some fixed single-frequency signal (at least one frequency for position) and the wires receive that, then you could just check for where reception of that particular frequency is best (assuming there's notable damping over that "small" space).

Alternatively I guess the pen could be the receiver and wires could send different frequencies each, meaning that a single bin again relates to a specific wire (this time not position but frequency). Assuming damping over space and that each wire has the same output power, you'd then look for the most prominent frequency which gives you the closest wire. I'm not sure if that's the case though because then the pen would need to send the received data back somehow... having the pen just send data seems a lot easier to implement.

If you have one set of wires transmitting and another one receiving, maybe you again send different frequencies on different sender wires and look for them on the receiver ones. So on one hand you have strength over the specific frequencies giving the position over sending wires and strength over the full spectrum giving the position on the receiving side. Now I have no idea if that's physically possible because that'd somehow assume that without the pen receiving and sending sides are entirely (or mostly) disconnected and only the pen can "connect" them (maybe via frequency shifting?).

Again, I probably need to read the papers...

At some point I'll hopefully have access to the Pro X touch data (with new pen and all), but that will probably still take a while.

What I found quite interesting on the SB2: You can somehow have two pens on the panel and one still works as expected (i.e. there's no jumping between them). The other one just gets ignored completely. Not sure if that can be done entirely with software based filtering. But then again pens also have a serial number so that also needs to be sent somehow (so filtering could work via that if you can actually associate pen data directly with it).

gurrgur commented 2 years ago

I recommend this paper which goes into more detail about MFDM than the overview paper mentioned in https://github.com/linux-surface/intel-precise-touch/issues/14#issue-998560550.

From my understanding (and assuming MFDM), the pen encodes its coordinate, pressure and (if applicable) tilt by transmitting a signal containing frequencies fs1, fs2 and fs3 from the pen tip. fs1 encodes the pen coordinate and is coupled into electrodes Rx and Tx for which charge signals Qs are measured. Therefore we can assume that at least some of the DFT bins we receive represent charge signals Qs of Tx and Rx evaluated at the single frequency bin fs1. On the other hand, fs2 and fs3 are used to encode pressure and tilt. However, the pen does not transmit them unaltered but shifts them proportionally to the quantity they describe. Accordingly, Δfs2 and Δfs3 are proportional to pressure and tilt. It's reasonable to assume that for pressure and tilt neighbouring DFT bins (frequency wise) of fs2 and fs3 are reported.

Finally, the reason for multiple pens not interfering with each other is that the MCU instructs each pen to transmit at specific frequencies. Thus with multiple pens available the MCU can assign them different frequencies. This seems to be more of a side benefit though, since the main reason for adjusting the pen's transmitting frequencies is to improve the SNR by avoiding parts of the spectrum polluted by externally induced noise.

Note that I'm writing this purely based on my understanding of some of the papers mentioned. I did not yet have a look at actual pen data reports.

csdvrx commented 2 years ago

On the other hand fs2 and fs3 are used to encode pressure and tilt. However the pen does not transmit them unaltered but shifts them proportional to the quantity they describe.

Would it be possible that some of these measurements are only possible dynamically, not statically?

After doing some tests with a Lenovo Active Pen # 5T70J33309 (Wacom), I found the hovering function didn't work unless the top part (the one with the writing tip) was screwed in.

There doesn't seem to be any electrical contact mechanism, unlike in the bottom part (AAAA battery holder with a spring).

Given how we also know the pen is continuously discharging its battery, this makes me believe:

It's just a bunch of speculations, and I'd need to test with different models and generations of Wacom pens, but it would seem to make sense.

Tilt was introduced after pressure sensitivity, so it's likely to be a very small improvement of the exact same method.

To do tilt in the same way, besides the tip of the pen Z1 I'd have added another mapping along the Z axis , for example the middle or the top of the pen: let's call it Z2.

Even if Z measurements are noisy, if both Z1 and Z2 are subject to the same noise, you could estimate the tilt by projecting both Z1 and Z2 on the X,Y plane - maybe after correcting by this noise if it's non homogeneous: the click event would be the perfect opportunity to get Z1star, from which you could compute Z1-Z1p as the noise on this part of the screen, and use it to better estimate the projection of Z2.

Accordingly Δfs2 and Δfs3 are proportional to pressure and tilt. It's reasonable to assume that for pressure and tilt neighbouring DFT bins (frequency wise) of fs2 and fs3 are reported.

Another explanation is that the deltas you are getting are Δfs2=Z1 and Δfs3 is either Z2 or Z1-Z2, while the actual click replaces Z1 by Z1star: so outside of a click event, you get an approximate proportional measurement, and during a click event you get the precise value that allows you to get the correct pressure and tilt.

Especially when the pen is not touching the screen, the R1 data is very noisy

If my speculations above are correct, the R1 data is noisy due to being within the (min,max) that are dynamically calibrated to get the "hovering" function.

It should be possible to test my hypothesis by simulating clicks at various degrees (for 90 deg, put the pen very close to the screen and trigger a click of its tip with your fingernail), and plotting.

I've played a bit too much with my 5T70J33309 pen, testing it in various conditions and uh... now I can't unscrew it anymore lol so I won't be able to finish that testing :)

Also, I have some weird i915 issues on my Lakefield X1 Fold so I can't even do more basic stuff, like a high frequency display of the readings while also doing an external video record of the pen position on the screen: on Linux, I just can't get the i915 driver to claim the video hardware, regardless of the forceprobe options (i915.forceprobe=9840 given lspci -nn), the distribution tried (Ubuntu 22 or Fedora 22, both containing the latest linux-firmware for the huc and guc) or the kernel (5.17).

It's likely to be a stupid bug, hopefully my speculations may help a little with the IPTS protocol.

quo commented 2 years ago

Assuming the pen sends some fixed single-frequency signal (at least one frequency for position) and the wires receive that, then you could just check for where reception of that particular frequency is best (assuming there's notable damping over that "small" space).

Agreed, that makes most sense to me too. And that seems to match the actual data pretty well. I think they're also transmitting different frequencies on each wire, but perhaps only to be able to distinguish multiple fingers without ghosting.

I thought that some of the papers seemed to suggest the different wire frequencies are also used to determine the pen position somehow. But it's very possible I'm just misinterpreting things; the papers aren't exactly clearly written and I also haven't read them very closely yet.

fs1 encodes the pen coordinate and is coupled into electrodes Rx and Tx for which charge signals Qs are measured. Therefore we can assume that at least some of the DFT bins we receive represent charge signals Qs of Tx and Rx evaluated at the single frequency bin fs1.

Right. It looks like the fs1 data is in one of the smaller DFT packets, which I've now looked at. These packets have "f" values 6, 7, 8 and 9. For each "f" value, the number of rows is always the same and the data also looks the same, so I'm fairly certain "f" is the data type. I'm not sure what the "b" value represents, it may just be some kind of sequence number.

For each DFT data type, the unknown first values (possibly frequencies) for the rows are (at least on my SP7+):

Let's assume these values are (related to) frequencies for now. Some further observations about the different DFT data types follow:

Type 6:

Type 7/8:

Type 9:

Type 10:

Type 11:

The Magnitude data packet (0x5b) is also interesting. It can also be used to obtain the pen position. This packet has the following format:

struct pen_magnitude_data {
    u8 a, b;    // always zero
    u8 c, d;    // 0 if pen not near screen, 1 or 2 if pen is near screen
    u8 e;       // 0, 1 or 8 (bitflags?)
    u8 f, g, h; // always 0xff
    u32 x[64], y[44]; // signal magnitude value for each column/row
};

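As a sketch of how the magnitude packet could give a coarse position, the code below picks the column and row with the largest magnitude. The struct mirrors the layout above with fixed-width types. The function names are hypothetical, and using a plain argmax (with no interpolation or use of the c/d flags) is a simplification.

```c
#include <stddef.h>
#include <stdint.h>

#define MAG_COLS 64
#define MAG_ROWS 44

struct pen_magnitude_data {
    uint8_t a, b;    /* always zero */
    uint8_t c, d;    /* 0 if pen not near screen, 1 or 2 if pen is near screen */
    uint8_t e;       /* 0, 1 or 8 (bitflags?) */
    uint8_t f, g, h; /* always 0xff */
    uint32_t x[MAG_COLS], y[MAG_ROWS];
};

/* Index of the largest value in v, or -1 if every value is zero. */
static int mag_peak(const uint32_t *v, size_t n)
{
    int best = -1;
    uint32_t best_val = 0;

    for (size_t i = 0; i < n; i++) {
        if (v[i] > best_val) {
            best_val = v[i];
            best = (int)i;
        }
    }
    return best;
}

/* Store the strongest column/row in *col and *row.
 * Returns 1 if a signal was found on both axes, 0 otherwise. */
static int pen_coarse_pos(const struct pen_magnitude_data *m, int *col, int *row)
{
    int c = mag_peak(m->x, MAG_COLS);
    int r = mag_peak(m->y, MAG_ROWS);

    if (c < 0 || r < 0)
        return 0;
    *col = c;
    *row = r;
    return 1;
}
```

Interpolating with the neighbouring columns/rows would be the obvious next step for sub-bin accuracy.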
quo commented 2 years ago

I've added experimental SP7/SP7+ pen support to my iptsd fork: https://github.com/quo/iptsd

It will probably still need a lot of tuning, and the processing could be improved in various ways. And tilt support is still missing.

I refactored the existing parsing code a bit. Everything should still work, but some testing is needed to check if I didn't accidentally break anything for the older SP models. I do have some older SPs lying around, but they're not set up with the ipts driver currently. (Maybe it would be nice to gather some data dumps and add them to the repo, for automated testing?)

haymanpf commented 2 years ago

Thanks quo, that's awesome! Works on my SP7 with Renaisser pen.

Also just a heads-up, the debug tools proto-plot and proto-rt don't compile right away for me, they both call the single size parameter constructor for Parser. I compiled by giving 'invert_x' and 'invert_y' default values of 'false' in the Parser constructor.

quo commented 2 years ago

Thank you, should be fixed now.

freak007 commented 2 years ago

First of all, many thanks for your great work! I have a slight problem with my Surface Pro 7+. ithc works fine, and finger touch is very responsive when the iptsd service is not loaded. But when I run the iptsd service to get the pen working, finger touch is very erratic: I need to press several times and with a lot of pressure to get it to register, while the pen works well. For now, I activate iptsd only when I need the pen.

quo commented 2 years ago

The multitouch processing is from the libqzed branch of the official iptsd, I'm not really familiar with that part of the code yet. I recommend opening an issue at https://github.com/linux-surface/iptsd/issues

Roethenbach commented 2 years ago

Also from me: Thanks for the great work!

I noticed a slight problem with the pen: sometimes lines drawn with the pen look less precise under Linux than under Windows (both using Xournal++). Drawn arcs are more angular. It is not a big problem, I only want to report that "bug"...

quo commented 2 years ago

I've uploaded some of my parsing code here: https://github.com/quo/surface-parser

It can parse data from ithc and from ipts-dbg. By default it gives hierarchical output, but there's also a special DFT mode for the pen data.

For reverse engineering of pen tilt data, see also: https://github.com/quo/iptsd/issues/3

qzed commented 2 years ago

FYI: The Surface Duo has a libsurfacetouch.so with debug symbols left in. Haven't looked at it in detail yet, but that might help. (Note: The Duo likely uses the same protocol and everything).

quo commented 2 years ago

Oh, nice! That could help solve some of the final mysteries.

qzed commented 2 years ago

That SO is available here: https://github.com/JengaMasterG/surface-duo-oss-vendor/tree/surfaceduo/10/2021.817.35/lib64

hobyst commented 2 years ago

I don't really know if it's helpful for you or if it's even okay for you to read this since you are reverse engineering the protocol, but the official documentation for the Microsoft Pen Protocol and the information on how pen input devices are implemented on Windows is publicly available, including some device descriptors.

quo commented 2 years ago

@hobyst Thanks. As far as I can tell all the public MS documentation mainly talks about standard HID data sent from (non-MS) pen/touch receiver ICs. I haven't found anything that describes the format of the big blob HID reports MS uses, or how the pen wirelessly communicates with the receiver. Presumably there is a spec detailing the wireless communication since there are 3rd party pens, but I don't think it's public.

quo commented 1 year ago

I've added some documentation to the wiki for anyone who wants to try to improve the DFT processing:

https://github.com/linux-surface/linux-surface/wiki/Pen-DFT-data-format

qzed commented 1 year ago

Oh, that's nice, thanks!