linux-surface / iptsd

Userspace daemon for Intel Precise Touch & Stylus
GNU General Public License v2.0

Improve statistics of iptsd-perf #112

Closed danielzgtg closed 1 year ago

danielzgtg commented 1 year ago

This PR aims to improve the statistical behavior of iptsd-perf a bit:

Before

```
home@daniel-desktop3:~/CLionProjects/iptsd/build/src/debug$ ./iptsd-perf ../../../../iptsdump.dat
[04:25:32.892] [info] Vendor: 045E
[04:25:32.892] [info] Product: 099F
[04:25:32.892] [info] Buffer Size: 7487
[04:25:32.892] [info] Metadata:
[04:25:32.892] [info] rows=44, columns=64
[04:25:32.892] [info] width=25978, height=17319
[04:25:32.892] [info] transform=[-412.3492,0,25978,0,-402.76746,17319]
[04:25:32.892] [info] unknown=1, [178,182,180,1,178,182,180,1,90,171,100,20,172,177,175,2]
[04:25:32.895] [info] Total: 2774.511μs
[04:25:32.895] [info] Average: 13.149μs
[04:25:32.895] [info] Minimum: 0.130μs
[04:25:32.895] [info] Maximum: 23.970μs
home@daniel-desktop3:~/CLionProjects/iptsd/build/src/debug$ ./iptsd-perf ../../../../iptsdump.dat
[04:25:40.347] [info] Vendor: 045E
[04:25:40.347] [info] Product: 099F
[04:25:40.347] [info] Buffer Size: 7487
[04:25:40.347] [info] Metadata:
[04:25:40.347] [info] rows=44, columns=64
[04:25:40.347] [info] width=25978, height=17319
[04:25:40.347] [info] transform=[-412.3492,0,25978,0,-402.76746,17319]
[04:25:40.347] [info] unknown=1, [178,182,180,1,178,182,180,1,90,171,100,20,172,177,175,2]
[04:25:40.352] [info] Total: 4143.332μs
[04:25:40.352] [info] Average: 19.637μs
[04:25:40.352] [info] Minimum: 0.280μs
[04:25:40.352] [info] Maximum: 49.830μs
home@daniel-desktop3:~/CLionProjects/iptsd/build/src/debug$ ./iptsd-perf ../../../../iptsdump.dat
[04:25:41.029] [info] Vendor: 045E
[04:25:41.029] [info] Product: 099F
[04:25:41.029] [info] Buffer Size: 7487
[04:25:41.029] [info] Metadata:
[04:25:41.029] [info] rows=44, columns=64
[04:25:41.029] [info] width=25978, height=17319
[04:25:41.029] [info] transform=[-412.3492,0,25978,0,-402.76746,17319]
[04:25:41.029] [info] unknown=1, [178,182,180,1,178,182,180,1,90,171,100,20,172,177,175,2]
[04:25:41.032] [info] Total: 3194.851μs
[04:25:41.032] [info] Average: 15.141μs
[04:25:41.032] [info] Minimum: 0.280μs
[04:25:41.032] [info] Maximum: 48.470μs
home@daniel-desktop3:~/CLionProjects/iptsd/build/src/debug$ ./iptsd-perf ../../../../iptsdump.dat
[04:25:41.548] [info] Vendor: 045E
[04:25:41.548] [info] Product: 099F
[04:25:41.548] [info] Buffer Size: 7487
[04:25:41.548] [info] Metadata:
[04:25:41.548] [info] rows=44, columns=64
[04:25:41.548] [info] width=25978, height=17319
[04:25:41.548] [info] transform=[-412.3492,0,25978,0,-402.76746,17319]
[04:25:41.548] [info] unknown=1, [178,182,180,1,178,182,180,1,90,171,100,20,172,177,175,2]
[04:25:41.552] [info] Total: 3536.361μs
[04:25:41.552] [info] Average: 16.760μs
[04:25:41.552] [info] Minimum: 0.210μs
[04:25:41.552] [info] Maximum: 48.270μs
home@daniel-desktop3:~/CLionProjects/iptsd/build/src/debug$ ./iptsd-perf ../../../../iptsdump.dat
[04:25:42.107] [info] Vendor: 045E
[04:25:42.107] [info] Product: 099F
[04:25:42.107] [info] Buffer Size: 7487
[04:25:42.107] [info] Metadata:
[04:25:42.107] [info] rows=44, columns=64
[04:25:42.107] [info] width=25978, height=17319
[04:25:42.107] [info] transform=[-412.3492,0,25978,0,-402.76746,17319]
[04:25:42.107] [info] unknown=1, [178,182,180,1,178,182,180,1,90,171,100,20,172,177,175,2]
[04:25:42.111] [info] Total: 3853.412μs
[04:25:42.111] [info] Average: 18.263μs
[04:25:42.111] [info] Minimum: 0.210μs
[04:25:42.111] [info] Maximum: 36.130μs
home@daniel-desktop3:~/CLionProjects/iptsd/build/src/debug$ ./iptsd-perf ../../../../iptsdump.dat
[04:25:42.819] [info] Vendor: 045E
[04:25:42.819] [info] Product: 099F
[04:25:42.819] [info] Buffer Size: 7487
[04:25:42.819] [info] Metadata:
[04:25:42.819] [info] rows=44, columns=64
[04:25:42.819] [info] width=25978, height=17319
[04:25:42.819] [info] transform=[-412.3492,0,25978,0,-402.76746,17319]
[04:25:42.819] [info] unknown=1, [178,182,180,1,178,182,180,1,90,171,100,20,172,177,175,2]
[04:25:42.823] [info] Total: 3897.692μs
[04:25:42.823] [info] Average: 18.472μs
[04:25:42.823] [info] Minimum: 0.210μs
[04:25:42.823] [info] Maximum: 38.470μs
home@daniel-desktop3:~/CLionProjects/iptsd/build/src/debug$ ./iptsd-perf ../../../../iptsdump.dat
[04:25:43.787] [info] Vendor: 045E
[04:25:43.787] [info] Product: 099F
[04:25:43.787] [info] Buffer Size: 7487
[04:25:43.787] [info] Metadata:
[04:25:43.787] [info] rows=44, columns=64
[04:25:43.787] [info] width=25978, height=17319
[04:25:43.787] [info] transform=[-412.3492,0,25978,0,-402.76746,17319]
[04:25:43.787] [info] unknown=1, [178,182,180,1,178,182,180,1,90,171,100,20,172,177,175,2]
[04:25:43.792] [info] Total: 4129.942μs
[04:25:43.792] [info] Average: 19.573μs
[04:25:43.792] [info] Minimum: 0.280μs
[04:25:43.792] [info] Maximum: 49.800μs
```

The average is confusingly all over the place: across these seven runs of the same dump it ranges from 13.149μs to 19.637μs.

After

The mean is much more consistent now:

/home/home/CLionProjects/iptsd/build/src/debug/iptsd-perf ./iptsdump.dat
[07:55:47.757] [info] Vendor:       045E
[07:55:47.757] [info] Product:      099F
[07:55:47.757] [info] Buffer Size:  7487
[07:55:47.757] [info] Metadata:
[07:55:47.757] [info] rows=44, columns=64
[07:55:47.757] [info] width=25978, height=17319
[07:55:47.757] [info] transform=[-412.3492,0,25978,0,-402.76746,17319]
[07:55:47.757] [info] unknown=1, [178,182,180,1,178,182,180,1,90,171,100,20,172,177,175,2]
[07:55:47.795] [info] Ran 2010 times
[07:55:47.795] [info] Total: 28143μs
[07:55:47.795] [info] Mean: 14.00μs
[07:55:47.795] [info] Standard Deviation: 0.62μs
[07:55:47.795] [info] Minimum: 13.830μs
[07:55:47.795] [info] Maximum: 24.550μs
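
For readers who want to see how figures like these can be produced, here is a minimal sketch. It is not the code from this PR: `Summary`, `summarize` and `measure` are hypothetical names, and the sketch assumes a population standard deviation (dividing by N), since the PR text does not say whether iptsd-perf divides by N or N-1.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical summary type; the field names are illustrative, not iptsd's.
struct Summary {
    std::size_t count;
    double total;  // all values in microseconds
    double mean;
    double stddev;
    double min;
    double max;
};

// Summarise per-iteration timings given in microseconds.
// Assumption: population standard deviation (divide by N, not N-1).
Summary summarize(const std::vector<double> &samples)
{
    assert(!samples.empty());

    double total = 0.0;
    for (double s : samples)
        total += s;
    const double mean = total / samples.size();

    // Second pass: sum of squared deviations from the mean.
    double sq = 0.0;
    for (double s : samples)
        sq += (s - mean) * (s - mean);

    const auto mm = std::minmax_element(samples.begin(), samples.end());
    return Summary{samples.size(), total, mean,
                   std::sqrt(sq / samples.size()), *mm.first, *mm.second};
}

// Call `fn` `runs` times and record one wall-clock sample per call.
template <class Fn>
std::vector<double> measure(Fn &&fn, std::size_t runs)
{
    using clock = std::chrono::steady_clock;

    std::vector<double> samples;
    samples.reserve(runs);
    for (std::size_t i = 0; i < runs; ++i) {
        const auto start = clock::now();
        fn();
        const auto stop = clock::now();
        samples.push_back(
            std::chrono::duration<double, std::micro>(stop - start).count());
    }
    return samples;
}
```

Repeating the measurement many times and reporting the spread alongside the mean is what makes a single noisy run easy to recognise as noise.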

Here is another run, this time with the advanced instead of the basic detection. I would have set LOOP_COUNT to 1000 rather than 10, but the advanced algorithm would take too long:

/home/home/CLionProjects/iptsd/build/src/debug/iptsd-perf ./iptsdump.dat
[08:03:23.027] [info] Vendor:       045E
[08:03:23.027] [info] Product:      099F
[08:03:23.027] [info] Buffer Size:  7487
[08:03:23.027] [info] Metadata:
[08:03:23.027] [info] rows=44, columns=64
[08:03:23.027] [info] width=25978, height=17319
[08:03:23.027] [info] transform=[-412.3492,0,25978,0,-402.76746,17319]
[08:03:23.027] [info] unknown=1, [178,182,180,1,178,182,180,1,90,171,100,20,172,177,175,2]
[08:03:29.048] [info] Ran 2010 times
[08:03:29.048] [info] Total: 6011079μs
[08:03:29.048] [info] Mean: 2990.59μs
[08:03:29.048] [info] Standard Deviation: 1045.43μs
[08:03:29.048] [info] Minimum: 147.510μs
[08:03:29.048] [info] Maximum: 5385.695μs

From the new perf display, we can learn that the advanced detection has a much higher variance-to-mean ratio than the basic detection.
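
As a rough check using only the numbers printed above (σ is the reported standard deviation and μ the reported mean; rounding is mine), the variance-to-mean ratios work out to:

$$
\frac{\sigma_\text{basic}^2}{\mu_\text{basic}} = \frac{0.62^2}{14.00} \approx 0.027\,\mu\text{s},
\qquad
\frac{\sigma_\text{adv}^2}{\mu_\text{adv}} = \frac{1045.43^2}{2990.59} \approx 365\,\mu\text{s}
$$

The unitless ratio of standard deviation to mean tells the same story:

$$
\frac{\sigma_\text{basic}}{\mu_\text{basic}} = \frac{0.62}{14.00} \approx 0.044,
\qquad
\frac{\sigma_\text{adv}}{\mu_\text{adv}} = \frac{1045.43}{2990.59} \approx 0.35
$$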

qzed commented 1 year ago

Nice! Those improvements definitely make sense for good/reliable performance testing.

> From the new perf display, we can learn that the advanced detection has a much higher variance-to-mean ratio than the basic detection.

I believe the explanation for this is that the advanced processing depends heavily on the number of contacts on the screen. For example, the whole Gaussian-fitting process (which IIRC is non-negligible in time consumption) is linear in the number of pre-filtered contacts and the weighted distance transform is linear in the sum of the blob sizes (although I'm not sure how much the latter contributes to the total time). So I think there's bound to be a lot of variation between frames with no contacts and frames with even a single contact.

StollD commented 1 year ago

Thank you for your changes, and for putting up with my suggestions ;)