jakeday / linux-surface

Linux Kernel for Surface Devices
Touch only works for a few seconds on Surface 3 (non-pro) #596

Open sr258 opened 4 years ago

sr258 commented 4 years ago

Nearly everything on my Surface 3 (non-pro, Wi-Fi) works very well with Linux and the latest linux-surface version from qzed's repository (kernel: 5.3.6-surface, Deepin Linux 15.11). The only issue I'm having (besides a lot of low-volume white noise on the headphone output) is that touch input only works for a few seconds. After I've used the touchscreen for about 2-3 seconds, touch input is no longer registered and only a system restart makes it work again. The countdown starts when I first use the touchscreen, not at system startup. So I can use my Surface for several minutes with keyboard and mouse, and touch input will still work at that point, but it then stops working as described above.

I've noticed that the Surface 3 isn't on the supported devices list anymore, but it'd be great if someone could give me a pointer on how to fix this problem.

kitakar5525 commented 4 years ago

(@ other Surface owners: Surface 3 does not use IPTS)

Yes, I have the same issue on my Surface 3.

On dmesg, these messages will be printed repeatedly:

kern  :err   : [  +0.203592] Surface3-spi spi-MSHW0037:00: SPI transfer timed out
kern  :err   : [  +0.000173] spi_master spi1: failed to transfer one message from queue

@sr258 Reloading spi_pxa2xx_platform will temporarily fix this issue (after touch crash):

sudo modprobe -r spi_pxa2xx_platform
sudo modprobe spi_pxa2xx_platform
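
To check whether the touchscreen has actually hit this state before reloading, a small helper can count the timeout messages in the kernel log. This is only a sketch: the function names are made up for illustration, and the match pattern is taken from the dmesg lines quoted above.

```sh
# Count Surface 3 touchscreen SPI timeouts in kernel log text read from stdin.
# Function names are illustrative, not part of any existing tool.
count_spi_timeouts() {
    # grep -c prints 0 and exits non-zero when nothing matches; ignore that.
    grep -c 'Surface3-spi .*SPI transfer timed out' || true
}

# Reload the SPI controller driver if the log on stdin shows a timeout.
# Must run as root because of modprobe.
reload_spi_if_timed_out() {
    if [ "$(count_spi_timeouts)" -gt 0 ]; then
        modprobe -r spi_pxa2xx_platform && modprobe spi_pxa2xx_platform
    fi
}
```

For example, `dmesg | count_spi_timeouts` prints the number of timeouts seen so far. Note that dmesg keeps old messages, so `dmesg | reload_spi_if_timed_out` will reload on every call after the first crash unless the log has been cleared in between.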

It seems that DMA is somehow related. Interestingly, this issue does not happen at all on Chromium OS based systems (kernel chromeos-4.19).

Enable debug output:

sudo su -c 'echo "file drivers/spi/spi-pxa2xx.c +p" > /sys/kernel/debug/dynamic_debug/control'
sudo su -c 'echo "file drivers/input/touchscreen/surface3_spi.c +p" > /sys/kernel/debug/dynamic_debug/control'
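
The same control file also accepts `-p` to turn the messages off again. A minimal sketch of a helper that builds either query string (the function name `dyndbg_query` is hypothetical; the `file ... +p` / `file ... -p` syntax is the kernel's dynamic debug query format):

```sh
# Print a dynamic debug query that enables or disables debug messages
# from one kernel source file. Hypothetical helper, for illustration only.
dyndbg_query() {
    # $1: source path as used in the commands above, $2: "on" or "off"
    case "$2" in
        on)  printf 'file %s +p\n' "$1" ;;
        off) printf 'file %s -p\n' "$1" ;;
        *)   echo "usage: dyndbg_query FILE on|off" >&2; return 1 ;;
    esac
}
```

Usage, assuming debugfs is mounted at /sys/kernel/debug: `dyndbg_query drivers/spi/spi-pxa2xx.c off | sudo tee /sys/kernel/debug/dynamic_debug/control`.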

On chromeos-4.19 kernel, it uses PIO:

kern  :debug : [  +0.009260] Surface3-spi spi-MSHW0037:00: 7692307 Hz actual, PIO
kern  :debug : [  +0.001105] Surface3-spi spi-MSHW0037:00: surface3_spi_irq_handler received -> ff ff ff ff a5 5a e7 7e 01 d2 00 80 01 03 03 24 00 e4 01 00 58 0b 58 0b 83 12 83 12 26 01 95 01 00 00 00 00 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

On the other hand, on Linux 4.19/5.2, it uses DMA:

kern  :debug : [  +0.006383] Surface3-spi spi-MSHW0037:00: 7692307 Hz actual, DMA
kern  :debug : [  +0.000495] Surface3-spi spi-MSHW0037:00: surface3_spi_irq_handler received -> ff ff ff ff a5 5a e7 7e 01 d2 00 80 01 03 03 18 00 e4 01 00 04 1a 04 1a e3 0c e3 0c b0 00 c5 00 00 00 00 00 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

So, using PIO mode could be the workaround for this issue, but I don't know how to do it.
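
To tell quickly which mode a given kernel is using, the debug output can be filtered for the "Hz actual, PIO/DMA" lines shown above. This is a rough sketch that assumes the message format stays exactly as in those logs; the function names are made up.

```sh
# Print the transfer mode (PIO or DMA) of every Surface3-spi transfer found
# in kernel log text on stdin, one per line, in order of appearance.
spi_transfer_modes() {
    sed -nE 's/.*Surface3-spi .* Hz actual, (PIO|DMA)[[:space:]]*$/\1/p'
}

# Print only the most recent mode seen in the log on stdin.
last_transfer_mode() {
    spi_transfer_modes | tail -n 1
}
```

With debug output enabled, `dmesg | last_transfer_mode` prints the mode of the latest transfer, and prints nothing if no matching line is in the log.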

Link (my memo): https://github.com/kitakar5525/note-linux-on-surface-3#touchscreen-is-not-stable

sr258 commented 4 years ago

I've experimented a bit further and it looks like using the kernel provided by jakeday's repo (5.1.15-surface-linux-surface) makes touch work again! Sadly, the kernel is not up-to-date then, but it's better than a broken touchscreen!

orychalk commented 4 years ago

Hello,

I have the same experience. It works great with 5.1.15, but not with other versions.

NB: The only things that don't work with 5.1.15 after hibernate are the backlight and halt/reboot.

kitakar5525 commented 4 years ago

Interestingly, I do have this issue on the stock Arch Linux 5.1.15 kernel (5.1.15-arch1-1-ARCH). So the cause of this issue might be the kernel config. I'll try a kernel built with the jakeday 5.1.15 config later.

kitakar5525 commented 4 years ago

I built Arch Linux 5.1.15 kernel with the jakeday 5.1 config and touch input is working even after suspend. I looked into the debug output and it uses PIO mode instead of DMA mode:

kern  :debug : [  328.623731] Surface3-spi spi-MSHW0037:00: 7692307 Hz actual, PIO
kern  :debug : [  328.624800] Surface3-spi spi-MSHW0037:00: surface3_spi_irq_handler received -> ff ff ff ff a5 5a e7 7e 01 d2 00 80 01 03 03 40 00 e4 02 00 b8 1a b8 1a 71 18 71 18 83 01 e9 01 00 00 00 00 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

So, the difference between DMA and PIO is possibly caused by the kernel config. I'll look into the difference later.
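
One way to narrow down such a difference is to compare only the relevant options of the two config files. The sketch below is an assumption-heavy starting point: the function name is made up, and the default filter (options whose names contain SPI, DMA or LPSS) is just a guess at what might matter, not a confirmed list.

```sh
# Print CONFIG_ lines matching a pattern that differ between two kernel
# config files. "<" marks lines from the first file, ">" from the second.
# Hypothetical helper; the default pattern is a guess.
config_diff() (
    # $1, $2: config files; $3: extended regex of option-name fragments
    pattern="${3:-SPI|DMA|LPSS}"
    tmp=$(mktemp -d) || exit 1
    # Also catches "# CONFIG_FOO is not set" lines.
    grep -E "CONFIG_[A-Z0-9_]*($pattern)" "$1" | sort > "$tmp/a"
    grep -E "CONFIG_[A-Z0-9_]*($pattern)" "$2" | sort > "$tmp/b"
    # Keep only the changed lines, drop diff's hunk markers.
    diff "$tmp/a" "$tmp/b" | grep '^[<>]' || true
    rm -rf "$tmp"
)
```

For example, `config_diff config-arch config-jakeday` (both file names are placeholders) lists the options that are set differently in the two builds.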

orychalk commented 4 years ago

Interesting indeed.

Thanks

kitakar5525 commented 4 years ago

Small update.

Finding which kernel config options determine whether DMA or PIO is used is difficult. In any case, I think we should rather fix this issue on the touchscreen driver side.

So, I'll try to disable DMA from the driver side for now. At least the following change seems to disable DMA and use PIO instead:

diff --git a/drivers/input/touchscreen/surface3_spi.c b/drivers/input/touchscreen/surface3_spi.c
index ce4828b14..05400181a 100644
--- a/drivers/input/touchscreen/surface3_spi.c
+++ b/drivers/input/touchscreen/surface3_spi.c
@@ -334,6 +334,9 @@ static int surface3_spi_probe(struct spi_device *spi)
        /* Set up SPI*/
        spi->bits_per_word = 8;
        spi->mode = SPI_MODE_0;
+       // Disable DMA
+       pr_alert("DEBUG: disabling DMA...\n");
+       spi->controller->can_dma = false;
        error = spi_setup(spi);
        if (error)
                return error;

kitakar5525 commented 4 years ago

Update: added parameter to switch DMA/PIO:

surface3-spi: add parameter to disable DMA

```diff
From 08821ab4a4bab74eb42cc949b6a65ec4e819327a Mon Sep 17 00:00:00 2001
From: kitakar5525 <34676735+kitakar5525@users.noreply.github.com>
Date: Fri, 6 Dec 2019 23:10:30 +0900
Subject: [PATCH] surface3-spi: add parameter to disable DMA

---
 drivers/input/touchscreen/surface3_spi.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drivers/input/touchscreen/surface3_spi.c b/drivers/input/touchscreen/surface3_spi.c
index ce4828b14..a04519a9b 100644
--- a/drivers/input/touchscreen/surface3_spi.c
+++ b/drivers/input/touchscreen/surface3_spi.c
@@ -25,6 +25,12 @@
 #define SURFACE3_REPORT_TOUCH 0xd2
 #define SURFACE3_REPORT_PEN 0x16
 
+bool use_dma = false;
+module_param(use_dma, bool, 0644);
+MODULE_PARM_DESC(use_dma,
+                "Disable DMA if you encounter touch input crash. "
+                "(default: false, disabled to avoid crash)");
+
 struct surface3_ts_data {
        struct spi_device *spi;
        struct gpio_desc *gpiod_rst[2];
@@ -326,6 +332,14 @@ static int surface3_spi_create_pen_input(struct surface3_ts_data *data)
        return 0;
 }
 
+static bool surface3_spi_can_dma(struct spi_controller *ctlr,
+                                struct spi_device *spi,
+                                struct spi_transfer *tfr)
+{
+       dev_dbg(&spi->dev, "DEBUG: use_dma = %s\n", use_dma ? "true" : "false");
+       return use_dma;
+}
+
 static int surface3_spi_probe(struct spi_device *spi)
 {
        struct surface3_ts_data *data;
@@ -368,6 +382,12 @@ static int surface3_spi_probe(struct spi_device *spi)
        if (error)
                return error;
 
+       /*
+        * Set up DMA
+        * Currently, DMA seems to be broken.
+        */
+       spi->controller->can_dma = surface3_spi_can_dma;
+
        return 0;
 }
 
-- 
2.24.0
```

With the patch, DMA is disabled and the driver uses PIO by default. You can still switch between DMA and PIO if you want:

# switch to DMA mode (may be broken)
echo 1 | sudo tee /sys/module/surface3_spi/parameters/use_dma
# back to PIO mode
echo 0 | sudo tee /sys/module/surface3_spi/parameters/use_dma
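
To see which mode is currently selected, the same parameter file can be read back. A minimal sketch, assuming the patched module is loaded; the function names are made up, and the Y/N mapping relies on the kernel printing bool module parameters as `Y` or `N`.

```sh
# Translate the value of the use_dma parameter into a transfer mode name.
# Illustrative helper, not part of the patch.
use_dma_to_mode() {
    case "$1" in
        Y|y|1) echo DMA ;;
        N|n|0) echo PIO ;;
        *)     echo unknown ;;
    esac
}

# Report the mode currently selected. The optional argument overrides the
# parameter path used in the commands above.
current_touch_mode() {
    param="${1:-/sys/module/surface3_spi/parameters/use_dma}"
    if [ -r "$param" ]; then
        use_dma_to_mode "$(cat "$param")"
    else
        echo "parameter not found (is the patched module loaded?)" >&2
        return 1
    fi
}
```

Running `current_touch_mode` should print `PIO` with the patch defaults. To keep a non-default value across reboots, the usual mechanism would be an `options surface3_spi use_dma=1` line in a file under /etc/modprobe.d/ (not tested here).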

I think it's at least "usable" with PIO mode. Now, we should really fix DMA mode next but I feel it's beyond my capability… At least I'm not so sad to switch to PIO mode with the patch…

References:

Touchscreen driver for Surface 3:

PXA2xx SSP SPI Controller which surface3_spi uses:

Maybe this commit made surface3_spi use DMA by default. Not sure.

Referred to "can_dma" part:

orychalk commented 4 years ago

Hello,

How can I apply this DMA/PIO patch on Manjaro? Do I have to compile the kernel?

Thanks

kitakar5525 commented 4 years ago

Sorry for taking so long. I feel the patch is still not clean enough to be merged into qzed/linux-surface for everyone. Hmm… I'll try to ask qzed.

orychalk commented 4 years ago

OK, thanks. I'll wait.

kitakar5525 commented 4 years ago

Made a PR here: https://github.com/qzed/linux-surface-kernel/pull/24

kitakar5525 commented 4 years ago

The patch has been merged into the 5.3/5.4/4.19 kernels, and prebuilt kernels with the patch have been released here: https://github.com/qzed/linux-surface/releases

Hopefully, the touch input crashes can be avoided now.

orychalk commented 4 years ago

It works. Many thanks :)

kitakar5525 commented 4 years ago

Thank you for your feedback!

hadess commented 3 years ago

Should this be closed, like https://github.com/linux-surface/linux-surface/issues/76#issuecomment-757121281 ?

PaweX commented 2 years ago

I have just installed Arch Linux on my S3 with the newest surface kernel, and I have this issue. The touchscreen crashes at random moments; the last time, for example, was when I switched off the on-screen keyboard. Pressing the power button to log out and log in again makes the touchscreen work again, but only temporarily, until the next crash. It's very frustrating to use the Surface this way. How can this be fixed in the newest kernel? Does anyone still have this issue?

Vaasis commented 1 year ago

> I have just installed Arch Linux on my S3 with the newest surface kernel, and I have this issue. The touchscreen crashes at random moments; the last time, for example, was when I switched off the on-screen keyboard. Pressing the power button to log out and log in again makes the touchscreen work again, but only temporarily, until the next crash. It's very frustrating to use the Surface this way. How can this be fixed in the newest kernel? Does anyone still have this issue?

I'm having the same issue running Fedora 36. Is there a way to apply the surface3-spi patch in the current kernel version, or perhaps an alternate method to switch between DMA/PIO for the touchscreen driver?

I'm not sure what causes the issue in the first place, but considering I'm on the lowest-specced S3 (64GB eMMC and 2GB RAM), maybe the swap needs to be bigger, or GNOME with Wayland might be using too many resources?

I also tried ChromeOS Flex. It didn't show any touchscreen issues, though it did have other problems, such as being stuck at 100% brightness, no auto-rotation, and so on.