linux-automation / meta-lxatac

Build your own LXA TAC images and bundles
MIT License

kernel: Add a patch to fix an NPD when using the UART and a stuck irq #134

Closed ukleinek closed 5 months ago

ukleinek commented 5 months ago

When /dev/ttySTM1 is closed while there are still some characters pending to be sent, the kernel dereferences a NULL pointer, which locks up the system.
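A minimal sketch of how the trigger described above might be exercised from user space: queue some bytes on the port and close it immediately, without waiting for the transmitter to drain. The helper name `write_and_close` and the payload are hypothetical, and whether data really is still pending at close time depends on baud rate, flow control, and driver buffering, so this is not a confirmed reproducer.

```python
import os


def write_and_close(path, payload):
    """Queue payload on path and close right away, without draining first."""
    # O_NOCTTY: do not make the tty our controlling terminal.
    # O_NONBLOCK: do not block in open() waiting for a carrier signal.
    fd = os.open(path, os.O_WRONLY | os.O_NOCTTY | os.O_NONBLOCK)
    try:
        # Returns the number of bytes the kernel accepted; some of them
        # may not have left the UART yet when close() runs below.
        return os.write(fd, payload)
    finally:
        os.close(fd)
```

On the device this could be called as `write_and_close("/dev/ttySTM1", b"U" * 4096)` while watching `dmesg -w` in a second shell.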

Fix that by picking a patch from today's linux-next.

While the commit log claims this indeed fixes the problem noticed by @SmithChart, this isn't tested yet.

hnez commented 5 months ago

This seems to solve part of the problem. The good thing is that the kernel errors seen when closing a /dev/ttySTM* while there is still pending data are gone, and now there is only this message:

$ dmesg -w
…
[   68.147733] stm32-usart 4000f000.serial: Transmission is not complete
…

in its place.

But it gets worse when one tries to use the device again:

$ dmesg -w
…
[   78.434633] panel-mipi-dbi-spi spi2.0: SPI transfer timed out
[   78.439387] spi_master spi2: failed to transfer one message from queue
[   78.450284] spi_master spi2: noqueue transfer failed
[   78.454254] panel-mipi-dbi-spi spi2.0: error -110 when sending command 0x2a
[   78.461289] lmp92064 spi1.0: SPI transfer timed out
…

followed by a watchdog induced reboot.

To me this looks like e.g. lower-priority interrupts no longer being handled.

SmithChart commented 5 months ago

> followed by a watchdog induced reboot.
>
> To me this looks like e.g. lower-priority interrupts no longer being handled.

What @hnez reports still sounds like the behavior I've observed.

ukleinek commented 5 months ago

Closed by mistake, I think by accidentally pushing a wrong rev to my branch. Anyhow, now it should be right and fixed.

hnez commented 5 months ago

I've tested this PR now and it looks like it fixes the issues we had. Nice to see this endeavour come to an end!

Before we merge it, however, #136 should be merged and this PR rebased onto it, so that the version number consistency check does not throw errors.

SmithChart commented 5 months ago

#136 has been merged. Please rebase.

ukleinek commented 5 months ago

I don't know what the version consistency check is about, but I rebased anyhow.

ukleinek commented 5 months ago

(Pushed once more because I got the committer info wrong in the first force-push)

hnez commented 5 months ago

> I don't know what the version consistency check is about, but I rebased anyhow.

It makes sure that we do not generate any images that carry e.g. 24.04 as their version number after the 24.04 version tag has already been set. Once the tag exists, the version in meta-lxatac-software/conf/distro/tacos.conf should be e.g. 24.04+dev instead.
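The idea can be sketched roughly as follows. This is not the actual check used in the repository: it assumes the version is set through a `DISTRO_VERSION = "..."` assignment and that the released versions are available as a plain set of strings. Both are assumptions, and the function names are hypothetical.

```python
import re


def parse_distro_version(conf_text):
    """Extract the version string from the text of a distro conf file.

    Assumes an assignment of the form DISTRO_VERSION = "..." (or ?=).
    """
    match = re.search(
        r'^\s*DISTRO_VERSION\s*\??=\s*"([^"]*)"', conf_text, re.MULTILINE
    )
    if match is None:
        raise ValueError("no DISTRO_VERSION assignment found")
    return match.group(1)


def version_is_consistent(version, released_versions):
    """Return False if the configured version was already released."""
    return version not in released_versions
```

With `released_versions = {"24.04"}`, a conf file that still says `24.04` would be rejected, while `24.04+dev` would pass.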

PS: I accidentally pressed "Close with comment" instead of "Cancel" when I decided that I did not want to add this comment. As a result I added the comment in a half-ready state and closed the PR, neither of which I meant to do. Oh my.