Open Cypresslin opened 4 days ago
I think this is a kernel bug in the DRI driver. I've narrowed it down to /dev/dri/card1; the following reproduces the crash for me on the kernel-testing--linux-oem-6.10 kernel:
while true; do sudo ./stress-ng --dev 4 --dev-file /dev/dri/card1 -t 5; done
From my observations, it occurs when the devices are being closed.
Can trip it using a single stressor instance too:
while true; do sudo ./stress-ng --dev 1 --dev-file /dev/dri/card1 -t 5; done
Looks like a race in /dev/dri/card1 open/close. Here is a very simple reproducer, run as root:
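#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    /* After fork(), parent and child both run the loop below on the same device. */
    pid_t pid = fork();

    (void)pid; /* the return value is not needed; both processes do the same work */
    while (1) {
        int fd;

        fd = openat(AT_FDCWD, "/dev/dri/card1", O_WRONLY|O_NONBLOCK|O_SYNC);
        close(fd);
    }
}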
Definitely a kernel bug :-(
OK thanks for the investigation, I will give it a try with the mainline kernel.
Yes, I can reproduce this issue with 6.10.0-061000rc4-generic (there are no debs for v6.10-rc6 amd64).
If you have an upstream kernel bug number or a Launchpad bug # for this, please add it to this bug report so we can keep things tracked.
Oh, I do have a Launchpad bug report: https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/2071756
Tested on a vanilla 6.10-rc6 kernel, reported upstream: https://bugzilla.kernel.org/show_bug.cgi?id=219007
Hi Colin, I found that the dev stressor smoke test will kill a Noble OEM 6.10 VM.
Steps:
Test output:
This issue can be reproduced with V0.17.08 as well, and the test passes on bare metal with the same kernel.