NVIDIA / open-gpu-kernel-modules

NVIDIA Linux open GPU kernel module source

Computer crashes when playing games (screen freeze and reboot) #358

Closed noahboegli closed 2 years ago

noahboegli commented 2 years ago

NVIDIA Open GPU Kernel Modules Version

515.65.01-9

Does this happen with the proprietary driver (of the same version) as well?

Yes

Operating System and Version

Arch Linux

Kernel Release

5.19.5-arch1-1

Hardware: GPU

GPU 0: NVIDIA GeForce RTX 3070 Ti (UUID: GPU-10dd02b9-41d1-a7bb-1fa1-982e157cd5e5)

Describe the bug

When playing games (such as X-Plane, Prepar3D or even Minecraft), the display freezes for a short period of time and the computer reboots.

The issue also happens on Windows with the latest driver release.

I am unable to reproduce the crash when stress-testing the card. It only happens when playing games.

There are no overheating issues (room is 25 °C, CPU 60 °C and GPU below 70 °C).

The issue also happened with my GTX 1070, which I replaced because I thought it was dying of age.

There are no log entries in the syslog.

To Reproduce

Bug Incidence

Always

nvidia-bug-report.log.gz

More Info

It is not a PSU problem: I replaced the PSU a week ago, thinking it was the cause, and the crashes continued.

balenamiaa commented 2 years ago

Why was this closed @noahboegli?

noahboegli commented 2 years ago

@balenamiaa I closed it because it was not linked to the GPU (hardware or software). I should have added the explanation, but I closed it on my phone and forgot to come back and add a note, so here it is:

It was in fact due to an issue with the PCIe lane speed on my motherboard (an ASUS ROG STRIX Z490-E GAMING). The latest firmware version fixed it.

I bought a new GPU and a new PSU, but all of the issues were solved by a (free) BIOS flash that took 10 minutes.

It's amazing because I've googled and read literally hundreds of pages on the issue over 2 weeks, only to find one that mentioned the motherboard firmware right after I updated the firmware out of pure despair.
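
For anyone chasing similar symptoms, the negotiated PCIe link can be inspected on Linux before buying new hardware. The sketch below is only an illustration, not the diagnosis used in this thread: it assumes a kernel that exposes the `current_link_speed`, `max_link_speed`, `current_link_width` and `max_link_width` attributes under `/sys/bus/pci/devices`, and the helper names `parse_speed`, `read_attr`, `link_status` and `main` are hypothetical.

```python
from pathlib import Path

SYSFS_PCI = Path("/sys/bus/pci/devices")
ATTRS = ("current_link_speed", "max_link_speed",
         "current_link_width", "max_link_width")


def parse_speed(text):
    """Return the GT/s figure from a sysfs string such as '8.0 GT/s PCIe'."""
    return float(text.split()[0])


def read_attr(dev, name):
    """Read one sysfs attribute, or return None if it is missing/unreadable."""
    try:
        return (dev / name).read_text().strip()
    except OSError:
        return None


def link_status(dev):
    """Compare current and maximum link speed/width for one device directory.

    Returns None when the attributes are absent or cannot be parsed.
    """
    raw = {name: read_attr(dev, name) for name in ATTRS}
    if any(value is None for value in raw.values()):
        return None
    try:
        status = {
            "current_speed": parse_speed(raw["current_link_speed"]),
            "max_speed": parse_speed(raw["max_link_speed"]),
            "current_width": int(raw["current_link_width"]),
            "max_width": int(raw["max_link_width"]),
        }
    except (ValueError, IndexError):
        return None  # e.g. the kernel reported an unknown speed
    status["degraded"] = (
        status["current_speed"] < status["max_speed"]
        or status["current_width"] < status["max_width"]
    )
    return status


def main():
    if not SYSFS_PCI.is_dir():
        print("PCI sysfs tree not found; is this a Linux system?")
        return
    for dev in sorted(SYSFS_PCI.iterdir()):
        # PCI class codes starting with 0x03 are display controllers.
        if not (read_attr(dev, "class") or "").startswith("0x03"):
            continue
        print(dev.name, link_status(dev))


if __name__ == "__main__":
    main()
```

A lower current speed is not necessarily a fault: GPUs commonly drop the link speed at idle to save power, so the values are more meaningful when read while a game or benchmark is running. A link that stays below its maximum speed or width under load may point at the slot, the riser or the motherboard firmware rather than the GPU itself.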

balenamiaa commented 2 years ago

Thanks, your explanation is very useful!