Open mashu opened 1 month ago
Due to this, it seems that X11/Xwayland has stopped working on my end. I cannot even launch any Proton games on Xwayland.
Same here on the 6.11 kernel when I try to enter sleep mode on Wayland (Ubuntu 24.10 beta):
2024-09-20T22:59:50.819111+02:00 bernard-desktop kernel: CPU: 27 UID: 0 PID: 15484 Comm: nvidia-sleep.sh Kdump: loaded Tainted: G OE 6.11.0-7-generic #7-Ubuntu
2024-09-20T22:59:50.819112+02:00 bernard-desktop kernel: Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
2024-09-20T22:59:50.819112+02:00 bernard-desktop kernel: Hardware name: ASUS System Product Name/ROG CROSSHAIR VIII DARK HERO, BIOS 3801 07/30/2021
2024-09-20T22:59:50.819113+02:00 bernard-desktop kernel: RIP: 0010:follow_pte+0x1d7/0x200
2024-09-20T22:59:50.819113+02:00 bernard-desktop kernel: Code: 48 81 e2 00 00 00 c0 48 09 c2 48 f7 d2 48 85 fa 75 30 e8 1c e4 ff ff 48 8b 15 d5 28 92 01 48 81 e2 00 00 00 c0 e9 7b ff ff ff <0f> 0b e9 56 fe ff ff 48 8b 45 d0 48 8b 38 e8 46 03 e9 00 e8 31 be
2024-09-20T22:59:50.819113+02:00 bernard-desktop kernel: RSP: 0018:ffffb0bb0708f770 EFLAGS: 00010246
2024-09-20T22:59:50.819114+02:00 bernard-desktop kernel: RAX: 0000000000000000 RBX: 0000713de4a06000 RCX: ffffb0bb0708f7c0
2024-09-20T22:59:50.819114+02:00 bernard-desktop kernel: RDX: ffffb0bb0708f7b8 RSI: 0000713de4a06000 RDI: ffff9077da98a398
2024-09-20T22:59:50.819115+02:00 bernard-desktop kernel: RBP: ffffb0bb0708f7a8 R08: ffffb0bb0708f978 R09: 0000000000000000
2024-09-20T22:59:50.819115+02:00 bernard-desktop kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffffb0bb0708f808
2024-09-20T22:59:50.819115+02:00 bernard-desktop kernel: R13: 0000000000000000 R14: ffffb0bb0708f7b8 R15: ffff9077d24c9080
2024-09-20T22:59:50.819116+02:00 bernard-desktop kernel: FS: 00007d42cce13740(0000) GS:ffff907ecef80000(0000) knlGS:0000000000000000
2024-09-20T22:59:50.819116+02:00 bernard-desktop kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2024-09-20T22:59:50.819116+02:00 bernard-desktop kernel: CR2: 0000000086d79000 CR3: 0000000110136000 CR4: 0000000000f50ef0
2024-09-20T22:59:50.819117+02:00 bernard-desktop kernel: PKRU: 55555554
2024-09-20T22:59:50.819117+02:00 bernard-desktop kernel: Call Trace:
2024-09-20T22:59:50.819125+02:00 bernard-desktop kernel: <TASK>
2024-09-20T22:59:50.819125+02:00 bernard-desktop kernel: ? srso_alias_return_thunk+0x5/0xfbef5
2024-09-20T22:59:50.819126+02:00 bernard-desktop kernel: ? show_trace_log_lvl+0x273/0x310
2024-09-20T22:59:50.819126+02:00 bernard-desktop kernel: ? show_trace_log_lvl+0x273/0x310
2024-09-20T22:59:50.819128+02:00 bernard-desktop kernel: ? follow_phys+0x4c/0x110
2024-09-20T22:59:50.819129+02:00 bernard-desktop kernel: ? show_regs.part.0+0x22/0x30
2024-09-20T22:59:50.819129+02:00 bernard-desktop kernel: ? show_regs.cold+0x8/0x10
2024-09-20T22:59:50.819129+02:00 bernard-desktop kernel: ? follow_pte+0x1d7/0x200
2024-09-20T22:59:50.819130+02:00 bernard-desktop kernel: ? __warn.cold+0xa7/0x101
2024-09-20T22:59:50.819130+02:00 bernard-desktop kernel: ? follow_pte+0x1d7/0x200
2024-09-20T22:59:50.819130+02:00 bernard-desktop kernel: ? report_bug+0x114/0x160
2024-09-20T22:59:50.819131+02:00 bernard-desktop kernel: ? handle_bug+0x51/0xa0
2024-09-20T22:59:50.819131+02:00 bernard-desktop kernel: ? exc_invalid_op+0x18/0x80
2024-09-20T22:59:50.819131+02:00 bernard-desktop kernel: ? asm_exc_invalid_op+0x1b/0x20
2024-09-20T22:59:50.819132+02:00 bernard-desktop kernel: ? follow_pte+0x1d7/0x200
2024-09-20T22:59:50.819132+02:00 bernard-desktop kernel: follow_phys+0x4c/0x110
2024-09-20T22:59:50.819132+02:00 bernard-desktop kernel: untrack_pfn+0x55/0x130
2024-09-20T22:59:50.819132+02:00 bernard-desktop kernel: unmap_single_vma+0xbc/0xf0
2024-09-20T22:59:50.819133+02:00 bernard-desktop kernel: zap_page_range_single+0x138/0x210
2024-09-20T22:59:50.819133+02:00 bernard-desktop kernel: unmap_mapping_range+0x119/0x140
2024-09-20T22:59:50.819133+02:00 bernard-desktop kernel: nv_revoke_gpu_mappings_locked+0x46/0x80 [nvidia]
2024-09-20T22:59:50.819134+02:00 bernard-desktop kernel: nv_set_system_power_state+0x1d6/0x480 [nvidia]
2024-09-20T22:59:50.819134+02:00 bernard-desktop kernel: nv_procfs_write_suspend+0x102/0x1b0 [nvidia]
2024-09-20T22:59:50.819134+02:00 bernard-desktop kernel: proc_reg_write+0x6c/0xb0
2024-09-20T22:59:50.819135+02:00 bernard-desktop kernel: vfs_write+0x107/0x490
2024-09-20T22:59:50.819135+02:00 bernard-desktop kernel: ? srso_alias_return_thunk+0x5/0xfbef5
2024-09-20T22:59:50.819135+02:00 bernard-desktop kernel: ksys_write+0x71/0x100
2024-09-20T22:59:50.819136+02:00 bernard-desktop kernel: __x64_sys_write+0x19/0x30
2024-09-20T22:59:50.819136+02:00 bernard-desktop kernel: x64_sys_call+0x7e/0x22b0
2024-09-20T22:59:50.819136+02:00 bernard-desktop kernel: do_syscall_64+0x7e/0x170
2024-09-20T22:59:50.819136+02:00 bernard-desktop kernel: ? srso_alias_return_thunk+0x5/0xfbef5
2024-09-20T22:59:50.819137+02:00 bernard-desktop kernel: ? __do_sys_newfstat+0x76/0x80
2024-09-20T22:59:50.819159+02:00 bernard-desktop kernel: ? srso_alias_return_thunk+0x5/0xfbef5
2024-09-20T22:59:50.819160+02:00 bernard-desktop kernel: ? syscall_exit_to_user_mode+0x4e/0x250
2024-09-20T22:59:50.819160+02:00 bernard-desktop kernel: ? srso_alias_return_thunk+0x5/0xfbef5
2024-09-20T22:59:50.819160+02:00 bernard-desktop kernel: ? do_syscall_64+0x8a/0x170
2024-09-20T22:59:50.819160+02:00 bernard-desktop kernel: ? srso_alias_return_thunk+0x5/0xfbef5
2024-09-20T22:59:50.819161+02:00 bernard-desktop kernel: ? filp_flush+0x57/0x90
2024-09-20T22:59:50.819161+02:00 bernard-desktop kernel: ? srso_alias_return_thunk+0x5/0xfbef5
2024-09-20T22:59:50.819162+02:00 bernard-desktop kernel: ? srso_alias_return_thunk+0x5/0xfbef5
2024-09-20T22:59:50.819162+02:00 bernard-desktop kernel: ? syscall_exit_to_user_mode+0x4e/0x250
2024-09-20T22:59:50.819163+02:00 bernard-desktop kernel: ? srso_alias_return_thunk+0x5/0xfbef5
2024-09-20T22:59:50.819163+02:00 bernard-desktop kernel: ? do_syscall_64+0x8a/0x170
2024-09-20T22:59:50.819163+02:00 bernard-desktop kernel: ? srso_alias_return_thunk+0x5/0xfbef5
2024-09-20T22:59:50.819169+02:00 bernard-desktop kernel: ? irqentry_exit_to_user_mode+0x43/0x250
2024-09-20T22:59:50.819170+02:00 bernard-desktop kernel: ? srso_alias_return_thunk+0x5/0xfbef5
2024-09-20T22:59:50.819170+02:00 bernard-desktop kernel: ? irqentry_exit+0x43/0x50
2024-09-20T22:59:50.819170+02:00 bernard-desktop kernel: ? srso_alias_return_thunk+0x5/0xfbef5
2024-09-20T22:59:50.819171+02:00 bernard-desktop kernel: ? exc_page_fault+0x96/0x1c0
2024-09-20T22:59:50.819171+02:00 bernard-desktop kernel: entry_SYSCALL_64_after_hwframe+0x76/0x7e
2024-09-20T22:59:50.819171+02:00 bernard-desktop kernel: RIP: 0033:0x7d42ccb26274
2024-09-20T22:59:50.819172+02:00 bernard-desktop kernel: Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d f5 2d 0f 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89
2024-09-20T22:59:50.819172+02:00 bernard-desktop kernel: RSP: 002b:00007ffef22725d8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
2024-09-20T22:59:50.819172+02:00 bernard-desktop kernel: RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00007d42ccb26274
2024-09-20T22:59:50.819173+02:00 bernard-desktop kernel: RDX: 0000000000000008 RSI: 00005f6c7f01e520 RDI: 0000000000000001
2024-09-20T22:59:50.819173+02:00 bernard-desktop kernel: RBP: 00007ffef2272600 R08: 0000000000000000 R09: 0000000000000001
2024-09-20T22:59:50.819174+02:00 bernard-desktop kernel: R10: 00005f6c7f01e510 R11: 0000000000000202 R12: 0000000000000008
2024-09-20T22:59:50.819174+02:00 bernard-desktop kernel: R13: 00005f6c7f01e520 R14: 00007d42ccc125c0 R15: 00007d42ccc0fea0
2024-09-20T22:59:50.819179+02:00 bernard-desktop kernel: </TASK>
2024-09-20T22:59:50.819179+02:00 bernard-desktop kernel: ---[ end trace 0000000000000000 ]---
2024-09-20T22:59:50.819179+02:00 bernard-desktop kernel: ------------[ cut here ]------------
nvidia-smi:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3080        Off |   00000000:0B:00.0  On |                  N/A |
|  0%   37C    P8             18W /  320W |     535MiB /  10240MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
This has been known for months: see #662
No need to create dupes.
Though actually it may help NVIDIA prioritize fixing this bug because it's annoying as hell.
I got 40KB worth of back traces on every suspend, and now I simply power off the PC entirely, since I got fed up with this.
Same here. 560.35.03 and earlier. Arch Linux, GeForce RTX 3050 Ti Laptop.
Good to know, following #662 :)
It's important to keep issues distinct and avoid mislabeling them as duplicates without clear evidence. If you're experiencing suspend-related problems, it would be best to discuss those in a thread specifically addressing that issue, rather than here.
This bug report has a totally different stack trace signature from #662, and the original report didn't mention any suspend-related issues.
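One way to compare stack trace signatures between two reports is to reduce each trace to its reliable frames. The following is a minimal sketch in Python, not anything from the driver or the kernel tooling: the function name `trace_signature` and the regular expression are hypothetical, and it assumes the journal-style log format pasted earlier in this thread, where frames prefixed with `?` are the ones the kernel unwinder marks as unreliable.

```python
import re

# Matches the symbol part of a call-trace line such as
# "kernel: follow_phys+0x4c/0x110" or "kernel: ? follow_pte+0x1d7/0x200".
# Group 1 is the optional "? " marker, group 2 is the symbol name.
FRAME_RE = re.compile(
    r"kernel:\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def trace_signature(log_text):
    """Return the non-'?' frame symbols between 'Call Trace:' and '</TASK>'."""
    frames = []
    in_trace = False
    for line in log_text.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if "</TASK>" in line:
            in_trace = False
            continue
        if not in_trace:
            continue
        match = FRAME_RE.search(line)
        # Keep only frames without the "?" (unreliable) marker.
        if match and match.group(1) is None:
            frames.append(match.group(2))
    return frames


# A few lines copied from the trace posted above (timestamps dropped).
SAMPLE = """\
kernel: Call Trace:
kernel: <TASK>
kernel: ? follow_pte+0x1d7/0x200
kernel: follow_phys+0x4c/0x110
kernel: untrack_pfn+0x55/0x130
kernel: unmap_single_vma+0xbc/0xf0
kernel: nv_revoke_gpu_mappings_locked+0x46/0x80 [nvidia]
kernel: </TASK>
"""

print(trace_signature(SAMPLE))
```

Running the same function over the trace from another report and comparing the two lists gives a rough indication of whether both warnings come from the same code path; it is not proof either way.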
I don't know whether #662 is related, but on the closed-source side this error is already known to Nvidia (see the Nvidia forum).
I have the same problem on Arch Linux, and my workaround was to temporarily use the linux-lts kernel (6.6.52-1-lts).
The problem has been reported on the Arch forums since July (see the Arch Forum).
I hope they fix this soon, because the Nvidia drivers, open or not, basically cannot run on 6.10 without errors.
NVIDIA Open GPU Kernel Modules Version
560.35.03
Please confirm this issue does not happen with the proprietary driver (of the same version). This issue tracker is only for bugs specific to the open kernel driver.
Operating System and Version
Debian GNU/Linux trixie/sid
Kernel Release
6.10.9
Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels.
Hardware: GPU
NVIDIA GeForce RTX 4090 Laptop GPU
Describe the bug
I am getting lots of errors and kernel-taint warnings with stack traces in dmesg with the latest NVIDIA driver 560.28.03-1 and Linux kernel 6.10.3 (for the full log, see nvidia-bug-report.log.gz included in this report) on a Debian GNU/Linux setup.
Short summary:
To Reproduce
Boot the 6.10.9 kernel with the latest official NVIDIA driver and check the dmesg logs.
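Checking the dmesg logs can be partly automated. The sketch below is an illustration only: the helper name `count_rip_hits` is hypothetical, and it assumes the warning appears as a kernel-mode `RIP: 0010:<symbol>+...` line, as in the `follow_pte` trace posted in this thread. Other kernels or warnings may be formatted differently.

```python
def count_rip_hits(dmesg_text, symbol="follow_pte"):
    """Count kernel-mode 'RIP:' lines whose symbol is exactly `symbol`."""
    hits = 0
    for line in dmesg_text.splitlines():
        # "0010" is the code segment seen on the kernel-mode RIP line in
        # the log above; the user-mode RIP line there uses "0033" instead.
        _, sep, rest = line.partition("RIP: 0010:")
        if sep and rest.strip().startswith(symbol + "+"):
            hits += 1
    return hits


# Lines copied from the trace posted above (timestamps dropped).
SAMPLE = """\
kernel: RIP: 0010:follow_pte+0x1d7/0x200
kernel: ? follow_pte+0x1d7/0x200
kernel: RIP: 0033:0x7d42ccb26274
"""

print(count_rip_hits(SAMPLE))
```

To use it on a real system, save the output of `dmesg` to a file after boot and pass the file contents to `count_rip_hits`; a non-zero count suggests the warning was triggered at least once.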
Bug Incidence
Always
nvidia-bug-report.log.gz
The nvidia-bug-report.log.gz above includes this, but I am also pasting it here for convenience.
More Info
No response