NixOS / nixpkgs


radeon GPU lockup: system stalls #101735

Open thesamet opened 3 years ago

thesamet commented 3 years ago

Describe the bug

My system has started to lock up occasionally. So far this has happened only while playing a browser game in Chrome. The display turns black, and I cannot use the keyboard to switch to another console. I use sway (Wayland) as my window manager.

After rebooting the machine, I ran `journalctl -b -1` (the previous boot) and found the following:

Oct 25 20:56:21 makashonix kernel: radeon 0000:01:00.0: ring 3 stalled for more than 10201msec
Oct 25 20:56:21 makashonix kernel: radeon 0000:01:00.0: GPU lockup (current fence id 0x00000000015e83a5 last fence id 0x00000000015e83a6 o>
Oct 25 20:56:22 makashonix kernel: BUG: unable to handle page fault for address: ffffb7b000d6dffc
Oct 25 20:56:22 makashonix kernel: #PF: supervisor read access in kernel mode

An earlier occurrence, retrieved with `journalctl -b -3`, produced the following:

Oct 25 18:29:19 makashonix kernel: radeon 0000:01:00.0: ring 3 stalled for more than 10383msec
Oct 25 18:29:19 makashonix kernel: radeon 0000:01:00.0: GPU lockup (current fence id 0x000000000881a0f3 last fence id 0x000000000881a0f4 o>
Oct 25 18:29:20 makashonix kernel: BUG: unable to handle page fault for address: ffff9d0d40b00ffc
Oct 25 18:29:20 makashonix kernel: #PF: supervisor read access in kernel mode
Oct 25 18:29:20 makashonix kernel: #PF: error_code(0x0000) - not-present page
Oct 25 18:29:20 makashonix kernel: PGD 81b524067 P4D 81b524067 PUD 0
Oct 25 18:29:20 makashonix kernel: Oops: 0000 [#1] SMP PTI
Oct 25 18:29:20 makashonix kernel: CPU: 3 PID: 29142 Comm: kworker/3:1H Not tainted 5.4.72 #1-NixOS
Oct 25 18:29:20 makashonix kernel: Hardware name: Dell Inc. XPS 8700/0KWVT8, BIOS A00 03/25/2013
Oct 25 18:29:20 makashonix kernel: Workqueue: radeon-crtc radeon_flip_work_func [radeon]
Oct 25 18:29:20 makashonix kernel: RIP: 0010:radeon_ring_backup+0xc0/0x140 [radeon]
Oct 25 18:29:20 makashonix kernel: Code: f8 49 89 06 48 85 c0 74 7b 41 8d 7c 24 ff 31 d2 48 c1 e7 02 eb 07 49 8b 06 48 83 c2 04 48 8b 75 0>
Oct 25 18:29:20 makashonix kernel: RSP: 0018:ffff9d0941307d48 EFLAGS: 00010202
Oct 25 18:29:20 makashonix kernel: RAX: ffff8d4c3e000000 RBX: 00000000ffffffff RCX: 0000000000000000
Oct 25 18:29:20 makashonix kernel: RDX: 0000000000000000 RSI: ffff9d0d40b00ffc RDI: 00000000000a3a40
Oct 25 18:29:20 makashonix kernel: RBP: ffff8d50937f94c8 R08: 000000000002e527 R09: fffff9d08ef84000
Oct 25 18:29:20 makashonix kernel: R10: 000000000002e520 R11: 0000000000000000 R12: 0000000000028e91
Oct 25 18:29:20 makashonix kernel: R13: ffff8d50937f94a8 R14: ffff9d0941307da8 R15: ffff8d50937f8000
Oct 25 18:29:20 makashonix kernel: FS:  0000000000000000(0000) GS:ffff8d509eac0000(0000) knlGS:0000000000000000
Oct 25 18:29:20 makashonix kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 25 18:29:20 makashonix kernel: CR2: ffff9d0d40b00ffc CR3: 0000000518a0a003 CR4: 00000000001606e0
Oct 25 18:29:20 makashonix kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Oct 25 18:29:20 makashonix kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Oct 25 18:29:20 makashonix kernel: Call Trace:
Oct 25 18:29:20 makashonix kernel:  radeon_gpu_reset+0xb3/0x2e0 [radeon]
Oct 25 18:29:20 makashonix kernel:  ? radeon_fence_wait_timeout+0x5d/0xc0 [radeon]
Oct 25 18:29:20 makashonix kernel:  radeon_flip_work_func+0x1e0/0x230 [radeon]
Oct 25 18:29:20 makashonix kernel:  process_one_work+0x1eb/0x390
Oct 25 18:29:20 makashonix kernel:  worker_thread+0x4d/0x3f0
Oct 25 18:29:20 makashonix kernel:  kthread+0xfb/0x130
Oct 25 18:29:20 makashonix kernel:  ? process_one_work+0x390/0x390
Oct 25 18:29:20 makashonix kernel:  ? kthread_park+0x90/0x90
Oct 25 18:29:20 makashonix kernel:  ret_from_fork+0x35/0x40
Oct 25 18:29:20 makashonix kernel: Modules linked in: af_packet msr 8021q intel_rapl_msr intel_rapl_common radeon x86_pkg_temp_thermal int>
Oct 25 18:29:20 makashonix kernel:  nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip6t_rpfilter ipt_rpfilter ip6table_raw iptable_raw xt_pkttype>
Oct 25 18:29:20 makashonix kernel: CR2: ffff9d0d40b00ffc
Oct 25 18:29:20 makashonix kernel: ---[ end trace b6890888540a51c4 ]---
Oct 25 18:29:20 makashonix kernel: RIP: 0010:radeon_ring_backup+0xc0/0x140 [radeon]
Oct 25 18:29:20 makashonix kernel: Code: f8 49 89 06 48 85 c0 74 7b 41 8d 7c 24 ff 31 d2 48 c1 e7 02 eb 07 49 8b 06 48 83 c2 04 48 8b 75 0>
Oct 25 18:29:20 makashonix kernel: RSP: 0018:ffff9d0941307d48 EFLAGS: 00010202
Oct 25 18:29:20 makashonix kernel: RAX: ffff8d4c3e000000 RBX: 00000000ffffffff RCX: 0000000000000000
Oct 25 18:29:20 makashonix kernel: RDX: 0000000000000000 RSI: ffff9d0d40b00ffc RDI: 00000000000a3a40
Oct 25 18:29:20 makashonix kernel: RBP: ffff8d50937f94c8 R08: 000000000002e527 R09: fffff9d08ef84000
Oct 25 18:29:20 makashonix kernel: R10: 000000000002e520 R11: 0000000000000000 R12: 0000000000028e91
Oct 25 18:29:20 makashonix kernel: R13: ffff8d50937f94a8 R14: ffff9d0941307da8 R15: ffff8d50937f8000
Oct 25 18:29:20 makashonix kernel: FS:  0000000000000000(0000) GS:ffff8d509eac0000(0000) knlGS:0000000000000000
Oct 25 18:29:20 makashonix kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 25 18:29:20 makashonix kernel: CR2: ffff9d0d40b00ffc CR3: 00000004b9066006 CR4: 00000000001606e0
Oct 25 18:29:20 makashonix kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Oct 25 18:29:20 makashonix kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
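
To compare occurrences across boots, the stall lines can be pulled out of saved journal output. The following is only a minimal sketch, assuming the log text has the same format as the excerpts above; `parse_stalls` and its regular expression are illustrative names, not part of any radeon or NixOS tooling.

```python
import re

# Matches lines such as:
#   "... kernel: radeon 0000:01:00.0: ring 3 stalled for more than 10201msec"
# The pattern is an assumption based on the two excerpts in this report.
STALL_RE = re.compile(
    r"kernel: radeon (?P<pci>[0-9a-f:.]+): ring (?P<ring>\d+) "
    r"stalled for more than (?P<msec>\d+)msec"
)


def parse_stalls(log_text):
    """Return (pci_address, ring, stall_msec) tuples for each stall line."""
    stalls = []
    for line in log_text.splitlines():
        match = STALL_RE.search(line)
        if match:
            stalls.append(
                (match.group("pci"), int(match.group("ring")), int(match.group("msec")))
            )
    return stalls


# Two lines copied from the journal excerpts in this report.
sample = """\
Oct 25 20:56:21 makashonix kernel: radeon 0000:01:00.0: ring 3 stalled for more than 10201msec
Oct 25 18:29:19 makashonix kernel: radeon 0000:01:00.0: ring 3 stalled for more than 10383msec
"""

for pci, ring, msec in parse_stalls(sample):
    print(f"device {pci}: ring {ring} stalled for {msec} ms")
```

Both occurrences reported here involve ring 3 on the same device, so a summary like this mainly helps if more lockups accumulate over time.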

Notify maintainers

Metadata

Please run `nix-shell -p nix-info --run "nix-info -m"` and paste the result.

 - system: `"x86_64-linux"`
 - host os: `Linux 5.4.72, NixOS, 20.03.3178.a26e92a67d8 (Markhor)`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.3.6`
 - channels(thesamet): `"home-manager-20.03"`
 - channels(root): `"home-manager-20.03, nixos-20.03.3178.a26e92a67d8"`
 - nixpkgs: `/nix/var/nix/profiles/per-user/root/channels/nixos`

stale[bot] commented 3 years ago

I marked this as stale due to inactivity.