pop-os / linux-firmware

Pop!_OS fork of https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux-firmware

Unrecoverable crash with amdgpu on Pop!_OS 20.10 / AMD Ryzen 3200G integrated graphics #12

Open philipstewart opened 3 years ago

philipstewart commented 3 years ago

Hi,

Since the last linux-firmware update, I'm frequently experiencing graphical artefacts, followed by a display freeze and then a blank screen in the space of a few seconds. The system appears completely unresponsive to input - I can't drop to another TTY, reset the X server, or ssh to the machine. A hard restart with the power button appears to be my only option. I haven't found an explicit trigger to reproduce it yet.

Inspecting journalctl.log afterwards reveals a series of amdgpu errors, ending with a couple of segfaults in radeonsi_dri.so:

Apr 15 10:17:03 kernel: QSGRenderThread[8251]: segfault at 7f88f8701000 ip 00007f88fb26036e sp 00007f886e7fadb8 error 6 in radeonsi_dri.so[7f88fa9ff000+dcc000]
Apr 15 10:17:03 kernel: Code: 44 89 1a 44 89 04 87 8b 02 44 8d 40 01 44 89 02 89 0c 87 8b 02 48 c1 e9 20 44 8d 40 01 44 89 02 89 0c 87 8b 02 8d 48 01 89 0a <44> 89 0c 87 40 84 f6 74 22 41 83 e2 10 74 1c 8b 02 8d 48 01 89 0a
Apr 15 10:17:06 kernel: gnome-shell[2489]: segfault at 7f213045e000 ip 00007f212326ce06 sp 00007ffcc48d0ba8 error 6 in radeonsi_dri.so[7f21229ff000+dcc000]
Apr 15 10:17:06 kernel: Code: 09 83 f8 02 19 c0 83 e0 06 83 c0 06 c3 0f 1f 80 00 00 00 00 f3 0f 1e fa 41 89 ca 8b 0e 48 8b 46 08 41 83 c9 10 8d 79 01 89 3e <c7> 04 88 00 3c 05 c0 8b 0e 8d 79 01 89 3e 44 89 0c 88 8b 0e 8d 79
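For anyone comparing crashes, the segfault lines above can be decoded a little further. The following is a minimal Python sketch, not part of the original report: the function name `parse_segfault` and the returned field names are made up for illustration. It assumes the usual x86 kernel message layout (`segfault at ADDR ip IP sp SP error N in LIB[BASE+SIZE]`) and the standard x86 page-fault error bits (bit 0: page was present, bit 1: write access, bit 2: user mode). The computed offset is relative to the mapping the kernel reports, which may not equal the file offset inside the library.

```python
import re

# Matches the kernel's x86 segfault message, e.g.
# "gnome-shell[2489]: segfault at ... ip ... sp ... error 6 in lib.so[base+size]"
SEGFAULT_RE = re.compile(
    r"(?P<comm>[^\s\[]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<lib>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Return a dict describing a kernel segfault line, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"])
    return {
        "process": m["comm"],
        "pid": int(m["pid"]),
        "library": m["lib"],
        # Offset of the faulting instruction within the reported mapping.
        "offset": ip - base,
        # x86 page-fault error code bits.
        "page_present": bool(err & 1),
        "write": bool(err & 2),
        "user_mode": bool(err & 4),
    }


if __name__ == "__main__":
    sample = (
        "Apr 15 10:17:03 kernel: QSGRenderThread[8251]: segfault at 7f88f8701000 "
        "ip 00007f88fb26036e sp 00007f886e7fadb8 error 6 in "
        "radeonsi_dri.so[7f88fa9ff000+dcc000]"
    )
    info = parse_segfault(sample)
    print(info["process"], info["library"], hex(info["offset"]))
```

Under those assumptions, `error 6` in both lines reads as a user-mode write to a page that was not mapped, which would fit a process writing to GPU-backed memory that disappeared, though the log alone does not prove that.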

I'm using an up-to-date Pop!_OS 20.10 installation on a machine with AMD Ryzen 3200G integrated graphics.

Cheers, Phil

philipstewart commented 3 years ago

For what it's worth, after further crashes interrupted my work, I've downgraded to linux-firmware/groovy-updates,groovy-updates,now 1.190.3 and haven't experienced a crash in the days since.

Cheers, Phil

mirzov commented 3 years ago

Hi, I'm experiencing very similar symptoms on Ubuntu 20.04.2 LTS. The crash occurs occasionally when using Zoom or playing a video game.

Graphics: Radeon RX 570 Series (POLARIS10, DRM 3.35.0, 5.4.0-81-generic, LLVM 11.0.0)
linux-firmware version: 1.187.15
mesa version: 20.2.6

Aug 25 10:15:58 kernel: [75365.388123] [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences timed out!
Aug 25 10:16:03 kernel: [75365.388191] [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences timed out!
Aug 25 10:16:03 kernel: [75370.522280] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=2194314, emitted seq=2194316
Aug 25 10:16:03 kernel: [75370.522342] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 49996 thread Xorg:cs0 pid 50004
Aug 25 10:16:03 kernel: [75370.522347] amdgpu 0000:01:00.0: GPU reset begin!
Aug 25 10:16:03 kernel: [75370.964387] amdgpu 0000:01:00.0: GPU pci config reset
Aug 25 10:16:03 kernel: [75371.071828] amdgpu 0000:01:00.0: GPU reset succeeded, trying to resume
Aug 25 10:16:03 kernel: [75371.073115] [drm] PCIE GART of 256M enabled (table at 0x000000F400000000).
Aug 25 10:16:03 kernel: [75371.073126] [drm] VRAM is lost due to GPU reset!
Aug 25 10:16:03 kernel: [75371.107984] [drm] SADs count is: -2, don't need to read it
Aug 25 10:16:03 kernel: [75371.120675] [drm] SADs count is: -2, don't need to read it
Aug 25 10:16:03 kernel: [75371.159816] [drm] UVD and UVD ENC initialized successfully.
Aug 25 10:16:03 kernel: [75371.259815] [drm] VCE initialized successfully.
Aug 25 10:16:03 kernel: [75371.265743] [drm] recover vram bo from shadow start
Aug 25 10:16:03 kernel: [75371.271220] [drm] recover vram bo from shadow done
Aug 25 10:16:03 kernel: [75371.271223] [drm] Skip scheduling IBs!
Aug 25 10:16:03 kernel: [75371.271224] [drm] Skip scheduling IBs!
Aug 25 10:16:03 kernel: [75371.271329] amdgpu 0000:01:00.0: GPU reset(4) succeeded!
Aug 25 10:16:03 kernel: [75371.271344] [drm] Skip scheduling IBs!
Aug 25 10:16:03 kernel: [75371.271351] [drm] Skip scheduling IBs!
Aug 25 10:16:03 kernel: [75371.271356] [drm] Skip scheduling IBs!
Aug 25 10:16:03 kernel: [75371.271359] [drm] Skip scheduling IBs!
Aug 25 10:16:03 kernel: [75371.271365] [drm] Skip scheduling IBs!
Aug 25 10:16:03 kernel: [75371.271367] [drm] Skip scheduling IBs!
Aug 25 10:16:03 kernel: [75371.271373] [drm] Skip scheduling IBs!
Aug 25 10:16:03 kernel: [75371.271376] [drm] Skip scheduling IBs!
Aug 25 10:16:03 kernel: [75371.271381] [drm] Skip scheduling IBs!
Aug 25 10:16:03 kernel: [75371.271384] [drm] Skip scheduling IBs!
Aug 25 10:16:03 kernel: [75371.271388] [drm] Skip scheduling IBs!
Aug 25 10:16:03 gnome-shell[50195]: amdgpu: amdgpu_cs_query_fence_status failed.
Aug 25 10:16:03 gnome-shell[50195]: amdgpu: amdgpu_cs_query_fence_status failed.
Aug 25 10:16:04 firefox.desktop[50724]: amdgpu: amdgpu_cs_query_fence_status failed.
Aug 25 10:16:04 firefox.desktop[50724]: [GFX1-]: GFX: RenderThread detected a device reset in PostUpdate
Aug 25 10:16:06 kernel: [75373.844606] show_signal_msg: 24 callbacks suppressed
Aug 25 10:16:06 kernel: [75373.844609] QSGRenderThread[88149]: segfault at 7fc25c78c000 ip 00007fc2631dd608 sp 00007fc1bbffe4c8 error 6 in radeonsi_dri.so[7fc2629a5000+e54000]
Aug 25 10:16:06 kernel: [75373.844619] Code: 0c 90 8b 17 8b 4e 10 44 8d 42 01 44 89 07 89 0c 90 8b 17 8b 4e 08 44 8d 42 01 44 89 07 89 0c 90 8b 17 8b 4e 14 8d 72 01 89 37 <89> 0c 90 c3 0f 1f 40 00 f3 0f 1e fa 41 57 41 56 41 55 41 54 55 53
Aug 25 10:16:07 systemd[1]: Starting Process error reports when automatic reporting is enabled...
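To make logs like the one above easier to compare between reports, here is a rough Python sketch that pulls out the key amdgpu events. It is only an illustration: `summarize_gpu_events` and its output keys are hypothetical names, and the patterns are taken from the exact message wording in this log, which may differ on other kernel versions. The "pending" figure is simply emitted seq minus signaled seq, on the assumption that these are the last emitted and last signaled fence sequence numbers on the ring.

```python
import re

# Patterns copied from the message wording in the log above.
RING_TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)
RESET_DONE_RE = re.compile(r"GPU reset\(\d+\) succeeded!")


def summarize_gpu_events(lines):
    """Collect ring timeouts, completed resets and VRAM loss from log lines."""
    summary = {"ring_timeouts": [], "resets_succeeded": 0, "vram_lost": False}
    for line in lines:
        m = RING_TIMEOUT_RE.search(line)
        if m:
            signaled = int(m["signaled"])
            emitted = int(m["emitted"])
            summary["ring_timeouts"].append(
                {"ring": m["ring"], "pending": emitted - signaled}
            )
        if RESET_DONE_RE.search(line):
            summary["resets_succeeded"] += 1
        if "VRAM is lost due to GPU reset" in line:
            summary["vram_lost"] = True
    return summary


if __name__ == "__main__":
    sample = [
        "Aug 25 10:16:03 kernel: [75370.522280] [drm:amdgpu_job_timedout [amdgpu]] "
        "*ERROR* ring gfx timeout, signaled seq=2194314, emitted seq=2194316",
        "Aug 25 10:16:03 kernel: [75371.073126] [drm] VRAM is lost due to GPU reset!",
        "Aug 25 10:16:03 kernel: [75371.271329] amdgpu 0000:01:00.0: "
        "GPU reset(4) succeeded!",
    ]
    print(summarize_gpu_events(sample))
```

The lines could come from a saved journal export read with `open(path).read().splitlines()`. Note that in this log the GPU reset itself succeeds, and the segfaults in radeonsi_dri.so only appear afterwards, once VRAM has been lost.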