Open philipstewart opened 3 years ago
For what it's worth, after further crashes interrupted my work, I've downgraded to linux-firmware/groovy-updates,groovy-updates,now 1.190.3 and haven't experienced a crash in the days since.
Cheers, Phil
Hi, I'm experiencing very similar symptoms on Ubuntu 20.04.2 LTS. The crash occurs occasionally when using Zoom or playing a video game.
Graphics: Radeon RX 570 Series (POLARIS10, DRM 3.35.0, 5.4.0-81-generic, LLVM 11.0.0)
linux-firmware version: 1.187.15
mesa version: 20.2.6
Aug 25 10:15:58 kernel: [75365.388123] [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences timed out!
Aug 25 10:16:03 kernel: [75365.388191] [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences timed out!
Aug 25 10:16:03 kernel: [75370.522280] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=2194314, emitted seq=2194316
Aug 25 10:16:03 kernel: [75370.522342] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 49996 thread Xorg:cs0 pid 50004
Aug 25 10:16:03 kernel: [75370.522347] amdgpu 0000:01:00.0: GPU reset begin!
Aug 25 10:16:03 kernel: [75370.964387] amdgpu 0000:01:00.0: GPU pci config reset
Aug 25 10:16:03 kernel: [75371.071828] amdgpu 0000:01:00.0: GPU reset succeeded, trying to resume
Aug 25 10:16:03 kernel: [75371.073115] [drm] PCIE GART of 256M enabled (table at 0x000000F400000000).
Aug 25 10:16:03 kernel: [75371.073126] [drm] VRAM is lost due to GPU reset!
Aug 25 10:16:03 kernel: [75371.107984] [drm] SADs count is: -2, don't need to read it
Aug 25 10:16:03 kernel: [75371.120675] [drm] SADs count is: -2, don't need to read it
Aug 25 10:16:03 kernel: [75371.159816] [drm] UVD and UVD ENC initialized successfully.
Aug 25 10:16:03 kernel: [75371.259815] [drm] VCE initialized successfully.
Aug 25 10:16:03 kernel: [75371.265743] [drm] recover vram bo from shadow start
Aug 25 10:16:03 kernel: [75371.271220] [drm] recover vram bo from shadow done
Aug 25 10:16:03 kernel: [75371.271223] [drm] Skip scheduling IBs!
Aug 25 10:16:03 kernel: [75371.271224] [drm] Skip scheduling IBs!
Aug 25 10:16:03 kernel: [75371.271329] amdgpu 0000:01:00.0: GPU reset(4) succeeded!
Aug 25 10:16:03 kernel: [75371.271344] [drm] Skip scheduling IBs!
Aug 25 10:16:03 kernel: [75371.271351] [drm] Skip scheduling IBs!
Aug 25 10:16:03 kernel: [75371.271356] [drm] Skip scheduling IBs!
Aug 25 10:16:03 kernel: [75371.271359] [drm] Skip scheduling IBs!
Aug 25 10:16:03 kernel: [75371.271365] [drm] Skip scheduling IBs!
Aug 25 10:16:03 kernel: [75371.271367] [drm] Skip scheduling IBs!
Aug 25 10:16:03 kernel: [75371.271373] [drm] Skip scheduling IBs!
Aug 25 10:16:03 kernel: [75371.271376] [drm] Skip scheduling IBs!
Aug 25 10:16:03 kernel: [75371.271381] [drm] Skip scheduling IBs!
Aug 25 10:16:03 kernel: [75371.271384] [drm] Skip scheduling IBs!
Aug 25 10:16:03 kernel: [75371.271388] [drm] Skip scheduling IBs!
Aug 25 10:16:03 gnome-shell[50195]: amdgpu: amdgpu_cs_query_fence_status failed.
Aug 25 10:16:03 gnome-shell[50195]: amdgpu: amdgpu_cs_query_fence_status failed.
Aug 25 10:16:04 firefox.desktop[50724]: amdgpu: amdgpu_cs_query_fence_status failed.
Aug 25 10:16:04 firefox.desktop[50724]: [GFX1-]: GFX: RenderThread detected a device reset in PostUpdate
Aug 25 10:16:06 kernel: [75373.844606] show_signal_msg: 24 callbacks suppressed
Aug 25 10:16:06 kernel: [75373.844609] QSGRenderThread[88149]: segfault at 7fc25c78c000 ip 00007fc2631dd608 sp 00007fc1bbffe4c8 error 6 in radeonsi_dri.so[7fc2629a5000+e54000]
Aug 25 10:16:06 kernel: [75373.844619] Code: 0c 90 8b 17 8b 4e 10 44 8d 42 01 44 89 07 89 0c 90 8b 17 8b 4e 08 44 8d 42 01 44 89 07 89 0c 90 8b 17 8b 4e 14 8d 72 01 89 37 <89> 0c 90 c3 0f 1f 40 00 f3 0f 1e fa 41 57 41 56 41 55 41 54 55 53
Aug 25 10:16:07 systemd[1]: Starting Process error reports when automatic reporting is enabled...
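For anyone comparing logs across machines, here is a rough Python sketch that pulls the two most telling events out of a journal excerpt like the one above: the `ring ... timeout` line and the userspace segfault. The regular expressions are modelled only on the lines quoted in this thread, so other kernel versions may word these messages differently. The function names (`ring_timeouts`, `segfault_offsets`) are hypothetical helpers, not part of any tool. The computed offset is relative to the start of the mapping printed in the segfault line, which is not necessarily the offset into the file on disk.

```python
import re

# Patterns modelled on the log lines quoted in this thread (assumption:
# the message wording is the same on the kernel being inspected).
RING_TIMEOUT = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)
SEGFAULT = re.compile(
    r"(?P<thread>\S+)\[(?P<pid>\d+)\]: segfault at [0-9a-f]+ "
    r"ip (?P<ip>[0-9a-f]+) sp [0-9a-f]+ error \d+ "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

# Two lines copied from the excerpt above, used as sample input.
SAMPLE = [
    "Aug 25 10:16:03 kernel: [75370.522280] [drm:amdgpu_job_timedout [amdgpu]] "
    "*ERROR* ring gfx timeout, signaled seq=2194314, emitted seq=2194316",
    "Aug 25 10:16:06 kernel: [75373.844609] QSGRenderThread[88149]: segfault at "
    "7fc25c78c000 ip 00007fc2631dd608 sp 00007fc1bbffe4c8 error 6 in "
    "radeonsi_dri.so[7fc2629a5000+e54000]",
]


def ring_timeouts(lines):
    """Yield (ring, signaled_seq, emitted_seq) for each ring timeout line."""
    for line in lines:
        m = RING_TIMEOUT.search(line)
        if m:
            yield m["ring"], int(m["signaled"]), int(m["emitted"])


def segfault_offsets(lines):
    """Yield (thread, library, ip - mapping_base) for each segfault line."""
    for line in lines:
        m = SEGFAULT.search(line)
        if m:
            offset = int(m["ip"], 16) - int(m["base"], 16)
            yield m["thread"], m["lib"], offset


if __name__ == "__main__":
    for ring, signaled, emitted in ring_timeouts(SAMPLE):
        # emitted - signaled = fences submitted but never signaled
        print(f"ring {ring}: {emitted - signaled} fence(s) outstanding")
    for thread, lib, offset in segfault_offsets(SAMPLE):
        print(f"{thread} crashed in {lib} at mapping offset {offset:#x}")
```

To run it on a real log, feed the functions the lines of a saved journal file instead of `SAMPLE`. If several reports show the same library and offset, that suggests they are hitting the same crash site after the GPU reset.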
Hi,
Since the last linux-firmware update, I'm frequently experiencing graphical artefacts, followed by a display freeze and then a blank screen in the space of a few seconds. The system appears completely unresponsive to input - I can't drop to another TTY, reset the X server, or ssh to the machine. A hard restart with the power button appears to be my only option. I haven't found an explicit trigger to reproduce it yet.
Inspecting journalctl.log afterwards reveals a series of amdgpu errors and ends with a couple of segfaults in radeonsi_dri.so:
I'm using an up-to-date Pop!_OS 20.10 installation on a machine with integrated graphics:
Cheers, Phil