GpuZelenograd / memtest_vulkan

Vulkan compute tool for testing video memory stability
https://github.com/GpuZelenograd/memtest_vulkan/blob/main/Readme.md
zlib License

amdgpu: GPU reset begin! #23

Closed echaskaris closed 7 months ago

echaskaris commented 7 months ago

I tried the app, but the system crashed: the screen went black, the fans ran at full speed, and I had to power down with a long press of the power button. The Logs app in Ubuntu shows this in the hardware section (amdgpu:...), specifically: message "amdgpu 0000:29:00.0: amdgpu: GPU reset begin!", kernel device +pci:0000:29:00.0, priority 6.

Is my card faulty? Could it be a driver issue? Thank you

Ubuntu 22.04, AMD RX 480

The Logs app also shows:

  1. [drm:amdgpu_job_timedout [amdgpu]] ERROR ring gfx timeout, signaled seq=610763, emitted seq=610766
  2. [drm:amdgpu_job_timedout [amdgpu]] ERROR Process information: process memtest_vulkan pid 8380 thread memtest_vulkan pid 8380

galkinvv commented 7 months ago

Do other applications work fine with your GPU?

In my experience, GPUs with severe hardware errors can hang at test start with similar symptoms.

echaskaris commented 7 months ago

Sorry for the late reply, and thanks for answering.

I ran the test again in verbose mode, and I removed a kernel parameter that enables GPU overclocking (I had an undervolt, but I disabled it for the first and second tests).

The test finished with no errors, but the fans would not calm down afterwards.

I can play games fine, especially Team Fortress 2, but I do have some issues, such as frame drops or crashes in Superposition and occasional crashes elsewhere, including in TF2.

Anyway, thank you! I attached logs for curiosity's sake; the old log is from the first test, the one that crashed.

memtest_vulkan.log memtest_vulkan_old.log

galkinvv commented 7 months ago

So, if memtest_vulkan sometimes leads to a crash and sometimes does not, the issue looks hardware-related. The test is quite intensive, which is why it can expose such problems. I'm converting this from a memtest_vulkan issue to a hardware results discussion.
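
For anyone comparing a crashing run with a clean one, the three kernel messages quoted earlier in this thread can be picked out of a saved log with a short script. The following is only a rough sketch: the regular expressions are modeled on the exact lines posted above, the function name summarize_amdgpu_events is hypothetical, and other kernel or driver versions may word these messages differently. The "gap" value is just emitted seq minus signaled seq; reading it as a count of unfinished submissions is an assumption, not something the driver states.

```python
import re

# Patterns modeled on the messages quoted in this thread (an assumption:
# other kernel/driver versions may format them differently).
RESET_RE = re.compile(r"amdgpu (?P<pci>[0-9a-fA-F:.]+): amdgpu: GPU reset begin!")
TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)
PROCESS_RE = re.compile(
    r"Process information: process (?P<name>\S+) pid (?P<pid>\d+)"
)


def summarize_amdgpu_events(lines):
    """Return a list of dicts describing amdgpu reset/timeout lines found."""
    events = []
    for line in lines:
        match = RESET_RE.search(line)
        if match:
            events.append({"kind": "reset", "pci": match.group("pci")})
            continue
        match = TIMEOUT_RE.search(line)
        if match:
            signaled = int(match.group("signaled"))
            emitted = int(match.group("emitted"))
            events.append({
                "kind": "timeout",
                "ring": match.group("ring"),
                "signaled": signaled,
                "emitted": emitted,
                # Difference between the two sequence numbers (see lead-in).
                "gap": emitted - signaled,
            })
            continue
        match = PROCESS_RE.search(line)
        if match:
            events.append({
                "kind": "process",
                "name": match.group("name"),
                "pid": int(match.group("pid")),
            })
    return events


if __name__ == "__main__":
    # Sample lines copied from the report above.
    sample = [
        "amdgpu 0000:29:00.0: amdgpu: GPU reset begin!",
        "[drm:amdgpu_job_timedout [amdgpu]] ERROR ring gfx timeout, "
        "signaled seq=610763, emitted seq=610766",
        "[drm:amdgpu_job_timedout [amdgpu]] ERROR Process information: "
        "process memtest_vulkan pid 8380 thread memtest_vulkan pid 8380",
    ]
    for event in summarize_amdgpu_events(sample):
        print(event)
```

To use it on a real system, the kernel messages could first be saved to a text file (for example from journalctl -k) and the file's lines passed to the function. A match only shows that a timeout or reset was logged; it does not by itself tell a hardware fault from a driver problem.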