godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License

Running the game crashes my GPU (X11 session is killed) #72274

Open unfa opened 1 year ago

unfa commented 1 year ago

Godot version

4.0 dev beta 16

System information

Arch Linux + X11 + KDE Plasma + Radeon RX 6800 XT

Issue description

Recently, quite often when I run my game from the editor, it works for a while and then my entire desktop environment suddenly crashes. dmesg shows this:

[sob sty 28 21:37:16 2023] amdgpu 0000:08:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32806, for process Godot_v4.0-beta pid 43064 thread Godot_v4.0-beta pid 43064)
[sob sty 28 21:37:16 2023] amdgpu 0000:08:00.0: amdgpu:   in page starting at address 0x00008000bebf4000 from client 0x1b (UTCL2)
[sob sty 28 21:37:16 2023] amdgpu 0000:08:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00601430
[sob sty 28 21:37:16 2023] amdgpu 0000:08:00.0: amdgpu:          Faulty UTCL2 client ID: SQC (data) (0xa)
[sob sty 28 21:37:16 2023] amdgpu 0000:08:00.0: amdgpu:          MORE_FAULTS: 0x0
[sob sty 28 21:37:16 2023] amdgpu 0000:08:00.0: amdgpu:          WALKER_ERROR: 0x0
[sob sty 28 21:37:16 2023] amdgpu 0000:08:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
[sob sty 28 21:37:16 2023] amdgpu 0000:08:00.0: amdgpu:          MAPPING_ERROR: 0x0
[sob sty 28 21:37:16 2023] amdgpu 0000:08:00.0: amdgpu:          RW: 0x0
[sob sty 28 21:37:26 2023] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=2414917, emitted seq=2414919
[sob sty 28 21:37:26 2023] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Godot_v4.0-beta pid 43064 thread Godot_v4.0-beta pid 43064
[sob sty 28 21:37:26 2023] amdgpu 0000:08:00.0: amdgpu: GPU reset begin!
[sob sty 28 21:37:27 2023] amdgpu 0000:08:00.0: amdgpu: free PSP TMR buffer
[sob sty 28 21:37:27 2023] amdgpu 0000:08:00.0: amdgpu: MODE1 reset
[sob sty 28 21:37:27 2023] amdgpu 0000:08:00.0: amdgpu: GPU mode1 reset
[sob sty 28 21:37:27 2023] amdgpu 0000:08:00.0: amdgpu: GPU smu mode1 reset
[sob sty 28 21:37:27 2023] amdgpu 0000:08:00.0: amdgpu: GPU reset succeeded, trying to resume
[sob sty 28 21:37:27 2023] [drm] PCIE GART of 512M enabled (table at 0x0000008000800000).
[sob sty 28 21:37:27 2023] [drm] VRAM is lost due to GPU reset!
[sob sty 28 21:37:27 2023] [drm] PSP is resuming...
[sob sty 28 21:37:27 2023] [drm] reserve 0xa00000 from 0x83fd000000 for PSP TMR
[sob sty 28 21:37:27 2023] amdgpu 0000:08:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[sob sty 28 21:37:27 2023] amdgpu 0000:08:00.0: amdgpu: SMU is resuming...
[sob sty 28 21:37:27 2023] amdgpu 0000:08:00.0: amdgpu: smu driver if version = 0x00000040, smu fw if version = 0x00000041, smu fw program = 0, version = 0x003a5600 (58.86.0)
[sob sty 28 21:37:27 2023] amdgpu 0000:08:00.0: amdgpu: SMU driver if version not matched
[sob sty 28 21:37:27 2023] amdgpu 0000:08:00.0: amdgpu: use vbios provided pptable
[sob sty 28 21:37:27 2023] amdgpu 0000:08:00.0: amdgpu: SMU is resumed successfully!
[sob sty 28 21:37:27 2023] [drm] DMUB hardware initialized: version=0x02020017
[sob sty 28 21:37:28 2023] [drm] kiq ring mec 2 pipe 1 q 0
[sob sty 28 21:37:28 2023] [drm] VCN decode and encode initialized successfully(under DPG Mode).
[sob sty 28 21:37:28 2023] [drm] JPEG decode initialized successfully.
[sob sty 28 21:37:28 2023] amdgpu 0000:08:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[sob sty 28 21:37:28 2023] amdgpu 0000:08:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[sob sty 28 21:37:28 2023] amdgpu 0000:08:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[sob sty 28 21:37:28 2023] amdgpu 0000:08:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[sob sty 28 21:37:28 2023] amdgpu 0000:08:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[sob sty 28 21:37:28 2023] amdgpu 0000:08:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[sob sty 28 21:37:28 2023] amdgpu 0000:08:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[sob sty 28 21:37:28 2023] amdgpu 0000:08:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[sob sty 28 21:37:28 2023] amdgpu 0000:08:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[sob sty 28 21:37:28 2023] amdgpu 0000:08:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
[sob sty 28 21:37:28 2023] amdgpu 0000:08:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[sob sty 28 21:37:28 2023] amdgpu 0000:08:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
[sob sty 28 21:37:28 2023] amdgpu 0000:08:00.0: amdgpu: ring sdma2 uses VM inv eng 14 on hub 0
[sob sty 28 21:37:28 2023] amdgpu 0000:08:00.0: amdgpu: ring sdma3 uses VM inv eng 15 on hub 0
[sob sty 28 21:37:28 2023] amdgpu 0000:08:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
[sob sty 28 21:37:28 2023] amdgpu 0000:08:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
[sob sty 28 21:37:28 2023] amdgpu 0000:08:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
[sob sty 28 21:37:28 2023] amdgpu 0000:08:00.0: amdgpu: ring vcn_dec_1 uses VM inv eng 5 on hub 1
[sob sty 28 21:37:28 2023] amdgpu 0000:08:00.0: amdgpu: ring vcn_enc_1.0 uses VM inv eng 6 on hub 1
[sob sty 28 21:37:28 2023] amdgpu 0000:08:00.0: amdgpu: ring vcn_enc_1.1 uses VM inv eng 7 on hub 1
[sob sty 28 21:37:28 2023] amdgpu 0000:08:00.0: amdgpu: ring jpeg_dec uses VM inv eng 8 on hub 1
[sob sty 28 21:37:28 2023] amdgpu 0000:08:00.0: amdgpu: recover vram bo from shadow start
[sob sty 28 21:37:28 2023] amdgpu 0000:08:00.0: amdgpu: recover vram bo from shadow done
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] amdgpu 0000:08:00.0: amdgpu: GPU reset(2) succeeded!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm] Skip scheduling IBs!
[sob sty 28 21:37:28 2023] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!

I don't know what could be causing it, but it doesn't happen on Windows.
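For anyone triaging a log like the one above, here is a minimal Python sketch that pulls out the key events (page fault, ring timeout, GPU reset, VRAM loss). The patterns are copied from messages in this report; the event labels and the function name are my own shorthand, not amdgpu terminology, and other kernel versions may word these messages differently.

```python
import re

# Patterns based on messages that appear in the dmesg output in this report.
# The labels on the left are illustrative names, not official amdgpu terms.
EVENT_PATTERNS = {
    "page_fault": re.compile(
        r"\[gfxhub\] page fault .*for process (?P<process>\S+) pid (?P<pid>\d+)"
    ),
    "ring_timeout": re.compile(r"ring (?P<ring>\S+) timeout"),
    "gpu_reset": re.compile(r"GPU reset begin!"),
    "vram_lost": re.compile(r"VRAM is lost due to GPU reset!"),
}


def summarize_gpu_events(log_text):
    """Return (label, details) tuples in the order they occur in the log."""
    events = []
    for line in log_text.splitlines():
        for label, pattern in EVENT_PATTERNS.items():
            match = pattern.search(line)
            if match:
                events.append((label, match.groupdict()))
                break  # at most one label per log line
    return events


# Four lines copied from the report above.
SAMPLE = """\
[sob sty 28 21:37:16 2023] amdgpu 0000:08:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32806, for process Godot_v4.0-beta pid 43064 thread Godot_v4.0-beta pid 43064)
[sob sty 28 21:37:26 2023] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=2414917, emitted seq=2414919
[sob sty 28 21:37:27 2023] amdgpu 0000:08:00.0: amdgpu: GPU reset begin!
[sob sty 28 21:37:27 2023] [drm] VRAM is lost due to GPU reset!
"""

for label, details in summarize_gpu_events(SAMPLE):
    print(label, details)
```

Read this way, the log shows a page fault attributed to the Godot process, then a timeout on the gfx ring about ten seconds later, then a full GPU reset in which VRAM is lost. That sequence fits the whole X11 session going down rather than just the game.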

Steps to reproduce

The only way I can reproduce this is by running my game, Liblast.

In the project's README there's a guide to cloning the repository to get things working: https://codeberg.org/Liblast/Liblast#_how_to_edit_the_game

If you're already familiar with Git, the short version is: you must clone the repo rather than download the .ZIP, and you must have Git LFS installed, or it won't work.

You can check out this commit, where I was able to reproduce the issue: aa02101bed
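The clone steps can be sketched in Python as below. This is an illustration under assumptions: the clone URL is taken to be the project page URL linked above, `workdir` is a hypothetical local directory name, and the project's README remains the authoritative guide. The script only builds the git commands; nothing runs until `run_all` is called.

```python
import subprocess

# Assumption: the project page URL from the README link also works as the clone URL.
REPO_URL = "https://codeberg.org/Liblast/Liblast"
COMMIT = "aa02101bed"  # commit named in the report


def repro_commands(repo_url=REPO_URL, commit=COMMIT, workdir="Liblast"):
    """Build the git commands as argument lists, without running them."""
    return [
        ["git", "lfs", "install"],                   # set up Git LFS for this user
        ["git", "clone", repo_url, workdir],         # a real clone, not a .ZIP download
        ["git", "-C", workdir, "checkout", commit],  # the commit where the crash reproduced
        ["git", "-C", workdir, "lfs", "pull"],       # fetch LFS objects for that commit
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


# Print the commands for review; call run_all(repro_commands()) to execute them.
for argv in repro_commands():
    print(" ".join(argv))
```

The `git lfs` subcommands need the separate Git LFS package installed, so `run_all` will stop at the first command if it is missing.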

After you've cloned the repo:

  1. Open the Liblast/Game/project.godot project in the Godot editor
  2. Hit F5 to run the game
  3. Let it run for a moment - you should see characters fighting in the menu background
  4. Did it crash your X11 session?

Minimal reproduction project

I am afraid I don't have one :(

Calinou commented 1 year ago

Are you using the Mesa RADV driver or AMDVLK?

unfa commented 1 year ago

@Calinou I am not sure. Does this help?

❯ inxi -G
Graphics:
  Device-1: AMD Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] driver: amdgpu
    v: kernel
  Device-2: MacroSilicon USB Video type: USB
    driver: hid-generic,snd-usb-audio,usbhid,uvcvideo
  Display: x11 server: X.Org v: 21.1.6 with: Xwayland v: 22.1.7 driver: X:
    loaded: amdgpu dri: radeonsi gpu: amdgpu resolution: 1: 2560x1440~144Hz
    2: 1920x1080~60Hz
  API: OpenGL v: 4.6 Mesa 22.3.3 renderer: AMD Radeon RX 6800 XT (navi21
    LLVM 15.0.7 DRM 3.49 6.1.8-arch1-1)

Calinou commented 1 year ago

@unfa This looks like RADV, which is the default. I don't know of any Linux distribution that defaults to AMDVLK.

RADV is a community-developed driver, while AMDVLK is maintained by AMD and reuses parts of the Windows driver that were open sourced (only in this particular driver). Both are open source, although AMDVLK also exists in proprietary form as part of the AMDGPU-Pro package.
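A more direct check than inxi is to ask the Vulkan loader itself. The sketch below runs `vulkaninfo --summary` (from the vulkan-tools package) and extracts the `driverName` lines. It assumes the summary prints lines of the form `driverName = ...`; the function names are hypothetical. RADV is expected to report a name like `radv`, but treat the exact strings as something to verify on your own system.

```python
import re
import shutil
import subprocess

# Assumed line format in the summary output: "driverName = <name>".
DRIVER_LINE = re.compile(r"^\s*driverName\s*=\s*(?P<name>.+?)\s*$", re.MULTILINE)


def driver_names(summary_text):
    """Extract every driverName value from vulkaninfo summary text."""
    return [m.group("name") for m in DRIVER_LINE.finditer(summary_text)]


def query_vulkan_drivers():
    """Run vulkaninfo if it is installed; return [] when it is not available."""
    if shutil.which("vulkaninfo") is None:
        return []
    result = subprocess.run(
        ["vulkaninfo", "--summary"],
        capture_output=True,
        text=True,
        check=False,  # vulkaninfo may exit non-zero on partially working setups
    )
    return driver_names(result.stdout)


print(query_vulkan_drivers())
```

If both drivers are installed, more than one name may be listed for the same GPU, which would itself be worth mentioning in the report.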

unfa commented 1 year ago

I think this crash is related to GPU particles. I've filed an issue in the Liblast project as well: https://codeberg.org/Liblast/Liblast/issues/414