TeamAOF / All-of-Fabric-6

Modpack containing the latest & best of Fabric on 1.19

AMD GPU crashes with segfault at random #399

Open SaphireLattice opened 9 months ago

SaphireLattice commented 9 months ago

Nothing relevant in the game logs, probably because the process gets violently killed by the system before anything can be written. I end up with the entire X session resetting and getting thrown back to the login screen. So the GPU is fine afterwards, just, yeah.

Also, Iris is uninstalled. The shaders fail to compile on another modpack and that exception is not caught for some reason, so, yeah, I preemptively threw it out.

dmesg output:

[339846.028749] amdgpu 0000:0b:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32811, for process java pid 658486 thread java:cs0 pid 658716)
[339846.028757] amdgpu 0000:0b:00.0: amdgpu:   in page starting at address 0x0000000000000000 from client 0x1b (UTCL2)
[339846.028760] amdgpu 0000:0b:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00101430
[339846.028762] amdgpu 0000:0b:00.0: amdgpu:     Faulty UTCL2 client ID: SQC (data) (0xa)
[339846.028764] amdgpu 0000:0b:00.0: amdgpu:     MORE_FAULTS: 0x0
[339846.028766] amdgpu 0000:0b:00.0: amdgpu:     WALKER_ERROR: 0x0
[339846.028768] amdgpu 0000:0b:00.0: amdgpu:     PERMISSION_FAULTS: 0x3
[339846.028769] amdgpu 0000:0b:00.0: amdgpu:     MAPPING_ERROR: 0x0
[339846.028771] amdgpu 0000:0b:00.0: amdgpu:     RW: 0x0
[339856.060146] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=24673070, emitted seq=24673072
[339856.060436] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process java pid 658486 thread java:cs0 pid 658716

Info:

From glxinfo -B:

Extended renderer info (GLX_MESA_query_renderer):
    Vendor: AMD (0x1002)
    Device: AMD Radeon RX 6800 XT (navi21, LLVM 16.0.6, DRM 3.54, 6.5.8-arch1-1) (0x73bf)
    Version: 23.2.1
    Accelerated: yes
    Video memory: 16384MB
    Unified memory: no
    Preferred profile: core (0x1)
    Max core profile version: 4.6
    Max compat profile version: 4.6
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 3.2

guskikalola commented 8 months ago

I'm also having the same issue on Linux using a 6700 XT.

For now, what I've done is set Minecraft to use integrated graphics. So far so good (been playing for 10h).

From glxinfo -B:

Extended renderer info (GLX_MESA_query_renderer):
    Vendor: AMD (0x1002)
    Device: AMD Radeon RX 6700 XT (navi22, LLVM 16.0.6, DRM 3.54, 6.6.7-zen1-1-zen) (0x73df)
    Version: 23.2.1
    Accelerated: yes
    Video memory: 12288MB
    Unified memory: no
    Preferred profile: core (0x1)
    Max core profile version: 4.6
    Max compat profile version: 4.6
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 3.2

MickTheRus commented 6 months ago

I have regrettably run into the same issue. From glxinfo -B:

Extended renderer info (GLX_MESA_query_renderer):
    Vendor: AMD (0x1002)
    Device: AMD Radeon RX 6700 XT (radeonsi, navi22, LLVM 16.0.6, DRM 3.56, 6.7.2-zen1-2-zen) (0x73df)
    Version: 23.3.5
    Accelerated: yes
    Video memory: 12288MB
    Unified memory: no
    Preferred profile: core (0x1)
    Max core profile version: 4.6
    Max compat profile version: 4.6
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 3.2

Have either of you used shaders by any chance? I do wonder if that could be causing the issue. I guess I'll have to try it myself.

MickTheRus commented 6 months ago

So I played for 2 hours exploring and shit and it seems fine. I also gave it 32 gigs but that's over the top, now let's see if I crash with shaders on default.

SaphireLattice commented 6 months ago

> So I played for 2 hours exploring and shit and it seems fine. I also gave it 32 gigs but that's over the top, now let's see if I crash with shaders on default

I sincerely doubt that it has anything to do with system RAM. Please make sure that there is at least something similar in dmesg. This is not just a game crash but a full graphical system mess.
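
For anyone who wants to check their own dmesg for the same signature, here is a minimal Python sketch. It assumes the kernel messages are worded exactly like the excerpt at the top of this issue (other kernel versions may word them differently), and the names `scan_dmesg`, `PAGE_FAULT` and `RING_TIMEOUT` are made up for this example.

```python
import re

# Patterns copied from the dmesg excerpt in this issue. This is an
# assumption: other kernel versions may format these lines differently.
PAGE_FAULT = re.compile(
    r"amdgpu: \[(?P<hub>\w+)\] page fault \(.*?"
    r"for process (?P<process>\S+) pid (?P<pid>\d+)"
)
RING_TIMEOUT = re.compile(
    r"amdgpu_job_timedout.*ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


def scan_dmesg(text):
    """Return (page_faults, ring_timeouts) found in a dmesg dump.

    Each entry is a dict with the fields pulled out of the matching line.
    """
    faults, timeouts = [], []
    for line in text.splitlines():
        m = PAGE_FAULT.search(line)
        if m:
            faults.append(
                {"hub": m["hub"], "process": m["process"], "pid": int(m["pid"])}
            )
            continue
        m = RING_TIMEOUT.search(line)
        if m:
            timeouts.append(
                {
                    "ring": m["ring"],
                    "signaled": int(m["signaled"]),
                    "emitted": int(m["emitted"]),
                }
            )
    return faults, timeouts
```

Save the dmesg output to a file after a crash and pass its contents to `scan_dmesg`. If both lists come back empty, it is probably a different problem from the one reported here.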

MickTheRus commented 6 months ago

dmesg.log

Unless I'm missing something, this seems to be a similar issue to yours.

MickTheRus commented 6 months ago

After disabling Resizable BAR I have not had any crashes. If I'm reading the crash report correctly, it might be an issue with GLFW.

crash_report.txt

SubordinalBlue commented 5 months ago

GLFW and LWJGL are the two main libs MC uses that have caused Linux-only problems in the past. I'd suggest updating to the newest possible version of them.