HansKristian-Work / vkd3d-proton

Fork of VKD3D. Development branches for Proton's Direct3D 12 implementation.
GNU Lesser General Public License v2.1

Trying to debug ray tracing in Dragon's Dogma 2 #2007

Open andrew-ld opened 1 month ago

andrew-ld commented 1 month ago

Hello, I'm trying to debug a GPU page fault that happens when ray tracing is enabled in Dragon's Dogma 2. However, with the QA checks enabled a shader fails to compile, and I don't know how to continue debugging.

Software information

game: Dragon's Dogma 2

I am using the amdgpu pro Vulkan driver because the radv driver (latest mesa commit) crashes a lot even without ray tracing. amdvlk is stable but gives lower fps than amdgpu pro.

System information

VKD3D_DEBUG=trace VKD3D_LOG_FILE="/home/andrew/dd2-debug/log.txt" VKD3D_SHADER_DUMP_PATH="/home/andrew/dd2-debug/debug/" VKD3D_CONFIG="dxr,descriptor_qa_checks" MANGOHUD_CONFIG="no_display,fps_limit=48,fps_limit_method=early,fps_sampling_period=333,vsync" vk_pro gamemoderun mangohud %command%

Log files

andrew-ld commented 1 month ago

I tried using radv instead of amdvlk/pro and the shader seems to compile. I played for a while and vkd3d logged some faults.

VKD3D_DEBUG=trace VKD3D_LOG_FILE="/home/andrew/dd2-debug-radv/log.txt" VKD3D_SHADER_DUMP_PATH="/home/andrew/dd2-debug-radv/shaders/" VKD3D_CONFIG="dxr,descriptor_qa_checks" vk_radv gamemoderun %command%

logs: https://github.com/user-attachments/files/15537048/log.txt
shaders with faults: https://github.com/user-attachments/files/15537081/shaders-with-fault.zip

I can't find c73feef558c73116 and daf3394534fa70c in the shaders directory.
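A quick way to check which hashes from the log have a dump on disk is to scan the dump directory for file names containing each hash. The sketch below assumes that dump file names contain the shader hash as a hex string. That naming is an assumption, and `find_dumps` is a hypothetical helper, not part of vkd3d-proton.

```python
from pathlib import Path


def find_dumps(dump_dir: str, hashes: list[str]) -> dict[str, list[str]]:
    """Map each hash to the sorted file names in dump_dir that contain it."""
    # Only regular files directly inside dump_dir are considered.
    names = [p.name for p in Path(dump_dir).iterdir() if p.is_file()]
    # Case-insensitive substring match, since hex casing may differ.
    return {h: sorted(n for n in names if h.lower() in n.lower()) for h in hashes}


# Example call (paths and hashes taken from this thread):
# find_dumps("/home/andrew/dd2-debug-radv/shaders/",
#            ["c73feef558c73116", "daf3394534fa70c"])
```

An empty list for a hash means no file in that directory carries it in its name, which only confirms the dump is missing under the naming assumption above.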

andrew-ld commented 1 month ago

Anyway, I forgot to mention that to enable ray tracing you have to either edit the Dragon's Dogma 2 binary or hide the wine_get_version symbol in Wine.

You can simply bypass the check by running:

sed -i "s/wine_get_version/win3_get_version/g" DD2.exe
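The substitution is safe on a binary only because both names are 16 bytes long, so nothing in the executable shifts. A minimal Python sketch of the same patch, with an explicit length check and a backup copy, could look like this. The names `patch_symbol` and `patch_file` and the `.bak` suffix are illustrative assumptions, not part of any tool mentioned in this thread.

```python
import shutil
from pathlib import Path


def patch_symbol(data: bytes, old: bytes, new: bytes) -> tuple[bytes, int]:
    """Replace every occurrence of old with new; return (patched, count)."""
    # A different length would shift every later offset in the binary.
    if len(old) != len(new):
        raise ValueError("replacement must have the same length as the original")
    return data.replace(old, new), data.count(old)


def patch_file(path: str, old: bytes, new: bytes) -> int:
    """Patch a file in place, keeping a copy next to it with a .bak suffix."""
    target = Path(path)
    patched, count = patch_symbol(target.read_bytes(), old, new)
    if count:
        # Back up the untouched file before overwriting it.
        shutil.copy2(target, target.with_name(target.name + ".bak"))
        target.write_bytes(patched)
    return count


# Example call, mirroring the sed command:
# patch_file("DD2.exe", b"wine_get_version", b"win3_get_version")
```

Unlike `sed -i`, this keeps the original file around, which makes it easy to undo the change.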

HansKristian-Work commented 1 month ago

Could you try running with https://github.com/HansKristian-Work/vkd3d-proton/tree/dd2-heap-robustness-hack? Maybe it'll work around the hang.

andrew-ld commented 1 month ago

It still crashes after a few minutes:

Jun 04 19:01:32 arch kernel: amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:5 pasid:32801)
Jun 04 19:01:32 arch kernel: amdgpu 0000:03:00.0: amdgpu:  in process DD2.exe pid 12968 thread vkd3d_queue pid 13387)
Jun 04 19:01:32 arch kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x00008000f0600000 from client 10
Jun 04 19:01:32 arch kernel: amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00501430
Jun 04 19:01:32 arch kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: SQC (data) (0xa)
Jun 04 19:01:32 arch kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
Jun 04 19:01:32 arch kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
Jun 04 19:01:32 arch kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Jun 04 19:01:32 arch kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
Jun 04 19:01:32 arch kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
Jun 04 19:01:42 arch kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:5 pasid:32801)
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:  in process DD2.exe pid 12968 thread vkd3d_queue pid 13387)
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x00008000ecc00000 from client 10
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00501431
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: SQC (data) (0xa)
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:5 pasid:32801)
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:  in process DD2.exe pid 12968 thread vkd3d_queue pid 13387)
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x00008000ecc00000 from client 10
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: CB/DB (0x0)
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:5 pasid:32801)
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:  in process DD2.exe pid 12968 thread vkd3d_queue pid 13387)
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x00008000ecc00000 from client 10
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: CB/DB (0x0)
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:5 pasid:32801)
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:  in process DD2.exe pid 12968 thread vkd3d_queue pid 13387)
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x00008000ecc00000 from client 10
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: CB/DB (0x0)
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
Jun 04 19:01:45 arch kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
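For reference, the per-field lines in the log are just bit slices of the raw GCVM_L2_PROTECTION_FAULT_STATUS word. The sketch below decodes the three status values from the log. The bit offsets and widths are an assumption about the register layout on this GPU generation, checked here only against the values the kernel printed above, and `decode_fault_status` is a hypothetical helper.

```python
# (bit offset, width) per field. ASSUMED layout, consistent with the
# values printed in the kernel log above but not taken from this thread.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),  # "Faulty UTCL2 client ID" in the log
    "RW": (18, 1),
}


def decode_fault_status(status: int) -> dict[str, int]:
    """Split a raw fault status word into its named bit fields."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


# The three distinct status values that appear in the log.
for raw in (0x00501430, 0x00501431, 0x00000000):
    fields = decode_fault_status(raw)
    print(f"{raw:#010x}: " + ", ".join(f"{k}={v:#x}" for k, v in fields.items()))
```

Under this layout the first two faults decode to client ID 0xa with PERMISSION_FAULTS 0x3, differing only in MORE_FAULTS, while the later entries carry an all-zero status word.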