lutris / lutris

Lutris desktop client
https://lutris.net
GNU General Public License v3.0

GPU Crash Diablo 4 Ray Tracing activated #5389

Closed JulienDlq closed 7 months ago

JulienDlq commented 7 months ago

I recently activated ray tracing in Diablo IV since it is now officially released.

But after a few seconds or minutes of play, the game freezes and stays frozen, although the sound keeps working. I have to kill the process to stop it.

If I deactivate ray tracing, no crash at all, everything is fine.

Here is the content of the logs:

Started initial process 6420 from /home/loup/.local/share/lutris/runners/wine/wine-ge-8-26-x86_64/bin/wine /media/loup/disque1to/Lutris/battlenet/drive_c/Program Files (x86)/Battle.net/Battle.net.exe --exec=launch fenris
Start monitoring process.
fsync: up and running.
wine: Using setpriority to control niceness in the [-19,19] range
[0328/081832.269:ERROR:network_change_notifier_win.cc(143)] WSALookupServiceBegin failed with: 0
[0328/081832.359:ERROR:network_sandbox.cc(302)] Failed to grant sandbox access to cache directory C:\users\loup\AppData\Local\Battle.net\BrowserCaches\common\Cache\Cache_Data: Procédure introuvable. (0x7F)
[0328/081832.363:ERROR:network_sandbox.cc(396)] Failed to grant sandbox access to network context data directory C:\users\loup\AppData\Local\Battle.net\BrowserCaches\common\Network: Succès. (0x0)
[0328/081832.366:ERROR:network_service_instance_impl.cc(270)] Encountered error while migrating network context data or granting sandbox access for C:\users\loup\AppData\Local\Battle.net\BrowserCaches\common\Network. Result: 6: Succès. (0x0)
[0328/081832.737:ERROR:network_change_notifier_win.cc(143)] WSALookupServiceBegin failed with: 0
[0328/081832.763:ERROR:dxva_video_decode_accelerator_win.cc(1459)] DXVAVDA fatal error: could not LoadLibrary: msvproc.dll: Module introuvable. (0x7E)
wine client error:570: write: Mauvais descripteur de fichier
wine client error:60c: write: Mauvais descripteur de fichier
133.081:06d8:06dc:fixme:vkd3d-proton:d3d12_dred_settings_SetAutoBreadcrumbsEnablement: iface 000000001a724270, enablement 2 stub!
133.081:06d8:06dc:fixme:vkd3d-proton:d3d12_dred_settings_SetPageFaultEnablement: iface 000000001a724270, enablement 2 stub!
133.081:06d8:06dc:fixme:vkd3d-proton:d3d12_dred_settings_SetWatsonDumpEnablement: iface 000000001a724270, enablement 0 stub!
133.081:06d8:06dc:info:vkd3d-proton:vkd3d_instance_apply_application_workarounds: Program name: "Diablo IV.exe" (hash: e955b0402a6703a3)
133.081:06d8:06dc:info:vkd3d-proton:vkd3d_instance_deduce_config_flags_from_environment: shader_cache is used, global_pipeline_cache is enforced.
133.081:06d8:06dc:info:vkd3d-proton:vkd3d_config_flags_init_once: VKD3D_CONFIG=''.
133.083:06d8:06dc:info:vkd3d-proton:vkd3d_get_vk_version: vkd3d-proton - applicationVersion: 2.12.0.
133.083:06d8:06dc:info:vkd3d-proton:vkd3d_instance_init: vkd3d-proton - build: 7460c70de0dff08.
133.543:06d8:06dc:info:vkd3d-proton:vkd3d_memory_info_upload_hvv_memory_properties: Topology: Device heaps are split. Assuming small BAR situation. Using HOST_COHERENT only.
133.544:06d8:06dc:info:vkd3d-proton:vkd3d_bindless_state_get_bindless_flags: Enabling fast paths for advanced ExecuteIndirect() graphics and compute.
133.544:06d8:06dc:info:vkd3d-proton:vkd3d_bindless_state_get_bindless_flags: Device supports VK_EXT_mutable_descriptor_type.
133.544:06d8:06dc:info:vkd3d-proton:vkd3d_bindless_state_get_bindless_flags: Device supports ultra-fast path for descriptor copies.
133.544:06d8:06dc:info:vkd3d-proton:vkd3d_bindless_state_get_bindless_flags: Device supports packed metadata path for descriptor copies.
133.544:06d8:06dc:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
133.544:06d8:06dc:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
133.544:06d8:06dc:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
133.572:06d8:06dc:info:vkd3d-proton:d3d12_device_caps_init_shader_model: Enabling support for SM 6.6.
133.572:06d8:06dc:fixme:vkd3d-proton:d3d12_device_caps_init_feature_options1: TotalLaneCount = 3840, may be inaccurate.
133.572:06d8:06dc:info:vkd3d-proton:d3d12_device_determine_ray_tracing_tier: DXR support enabled.
133.572:06d8:06dc:info:vkd3d-proton:d3d12_device_determine_ray_tracing_tier: DXR 1.1 support enabled.
133.572:06d8:06dc:info:vkd3d-proton:d3d12_device_caps_init_feature_level: DX Ultimate supported!
133.572:06d8:06dc:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Remapping VKD3D_SHADER_CACHE to: vkd3d-proton.cache.
133.572:06d8:06dc:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Attempting to load disk cache from: vkd3d-proton.cache.
133.573:06d8:0778:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Performing async setup of stream archive ...
133.575:06d8:0778:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_merge: Merging disk caches.
133.625:06d8:0778:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_merge: Done merging shader caches, existing entries: 18008, new entries: 532.
133.626:06d8:0778:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_merge: Successfully replaced shader cache with merged cache.
133.626:06d8:0778:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Merging pipeline libraries took 52.835 ms.
133.626:06d8:0778:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Mapping read-only cache took 0.295 ms.
133.636:06d8:0778:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Parsing stream archive took 10.047 ms.
133.636:06d8:0778:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Done performing async setup of stream archive.
133.699:06d8:06dc:info:vkd3d-proton:dxgi_vk_swap_chain_init: Creating swapchain (1920 x 1080), BufferCount = 3.
133.700:06d8:06dc:info:vkd3d-proton:dxgi_vk_swap_chain_init_sync_objects: Enabling frame latency handles.
133.700:06d8:06dc:info:vkd3d-proton:dxgi_vk_swap_chain_init_sync_objects: Ensure maximum latency of 3 frames with KHR_present_wait.
133.703:06d8:06dc:info:vkd3d-proton:dxgi_vk_swap_chain_init_waiter_thread: Enabling present wait path for frame latency.
136.849:06d8:0780:info:vkd3d-proton:dxgi_vk_swap_chain_recreate_swapchain_in_present_task: Got 3 swapchain images.
136.865:06d8:0780:info:vkd3d-proton:dxgi_vk_swap_chain_recreate_swapchain_in_present_task: Got 3 swapchain images.
172.715:06d8:07bc:err:vkd3d-proton:d3d12_command_allocator_Reset: There are still 1 pending command lists awaiting execution from command allocator iface 000000001a7db980!
218.547:06d8:0778:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Pipeline cache marked dirty. Flush is scheduled.
219.561:06d8:0778:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Flushing disk cache (wakeup counter since last flush = 87). It seems like application has stopped creating new PSOs for the time being.
[0328/082033.013:ERROR:dxva_video_decode_accelerator_win.cc(1459)] DXVAVDA fatal error: could not LoadLibrary: msvproc.dll: Module introuvable. (0x7E)
[0328/082033.015:ERROR:gpu_init.cc(523)] Passthrough is not supported, GL is disabled, ANGLE is 
240.587:06d8:0778:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Pipeline cache marked dirty. Flush is scheduled.
241.623:06d8:0778:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Flushing disk cache (wakeup counter since last flush = 26). It seems like application has stopped creating new PSOs for the time being.
radv: GPUVM fault detected at address 0x80000e8c9000.
GCVM_L2_PROTECTION_FAULT_STATUS: 0x201430
     CLIENT_ID: (SQC (data)) 0xa
     MORE_FAULTS: 0
     WALKER_ERROR: 0
     PERMISSION_FAULTS: 3
     MAPPING_ERROR: 0
     RW: 0
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:0784:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:0784:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:0784:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:0784:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.282:06d8:0784:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.298:06d8:0780:err:vkd3d-proton:d3d12_command_queue_execute: Failed to submit queue(s), vr -4.
269.298:06d8:0780:err:vkd3d-proton:d3d12_command_queue_signal: Failed to submit signal operation, vr -4.
269.298:06d8:0780:err:vkd3d-proton:d3d12_command_queue_signal: Failed to submit signal operation, vr -4.
269.298:06d8:0780:err:vkd3d-proton:d3d12_command_queue_signal: Failed to submit signal operation, vr -4.
280.469:06d8:0778:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Pipeline cache marked dirty. Flush is scheduled.
281.476:06d8:0778:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Flushing disk cache (wakeup counter since last flush = 41). It seems like application has stopped creating new PSOs for the time being.
[0328/082139.577:ERROR:angle_platform_impl.cc(43)] Renderer11.cpp:2213 (testDeviceLost): The D3D11 device was removed, HRESULT: 0x887A0007
ERR: Renderer11.cpp:2213 (testDeviceLost): The D3D11 device was removed, HRESULT: 0x887A0007
[0328/082139.578:ERROR:shared_context_state.cc(859)] SharedContextState context lost via ARB/EXT_robustness. Reset status = GL_UNKNOWN_CONTEXT_RESET_KHR
[0328/082139.578:ERROR:gpu_service_impl.cc(988)] Exiting GPU process because some drivers can't recover from errors. GPU process will restart shortly.
[0328/082139.595:ERROR:gpu_process_host.cc(990)] GPU process exited unexpectedly: exit_code=34
[0328/082139.596:ERROR:command_buffer_proxy_impl.cc(128)] ContextResult::kTransientFailure: Failed to send GpuControl.CreateCommandBuffer.
[0328/082140.043:ERROR:dxva_video_decode_accelerator_win.cc(1459)] DXVAVDA fatal error: could not LoadLibrary: msvproc.dll: Module introuvable. (0x7E)

Also, here is what I can find in dmesg:

[jeu. 28 mars 08:21:08 2024] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32798, for process Diablo IV.exe pid 7578 thread Diablo IV.exe pid 7578)
[jeu. 28 mars 08:21:08 2024] amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x000080000e8c9000 from client 10
[jeu. 28 mars 08:21:08 2024] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00201430
[jeu. 28 mars 08:21:08 2024] amdgpu 0000:03:00.0: amdgpu:        Faulty UTCL2 client ID: SQC (data) (0xa)
[jeu. 28 mars 08:21:08 2024] amdgpu 0000:03:00.0: amdgpu:        MORE_FAULTS: 0x0
[jeu. 28 mars 08:21:08 2024] amdgpu 0000:03:00.0: amdgpu:        WALKER_ERROR: 0x0
[jeu. 28 mars 08:21:08 2024] amdgpu 0000:03:00.0: amdgpu:        PERMISSION_FAULTS: 0x3
[jeu. 28 mars 08:21:08 2024] amdgpu 0000:03:00.0: amdgpu:        MAPPING_ERROR: 0x0
[jeu. 28 mars 08:21:08 2024] amdgpu 0000:03:00.0: amdgpu:        RW: 0x0
[jeu. 28 mars 08:21:18 2024] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered

Here is my system configuration:

Operating System: Gentoo Linux 2.15
KDE Plasma Version: 6.0.3
KDE Frameworks Version: 6.0.0
Qt Version: 6.6.3
Kernel Version: 6.8.2-gentoo-x86_64 (64-bit)
Graphics Platform: Wayland
Processors: 8 × Intel® Core™ i7-9700K CPU @ 3.60GHz
Memory: 31.3 GiB of RAM
Graphics Processor: AMD Radeon RX 7800 XT

Here is my lutris configuration:

Vulkan support: YES
Esync support: YES
Fsync support: YES
Wine installed: NO
Gamescope: YES
Mangohud: YES
Gamemode: NO
Steam: YES
In Flatpak: NO
[System]
OS: Gentoo 2.15 n/a
Arch: x86_64
Kernel: 6.8.2-gentoo-x86_64
Desktop: KDE
Display Server: wayland
[CPU]
Vendor: GenuineIntel
Model: Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz
Physical cores: 8
Logical cores: 8
[Memory]
RAM: 31.3 GB
Swap: 32.0 GB
[Graphics]
Vendor: AMD
OpenGL Renderer: AMD Radeon RX 7800 XT (radeonsi, navi32, LLVM 17.0.6, DRM 3.57, 6.8.2-gentoo-x86_64)
OpenGL Version: 4.6 (Compatibility Profile) Mesa 24.0.4
OpenGL Core: 4.6 (Core Profile) Mesa 24.0.4
OpenGL ES: OpenGL ES 3.2 Mesa 24.0.4
Vulkan Version: 1.3.275
Vulkan Drivers: AMD Radeon RX 7800 XT (RADV NAVI32) (1.3.274)

If any other information is needed, ask me!

JulienDlq commented 7 months ago

Also, I was using DXVK v2.3 (default). I tested with DXVK v2.3.1 (not default?), same issue.

GloriousEggroll commented 7 months ago

That's a Mesa/RADV driver issue, not a Lutris issue. amdgpu is hitting a hang:

[jeu. 28 mars 08:21:08 2024] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32798, for process Diablo IV.exe pid 7578 thread Diablo IV.exe pid 7578)
[jeu. 28 mars 08:21:08 2024] amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x000080000e8c9000 from client 10
[jeu. 28 mars 08:21:08 2024] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00201430
[jeu. 28 mars 08:21:08 2024] amdgpu 0000:03:00.0: amdgpu:        Faulty UTCL2 client ID: SQC (data) (0xa)
[jeu. 28 mars 08:21:08 2024] amdgpu 0000:03:00.0: amdgpu:        MORE_FAULTS: 0x0
[jeu. 28 mars 08:21:08 2024] amdgpu 0000:03:00.0: amdgpu:        WALKER_ERROR: 0x0
[jeu. 28 mars 08:21:08 2024] amdgpu 0000:03:00.0: amdgpu:        PERMISSION_FAULTS: 0x3
[jeu. 28 mars 08:21:08 2024] amdgpu 0000:03:00.0: amdgpu:        MAPPING_ERROR: 0x0
[jeu. 28 mars 08:21:08 2024] amdgpu 0000:03:00.0: amdgpu:        RW: 0x0
[jeu. 28 mars 08:21:18 2024] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered

It should be reported to them: https://gitlab.freedesktop.org/mesa/mesa/-/issues
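
For readers who want to cross-check the fault report, the status word in the dmesg excerpt above can be unpacked by hand. The following is a minimal sketch, not taken from the driver source: the bit positions in `FIELDS` are an assumption, chosen because they reproduce every field value that amdgpu printed for `0x00201430` (including `vmid:2` from the page fault line). The function name `decode_fault_status` is hypothetical.

```python
# Assumed bit layout of GCVM_L2_PROTECTION_FAULT_STATUS as (shift, width).
# This layout is not stated anywhere in the thread; it is an assumption that
# happens to match the values amdgpu printed for status 0x00201430.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CLIENT_ID": (9, 9),
    "RW": (18, 1),
    "VMID": (20, 4),
}


def decode_fault_status(status: int) -> dict[str, int]:
    """Extract each bit field from the raw status word."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


if __name__ == "__main__":
    for name, value in decode_fault_status(0x00201430).items():
        print(f"{name}: {value:#x}")
```

Under that assumed layout, the decoded `CLIENT_ID` is `0xa` and `PERMISSION_FAULTS` is `0x3`, which agrees with the lines the kernel logged.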

JulienDlq commented 7 months ago

Ok, thanks!

Anyway, I would like to be more accurate in my future reports. What should I look for in order to know whether I should report to mesa, drm/amd, vkd3d-proton, or dxvk?

GloriousEggroll commented 6 months ago

Everything you reported here is fine. Between the vkd3d logs and the amdgpu hang, I can see that the problem is on the side of one of the two. If they need more info, they will ask you. The vkd3d devs also regularly contribute to Mesa, so if it ends up being a vkd3d bug they will see it too and handle it accordingly.

I can't really tell you specifically what to look for. It's one of those things where, if you've seen it before, you recognize where it should be directed, but you wouldn't have known without asking, and I wouldn't have known where to point you without seeing the logs. It's not a big deal if a report lands in the wrong place, as long as it gets redirected to the right one.
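
As a rough aid for that kind of first look, here is a minimal sketch that counts which component printed each error line in a combined Lutris and dmesg log. It does not decide where a bug belongs. The patterns in `SIGNATURES` are assumptions built only from strings that appear in the logs of this thread, and `tally_error_sources` is a hypothetical helper name. For context, `vr -4` in the vkd3d-proton lines is the Vulkan result `VK_ERROR_DEVICE_LOST`.

```python
import re
from collections import Counter

# Hypothetical signature table: each pattern is copied from a log line in this
# thread and mapped to the component that printed it. Not an official scheme.
SIGNATURES = [
    ("amdgpu kernel driver", re.compile(r"amdgpu.*(page fault|timeout)")),
    ("Mesa RADV", re.compile(r"^radv: ")),
    ("vkd3d-proton", re.compile(r":err:vkd3d-proton:")),
]


def tally_error_sources(log_text: str) -> Counter:
    """Count error-looking lines per component; the first matching pattern wins."""
    counts: Counter = Counter()
    for line in log_text.splitlines():
        for component, pattern in SIGNATURES:
            if pattern.search(line):
                counts[component] += 1
                break
    return counts


SAMPLE = """\
radv: GPUVM fault detected at address 0x80000e8c9000.
269.282:06d8:077c:err:vkd3d-proton:vkd3d_wait_for_gpu_timeline_semaphore: Failed to wait for Vulkan timeline semaphore, vr -4.
269.298:06d8:0780:err:vkd3d-proton:d3d12_command_queue_execute: Failed to submit queue(s), vr -4.
[jeu. 28 mars 08:21:08 2024] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:2 pasid:32798, for process Diablo IV.exe pid 7578 thread Diablo IV.exe pid 7578)
[jeu. 28 mars 08:21:18 2024] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
"""

if __name__ == "__main__":
    for component, count in tally_error_sources(SAMPLE).items():
        print(f"{component}: {count}")
```

A tally like this only shows which layers complained. In this thread, both the driver side and vkd3d-proton logged errors, which is consistent with the advice that a report to either project will get redirected if needed.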