ConfettiFX / The-Forge

The Forge Cross-Platform Rendering Framework PC Windows, Steamdeck (native), Ray Tracing, macOS / iOS, Android, XBOX, PS4, PS5, Switch, Quest 2
Apache License 2.0

Driver crash with Radeon Mesa driver (RADV) on Linux #117

Closed boberfly closed 5 years ago

boberfly commented 5 years ago

Hi all,

I've found a segfault in the Vulkan backend on Radeon Mesa (RADV). I built the driver in debug optimised mode so I could see exactly where the crash happens: https://github.com/ConfettiFX/The-Forge/blob/master/Common_3/Renderer/Vulkan/Vulkan.cpp#L474

The crash occurs because VK_DESCRIPTOR_TYPE_ACCELERATION_STRUCTURE_NV is not anticipated by the RADV driver here: https://gitlab.freedesktop.org/mesa/mesa/blob/master/src/amd/vulkan/radv_descriptor_set.c#L656

My guess is that this value should perhaps be set dynamically in pRenderer, depending on what the driver supports, rather than fixed at compile time? I'll let you guys decide... :) https://github.com/ConfettiFX/The-Forge/blob/master/Common_3/Renderer/Vulkan/Vulkan.cpp#L436
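
A minimal sketch of that idea, not The Forge's actual code: build the descriptor pool size list at runtime and only include the acceleration-structure entry when ray tracing is available. To keep it compilable without vulkan.h, the enum and struct below are stand-ins that mirror the Vulkan names and values; `buildPoolSizes`, the `raytracingSupported` flag, and the descriptor counts are all hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Stand-ins mirroring VkDescriptorType / VkDescriptorPoolSize so the sketch
// compiles without the Vulkan headers. Real code would use the Vulkan types.
enum DescriptorType : uint32_t {
    DESCRIPTOR_TYPE_SAMPLER = 0,
    DESCRIPTOR_TYPE_SAMPLED_IMAGE = 2,
    DESCRIPTOR_TYPE_STORAGE_IMAGE = 3,
    DESCRIPTOR_TYPE_UNIFORM_BUFFER = 6,
    DESCRIPTOR_TYPE_STORAGE_BUFFER = 7,
    DESCRIPTOR_TYPE_ACCELERATION_STRUCTURE_NV = 1000165000,
};

struct DescriptorPoolSize {
    DescriptorType type;
    uint32_t descriptorCount;
};

// Hypothetical helper: assemble the pool sizes at runtime instead of using a
// fixed compile-time array. The counts are placeholders, not tuned values.
std::vector<DescriptorPoolSize> buildPoolSizes(bool raytracingSupported) {
    std::vector<DescriptorPoolSize> sizes = {
        {DESCRIPTOR_TYPE_SAMPLER, 1024},
        {DESCRIPTOR_TYPE_SAMPLED_IMAGE, 8192},
        {DESCRIPTOR_TYPE_STORAGE_IMAGE, 1024},
        {DESCRIPTOR_TYPE_UNIFORM_BUFFER, 8192},
        {DESCRIPTOR_TYPE_STORAGE_BUFFER, 1024},
    };
    // Only request acceleration-structure descriptors when the device actually
    // exposes the ray tracing extension; drivers without it never see the type.
    if (raytracingSupported) {
        sizes.push_back({DESCRIPTOR_TYPE_ACCELERATION_STRUCTURE_NV, 1024});
    }
    return sizes;
}
```

In real code, `raytracingSupported` would presumably come from checking the device's extension list for VK_NV_ray_tracing, and the resulting vector would feed the pool-size fields of VkDescriptorPoolCreateInfo.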

Cheers!

boberfly commented 5 years ago

Possible fix I made here: https://github.com/ConfettiFX/The-Forge/pull/118

wolfgangfengel commented 5 years ago

Hey @boberfly, we do not support the Mesa driver package, only the drivers we list as supported. If we started supporting every possible driver/Linux combination, our hardware test farm would grow exponentially.

wolfgangfengel commented 5 years ago

... we also do not provide workarounds for driver bugs ...

boberfly commented 5 years ago

Hi @wolfgangfengel, I think in this case the Mesa driver is simply more pedantic about what is passed to the descriptor pool: VK_DESCRIPTOR_TYPE_ACCELERATION_STRUCTURE_NV isn't supported on this driver, so it's not necessarily a driver bug. I think this driver is worth supporting because SteamOS uses it as its main driver. By the looks of it, the official Intel Vulkan driver on Linux could have the same issue: https://gitlab.freedesktop.org/mesa/mesa/blob/master/src/intel/vulkan/anv_descriptor_set.c#L95

wolfgangfengel commented 5 years ago

That makes sense. So should we switch to supporting this driver instead of the others? If you could choose between the two on Ubuntu, which one would you choose?

boberfly commented 5 years ago

To be honest, it's hard to say. On one hand, the quality of the Mesa drivers is very high, and optionally having the debug optimised drivers might be invaluable for picking up very subtle problems like this one that the other drivers missed. I would say most Linux users will be using this driver by default, especially on later Ubuntu versions, which ship a more recent Mesa driver out of the box and have more success with it in Steam and with the Proton compatibility layer for Windows games. On the other hand, the official AMD driver taps into their debugging suite, Radeon GPU Profiler, which I am not sure the Mesa one can use.

On the test hardware and exponential growth concern, the situation might not be that bad: both drivers can sit side by side on the same machine, as they are really just tapping into the same kernel driver, amdgpu. To explicitly select the driver I want, I set an environment variable before running:

export VK_ICD_FILENAMES=/path/to/icd.json

or

VK_ICD_FILENAMES=/path/to/icd.json ./01_Transformations

Would running the tests twice be reasonable if both drivers can sit on the machine at the same time without reboots?
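
As a rough sketch of what "running the tests twice" could look like, the snippet below runs one test binary once per ICD manifest by prefixing the command with VK_ICD_FILENAMES, the same way as the one-liner above. The helper names `commandForIcd` and `runUnderEachIcd` are hypothetical, and the sketch assumes a POSIX shell and paths without spaces or shell metacharacters.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: builds the shell command "VK_ICD_FILENAMES=<icd> <binary>".
// Assumes neither path contains spaces or shell metacharacters.
std::string commandForIcd(const std::string& icdJson, const std::string& binary) {
    return "VK_ICD_FILENAMES=" + icdJson + " " + binary;
}

// Hypothetical helper: runs the same binary once per ICD manifest and collects
// the raw status values returned by std::system, in the order given.
std::vector<int> runUnderEachIcd(const std::vector<std::string>& icdJsons,
                                 const std::string& binary) {
    std::vector<int> statuses;
    for (const std::string& icd : icdJsons) {
        statuses.push_back(std::system(commandForIcd(icd, binary).c_str()));
    }
    return statuses;
}
```

A call such as `runUnderEachIcd({"/path/to/radv_icd.json", "/path/to/amdvlk_icd.json"}, "./01_Transformations")` would then exercise both drivers on one machine without a reboot; both manifest paths are placeholders for wherever the two drivers install their ICD files.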

I'd like to try more unit tests here and compare the two drivers more and get back to you. Cheers!

wolfgangfengel commented 5 years ago

We use Mesa for testing now