Open SRSaunders opened 1 week ago
Maybe we can add an enum to the device manager that records whether the device comes from Nvidia, AMD or Intel, and then add a cvar, like r_useIntelHacks in former times, wrapped in #if defined(linux).
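A minimal sketch of what such a vendor enum could look like, assuming the classification is done from the PCI vendor ID that Vulkan reports in VkPhysicalDeviceProperties::vendorID. The names GpuVendor and VendorFromPciId are hypothetical and are not taken from the existing device manager code.

```cpp
#include <cstdint>

// Hypothetical enum for illustration; not part of the existing device manager.
enum class GpuVendor { Nvidia, Amd, Intel, Other };

// Classifies a GPU by its PCI vendor ID. In a Vulkan backend this value would
// come from VkPhysicalDeviceProperties::vendorID (assumption about the call site).
constexpr GpuVendor VendorFromPciId( std::uint32_t vendorID )
{
	return vendorID == 0x10DE ? GpuVendor::Nvidia  // NVIDIA
		 : vendorID == 0x1002 ? GpuVendor::Amd     // AMD
		 : vendorID == 0x8086 ? GpuVendor::Intel   // Intel
		 : GpuVendor::Other;                       // anything else
}
```

A vendor-specific cvar could then be checked only when the detected vendor matches, which keeps the workaround away from unaffected hardware.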
I remember those old r_skipIntelWorkarounds cvars. However, they were used to influence the setting of booleans for feature availability (e.g. timer queries). In this case, the HiZ depth buffer feature is already controlled by the cvar r_useHierarchicalDepthBuffer. So using one cvar to set another seems over-complex unless there are other GPU-specific cases which I am not aware of at this point.
I would rather use conditional #if defined(linux, mac) logic combined with integrated GPU detection to directly control the r_useHierarchicalDepthBuffer cvar. Note my original suggestion above avoids both of these options and just disables it for iGPUs on all platforms. While required on linux and macOS, I realize this might be unnecessary on Windows. However, in the past I have seen memory problems that cause crashes on linux turn out to be more general in nature: they don't crash on Windows, but there is still something wrong underneath. Linux can act like a "canary in the coal mine" for subtle issues. That's why I asked earlier whether you want to look at this, in case you can spot something wrong with mipmapgen (e.g. sync, memory, etc) that I have missed.
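To make the two options concrete, here is a rough sketch of the decision logic only. It is not the patch proposed for DeviceManager_VK.cpp. Platform, DeviceType and both function names are hypothetical; in a real Vulkan backend the device type would presumably be read from VkPhysicalDeviceProperties::deviceType and compared against VK_PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU, and the platform would be fixed at compile time rather than passed in.

```cpp
// Hypothetical types for illustration only.
enum class Platform { Windows, Linux, MacOS };
enum class DeviceType { Integrated, Discrete, Other };

// Option A: disable HiZ on integrated GPUs on every platform.
constexpr bool AllowHiZ_AllPlatforms( DeviceType type )
{
	return type != DeviceType::Integrated;
}

// Option B: disable HiZ on integrated GPUs only where crashes were observed
// (linux and macOS), leaving Windows untouched.
constexpr bool AllowHiZ_PlatformGated( Platform platform, DeviceType type )
{
	return !( type == DeviceType::Integrated && platform != Platform::Windows );
}
```

The result would be used to force r_useHierarchicalDepthBuffer off at device creation. Option A is simpler and also covers the case where Windows has the same underlying problem without crashing; Option B limits the change to the platforms where the failure was actually seen.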
I can look into the mipmap gen pass, but it's pretty much just copied from the Donut framework and it does not trigger any validation errors, so it is more likely that we have a driver bug on Linux. Is it the same behaviour on macOS?
> Is it the same behaviour on macOS?
Yes, when running the master branch on Apple Silicon/arm64 (my M1 laptop), the HiZ buffer feature can cause an out-of-device-memory crash. It does not happen on my x86_64 machine with a discrete AMD 6600XT GPU. And I can't test using macOS with an Intel iGPU since my ancient Apple Intel laptop is too old and won't run the game - but that combo is mostly irrelevant these days in any case.
I suspect this is not a driver bug, since it happens in two different environments: linux and macOS. There is also a clue when running on macOS with Apple Silicon: when I enable either new SSAO or TAA or both, the crash does not appear with HiZ enabled on the master branch. When both new SSAO and TAA are disabled, the crash happens immediately with HiZ enabled. Without HiZ no crash occurs in any situation. That's what makes me suspect a compute shader resource or sync issue with HiZ enabled.
When the Hierarchical Depth Buffer is enabled (either for the old SSAO, or the new Filmic Post FX passes), I see problems on Integrated GPUs as follows:
If I set r_useHierarchicalDepthBuffer = 0 the problem goes away for both cases above. I have looked at the mipmapgen shader code and played with the parameters a bit, but I don't see anything obviously wrong, other than potential memory issues. I am wondering whether there is a subtle sync + resource release delay problem that is exposed on iGPUs, which are slower and have more limited memory.
If you don't want to research this, I can easily patch it by adding the following code to DeviceManager_VK.cpp:
Please let me know what you want to do with this.