Open aufkrawall opened 4 years ago
Sadly, there is no way I can debug this: I have neither the game nor the hardware. Maybe the Mesa guys will find a fix that I can look at.
Maybe I found the problem: 6f541b1a9ec5d7acf1c29309caf52daa8d5d1952. Could you try this build: vkBasalt.tar.gz
I've patched the commit into 0.1.0, but it unfortunately doesn't affect the issue.
Edit: With the source you've provided, Doom and SotTR (Linux native) crash at start. This also applies to the SMAA branch, which is probably what the linked source is? :)
Does the master branch work? There is a build:
https://github.com/DadSchoorse/vkBasalt/issues/30#issuecomment-552176075
Oh, and please make sure you are not using an old vkBasalt.conf.
The master branch crashes too, even when ~/.local/share/vkBasalt/vkBasalt.conf is deleted.
Unfortunately, there doesn't seem to be any interesting terminal output when it happens.
Oh, sorry, the ~/.local/share/vkBasalt/vkBasalt.conf should stay; that file is updated when a new build is installed. I was referring to the vkBasalt.conf in the game folder. And please send me the terminal output. You could also try enabling the validation layers with VK_INSTANCE_LAYERS=VK_LAYER_KHRONOS_validation, if you have them installed.
AFAIR, the AUR PKGBUILD doesn't create the user config at that path by default, but I've copied the updated example config there manually.
Also vkcube crashes: vkbasalt.log
Reading the crash log and looking at the files of the package: only a full_screen_rect.vert.spv is included, not a full_screen_triangle.vert.spv. And for some reason, it looks for that file at the .local/share path, where only the config is located?
So you do not have a file called ~/.local/share/vkBasalt/shader/full_screen_triangle.vert.spv?
Yeah, don't use the PKGBUILD with newer versions...
So you do not have a file called ~/.local/share/vkBasalt/shader/full_screen_triangle.vert.spv?
Nope. :) I'll try compiling without PKGBUILD.
@DadSchoorse It works with the source you provided in https://github.com/DadSchoorse/vkBasalt/issues/34#issuecomment-554237618 , fantastic. :)
Funny observation: it also "fixes" the Mesa overlay, and the fps are still as expected with async compute. It just doesn't "fix" the Steam overlay fps counter; that one still reduces performance in Doom.
Do you think the Mesa overlay could implement something similar?
Well, vkBasalt waits until the game wants to present the rendered frame and does everything it needs to do after that; then it presents the frame. I don't know how the Mesa overlay works, but that should be possible for any layer that manipulates the final frame.
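The ordering described here, intercept the game's present call, run the post-processing on the finished frame, then forward the present to the driver, can be sketched schematically. This is not real Vulkan code: real layers hook vkQueuePresentKHR, while the mock below just uses strings as stand-ins for GPU work so the ordering is visible.

```cpp
#include <string>
#include <vector>

// Schematic mock of a post-processing layer's present hook.
// The point is the order of operations, not the Vulkan API.
struct MockLayer {
    std::vector<std::string> log;

    // The game "calls present"; the layer's hook runs instead.
    void hooked_present(const std::string& frame) {
        // 1. The game has finished rendering and wants to present.
        log.push_back("game requested present of " + frame);
        // 2. The layer applies its effects to the finished frame...
        log.push_back("layer applied effects to " + frame);
        // 3. ...and only then forwards the present to the driver.
        log.push_back("driver presented " + frame);
    }
};
```

Because the effect pass sits between the game's present request and the real present, it works no matter which queue the game rendered on; the trade-off is that the frame leaves the layer one step later.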
@DadSchoorse We now unfortunately have another issue: with async compute, the input lag is increased by at least one frame.
How did you measure input lag?
I didn't measure it, but the difference in mouse input in the game is very obvious when switching between 8xTSSAA (async compute on) and FXAA (async compute off). This is not the case without vkBasalt.
Maybe I'm missing something really obvious in the code, but I don't think I have a solution for that.
@aufkrawall How does the async-compute-based presentation work? Can you link to any docs on it?
I'm looking for something to do and might tackle this. I have both GCN and Navi cards, and I can pick up the game if need be.
@mcoffin There recently was an explanation by @Plagman in the Mesa ticket conversation: https://gitlab.freedesktop.org/mesa/mesa/issues/946#note_404377 Should be much more illuminating than what I could ever contribute on the matter.
Perhaps Navi isn't affected in the same way as GCN; at least on Windows there are differences between Navi and GCN regarding the RTSS overlay in this regard.
I don't have an AMD card installed anymore, so I unfortunately can't try out whatever comes up.
this issue is unsolvable unless we implement a rasterizer that only uses compute
@DadSchoorse
this issue is unsolvable unless we implement a rasterizer that only uses compute
Would this be within scope, or would it require a ton of effort?
For what it's worth: ReShade's Vulkan compatibility has improved nicely. Unlike vkBasalt, it doesn't have increased input latency when a game presents from the compute queue, but instead it reduces performance further.
On a GTX 1070, enabling just the LUT shader costs -7% in Doom Eternal (uses async compute and presents from the compute queue) vs. -2.3% in Strange Brigade (uses async compute, but doesn't present from the compute queue). I suppose it could cost more on GPUs with better async compute capabilities. May I ask if you're aware of this, @crosire?
Edit: As a side note, the Steam and RTSS (beta) overlays recently received improved compatibility with async present; they can now draw onto the framebuffer without a noteworthy performance hit.
ReShade synchronizes its rendering queue (which is always the first graphics queue) with the queue being presented on. If the game submitted additional work to the same graphics queue in the meantime, then that will make performance drop (since more work is executed before the actual present). This is probably what is happening in DOOM. I suppose this could be fixed by creating a dedicated graphics queue just for ReShade and synchronizing that with the present queue. But then things get a lot more complicated with ensuring resources are synchronized between these queues (e.g. when accessing the depth buffer, which belongs to the game, in a ReShade shader). So for ReShade it is much simpler the way it works right now, which assures correct behavior, but at the cost of some performance.
A dedicated queue is unlikely to completely fix the problem, as the hardware only has one graphics pipe. Your additional queue would potentially let the OS schedule you into an earlier submission gap, but the game submissions tend to be chunky, so I think you'd see more or less the same thing.
Unlike vkBasalt, it doesn't have increased input latency when a game presents from the compute queue, but instead it reduces performance further.
I could probably also achieve that, with the question being if input latency or a performance loss is worse.
A dedicated queue is unlikely to completely fix the problem, as the hardware only has one graphics pipe.
And only NVIDIA exposes more than one graphics queue anyway.
this issue is unsolvable unless we implement a rasterizer that only uses compute
Would this be within scope, or would it require a ton of effort?
The built-in effects always draw a full-screen triangle, so making those work in compute is pretty easy (the first release already drew with a compute shader). The problem is that ReShade shaders can draw arbitrary triangles (and other primitives), so that is pretty much impossible to do in a generic, highly performant way.
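The full-screen triangle mentioned here is the standard trick of deriving one oversized triangle purely from the vertex index, so no vertex buffer is needed and the whole screen is covered by a single primitive. A small sketch of the usual index-to-position math (in GLSL this is typically computed from gl_VertexIndex; the function name here is just for illustration):

```cpp
#include <array>

// Classic full-screen triangle: three vertices derived purely from
// the vertex index. In normalized device coordinates the triangle is
// (-1,-1), (3,-1), (-1,3) - twice the screen size, so the
// [-1,1]x[-1,1] viewport is fully covered and the overhang is clipped.
std::array<float, 2> fullScreenTrianglePos(int vertexIndex) {
    // uv is (0,0), (2,0) or (0,2) depending on the index bits.
    float u = static_cast<float>((vertexIndex << 1) & 2);
    float v = static_cast<float>(vertexIndex & 2);
    // Map [0,2] to NDC [-1,3].
    return {u * 2.0f - 1.0f, v * 2.0f - 1.0f};
}
```

Since the built-in effects only ever touch these three fixed vertices, the same pass maps naturally onto a compute dispatch over the screen; arbitrary ReShade geometry has no such closed form.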
Edit: As a side note, the Steam and RTSS (beta) overlays recently received improved compatibility with async present; they can now draw onto the framebuffer without a noteworthy performance hit.
I suspect that these are drawing with compute shaders now.
Thanks for your responses!
RTSS developer Unwinder shared some information about the general concept. Can't judge if it contains any hint for something that isn't already general knowledge:
Added alternate asynchronous On-Screen Display renderer for Vulkan applications presenting frames from compute queue (id Tech 6 and newer engine games like Doom 2016 and Doom Eternal). The implementation is using original AMD's high performance concept of asynchronous offscreen overlay rendering and the principle of asynchronously combining it with framebuffer directly from compute pipeline with compute shader without stalling compute/graphics pipelines. ...
https://forums.guru3d.com/threads/rtss-6-7-0-beta-1.412822/page-118#post-5776692
OMG!! Good to know that this issue is related to vkBasalt. It is unbearable to play the game like this.
Yeah, I think it would be better to lose a bit performance instead.
@DadSchoorse Would it be possible to consider this in your current rewrite approach?
GCN can present frames from a hardware compute queue to improve performance. Doom uses this when the game's anti-aliasing is set to either disabled or 8xTSSAA. When combined with vkBasalt, artifacts occur (tested with radv): white stripes, which disappear when async compute is turned off by switching to a different AA mode like FXAA.
It would be nice if these artifacts could be fixed without decreasing performance. Doom is not the only example; this should e.g. also apply to Rage 2, as it uses async compute as well.
Mesa overlay also has issues with it: https://gitlab.freedesktop.org/mesa/mesa/issues/946#note_246418