Closed by drywolf 3 months ago
Thanks @drywolf, I've created an internal ticket to track this issue, and we'll be looking into it.
Thanks @owenzhangzhengzhong 👍
Here the VK_COMMAND_POOL_RESET_RELEASE_RESOURCES_BIT flag is set, so the command pool's resources are freed and reallocated every frame. With the flags set to 0 instead, I got the same performance as on NV.
My guess is that the Nvidia driver optimizes this and ignores the flag, so no resources are reallocated on NV.
Hi, I am trying to figure out why our Vulkan application runs much slower on AMD GPUs than on NVIDIA GPUs. I have already reduced our code to a fairly minimal example, which just runs an empty VK render pass and clears the color image (no draw commands).
This example app runs at ~13000 FPS on my NVIDIA RTX 2080 Ti, but at just ~350 FPS on my AMD RX 5700 XT card.
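For scale, the two frame rates can be converted into per-frame times. This is only a quick sketch of the arithmetic using the approximate FPS figures above; the function and constant names are made up for illustration:

```cpp
// Frame time in milliseconds for a given frame rate.
constexpr double frameTimeMs(double fps) { return 1000.0 / fps; }

constexpr double kNvidiaFps = 13000.0;  // ~13000 FPS on the RTX 2080 Ti
constexpr double kAmdFps    = 350.0;    // ~350 FPS on the RX 5700 XT

constexpr double kNvidiaMs = frameTimeMs(kNvidiaFps);    // about 0.077 ms
constexpr double kAmdMs    = frameTimeMs(kAmdFps);       // about 2.857 ms
constexpr double kExtraMsPerFrame = kAmdMs - kNvidiaMs;  // about 2.78 ms
constexpr double kFpsRatio = kNvidiaFps / kAmdFps;       // about 37x
```

So the AMD card spends roughly 2.8 ms more on every frame, even though the frame contains no draw commands.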
In our production code we are using Vulkan for offscreen/headless rendering, but I also reproduced the same lack of performance when using a VK window + swapchain.
This is my GitHub repo with two "minimal" VK apps for offscreen & window rendering to show the performance issue: https://github.com/drywolf/vsg_amd_perf
Also there is already some discussion about this topic in the following VulkanSceneGraph issue: https://github.com/vsg-dev/VulkanSceneGraph/issues/1208
The big difference between the AMD and NVIDIA GPUs, which I could already see in the Windows Task Manager, is that rendering an empty window/empty framebuffer on AMD produces a ~60% "Copy" workload, while on the NVIDIA GPU there is 0% Copy workload and more than 30x higher FPS.
I already tried lots of experiments with the Vulkan code and the RenderPass/clear parameters, but so far the performance is always the same. (The most recent thing I tried was to make sure the RenderPass clear parameters were set to 0, 0, 0, 0 as mentioned here, but this also did not change the performance.)