ValveSoftware / Dota-2-Vulkan

Tracker for issues specific to the Vulkan version of Dota 2 on Windows, Linux, and macOS
101 stars 17 forks source link

Command buffers are never freed (vkFreeCommandBuffers) #353

Closed kvark closed 3 years ago

kvark commented 4 years ago

I'm trying to run Dota2 Vulkan on top of DX12 (via gfx-portability), and I'm seeing an out-of-memory problem caused by the fact that Dota2 appears to never free the command buffers. Note that neither calling vkResetCommandBuffer or vkResetCommandPool can replace the calls to vkFreeCommandBuffers.

Could you check if that's the case in your code, and confirm if this is an intentional behavior? cc @danginsburg @msiglreith

danginsburg commented 3 years ago

This is intentional behavior. There are some command line options you could use to have it more aggressively trim command pools using vkTrimCommandPoolKHR (assuming VK_KHR_maintenance1 is exposed):

-vulkan_aggressive_command_pool_rebalancing : this will keep a moving average of the number of command buffers used per frame per pool and trim when the command buffers available per frame is more than 2x the 3-frame maximum

-vulkan_trim_all_command_pools - this is an aggressive option with a high perf cost, but this will trim all command pools each frame.

But in general I would not expect OOM - command buffers are being reused from pools and the resources are returned to the pool in Vulkan so I think this may just be a problem of how it maps to DX12?

kvark commented 3 years ago

@danginsburg I don't think vkTrimCommandPoolKHR (or any other command, except for vkDestroyCommandPool) can replace vkFreeCommandBuffers. Unless I'm reading the specification wrong, you are leaking the command buffers, even though they aren't supposed to have any associated resources in the pool (because you trim/reset either the pool of the buffers). Still, how can leaking command buffer objects be intentional?

danginsburg commented 3 years ago

We are not leaking command buffers, we are letting each thread's command pool grow to a maximum size. I'm not sure how you are concluding that we're leaking command buffers. The pools are created with VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT and the command buffers are reset with vkResetCommandBuffer with zero flags which indicates each one should not release resources.

However, the command buffers are continuously being recycled (although the reason for the trim is sometimes a thread's pool has ended up with more command buffers than it needs).

But we are not leaking command buffers, unless there is some major regression I'm not aware of.

danginsburg commented 3 years ago

Also, there is a case where we call vkFreeCommandbuffers, but it only happens when we decide to trim. The purpose of the trim is that it allows the command pool to actually get rid of those resources in the pool. If you aren't seeing vkFreeCommandBuffer ever then we haven't determined we need to trim. Try those command line options and you should start seeing vkFreeCommandBuffer calls. But they shouldn't be needed, sounds like there is some other issue if you are getting OOM.

kvark commented 3 years ago

Ok, I see. Thanks for the detailed answer!

Fyi, we were able to solve this on gfx-portability side. Now we are hitting the maximum of buffer/texture views created (a limit in dx12 heap, not in Vulkan), which I'm not sure what to do with. Is there an option to limit the descriptor pools Dota tries to create? Should I file another issue for this (technically not a bug in Dota)?

On Oct 7, 2020, at 12:21, Dan Ginsburg notifications@github.com wrote:

 Also, there is a case where we call vkFreeCommandbuffers, but it only happens when we decide to trim. The purpose of the trim is that it allows the command pool to actually get rid of those resources in the pool. If you aren't seeing vkFreeCommandBuffer ever then we haven't determined we need to trim. Try those command line options and you should start seeing vkFreeCommandBuffer calls. But they shouldn't be needed, sounds like there is some other issue if you are getting OOM.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or unsubscribe.

danginsburg commented 3 years ago

You can limit it with -vulkan_descriptor_sets_per_pool N. By default N = 1024 but you can set it much lower.