DiligentGraphics / DiligentCore

A modern cross-platform low-level graphics API
http://diligentgraphics.com/diligent-engine/
Apache License 2.0
631 stars 136 forks source link

Memory leak when calling UpdateTexture function with DXD12 and Vulkan #145

Open BugzTroll opened 4 years ago

BugzTroll commented 4 years ago

Setup Using the latest version of DiligentEngine with :

I'm getting memory leaks when calling device->UpdateTexture(...) multiple times using DXD12 backend.

How to reproduce An easy way to reproduce this is by simply creating a texture with device->createTexture(texDesc, nullptr, &texture) and then calling context->UpdateTexture(...) in a for loop (in my case I tested with a loop of 20). You can see the memory go way up and never go back down. The issue is visible even more when using big textures(4K), or textures with Mipmaps.

Results with other backends The issue does not occur when using OpenGL and DxD11 backend. The issue also occurs with Vulkan backend.

Output In the output window there is a bunch of these: Diligent Engine: Info: Created dynamic memory page. Size: 64.00 MB; GPU virtual address 0x284090000 Diligent Engine: Info: Created dynamic memory page. Size: 64.00 MB; GPU virtual address 0x288090000 Diligent Engine: Info: Created dynamic memory page. Size: 64.00 MB; GPU virtual address 0x28c090000 Diligent Engine: Info: Created dynamic memory page. Size: 64.00 MB; GPU virtual address 0x290090000 Diligent Engine: Info: Created dynamic memory page. Size: 64.00 MB; GPU virtual address 0x294090000 Diligent Engine: Info: Created dynamic memory page. Size: 64.00 MB; GPU virtual address 0x298090000 Diligent Engine: Info: Created dynamic memory page. Size: 64.00 MB; GPU virtual address 0x29c090000 Diligent Engine: Info: Created dynamic memory page. Size: 64.00 MB; GPU virtual address 0x2a0090000 Diligent Engine: Info: Created dynamic memory page. Size: 64.00 MB; GPU virtual address 0x2a4090000 Diligent Engine: Info: Created dynamic memory page. Size: 64.00 MB; GPU virtual address 0x2a8090000 Diligent Engine: Info: Created dynamic memory page. Size: 64.00 MB; GPU virtual address 0x2ac090000 Diligent Engine: Info: Created dynamic memory page. Size: 64.00 MB; GPU virtual address 0x2b0090000 Diligent Engine: Info: Created dynamic memory page. Size: 64.00 MB; GPU virtual address 0x2b4090000 Diligent Engine: Info: Created dynamic memory page. Size: 64.00 MB; GPU virtual address 0x2b8090000 Diligent Engine: Info: Created dynamic memory page. Size: 64.00 MB; GPU virtual address 0x2bc090000 Diligent Engine: Info: Created dynamic memory page. Size: 64.00 MB; GPU virtual address 0x2c0090000 Diligent Engine: Info: Created dynamic memory page. Size: 64.00 MB; GPU virtual address 0x2c4090000 Diligent Engine: Info: Created dynamic memory page. Size: 64.00 MB; GPU virtual address 0x2c8090000 Diligent Engine: Info: Created dynamic memory page. Size: 64.00 MB; GPU virtual address 0x2cc090000 Diligent Engine: Info: Created dynamic memory page. Size: 64.00 MB; GPU virtual address 0x2d0090000 Diligent Engine: Info: Created dynamic memory page. Size: 64.00 MB; GPU virtual address 0x2d4090000 Diligent Engine: Info: Created dynamic memory page. Size: 64.00 MB; GPU virtual address 0x2dc200000 Diligent Engine: Info: Created dynamic memory page. Size: 64.00 MB; GPU virtual address 0x2e0200000 Diligent Engine: Info: Created dynamic memory page. Size: 64.00 MB; GPU virtual address 0x2e4200000 Diligent Engine: Info: Created dynamic memory page. Size: 64.00 MB; GPU virtual address 0x2e8200000

Details Here's the descriptor for the texture I'm creating :

Diligent::TextureDesc texDesc;
texDesc.Name = "tex";
texDesc.Type = Diligent::RESOURCE_DIM_TEX_2D;
texDesc.Width = 4096; // we can see the leak faster with big texture
texDesc.Height = 4096; // we can see the leak faster with big texture
texDesc.Format = Diligent::TEX_FORMAT_RGBA8_SINT;
texDesc.CPUAccessFlags = CPU_ACCESS_WRITE;
texDesc.BindFlags = BIND_SHADER_RESOURCE;
TheMostDiligent commented 4 years ago

If you do this in the loop, you have to explicitly Flush the device context. The engine allocates temporary staging memory for every update. This memory is locked until the command buffer is submitted and executed, which does not happen in your scenario.

BugzTroll commented 4 years ago

Flushing the device context doesn't change anything in this case, the memory seems to be still locked.

TheMostDiligent commented 4 years ago

Do you flush it in the loop after every update or after the loop? It needs to be in the loop.

BugzTroll commented 4 years ago

It's done in the loop. In any case, once the updates are done the memory is supposed to be liberated at some point but when I inspect the memory used by my application it never goes down as it does when using DXD11 or OpenGL

TheMostDiligent commented 4 years ago

Ok, there is one more method you have to call in the loop: FinishFrame

So the way memory management works is quite involved: when you call UpdateTexture, the engine has allocate temporary staging memory that CPU can write to and GPU can read from. This memory is actually the same as used for dynamic buffers, which is why it mentions dynamic memory page. Until you flush the context, the command buffer is not even submitted for execution, so every call to UpdateTexture will allocate new memory. When you flush the context, the memory goes into stale state. Stale resources are typically recycled once a frame when present is called. In your case you have to tell the engine to recycle dynamic resources out-of-order by calling FinishFrame.

TheMostDiligent commented 4 years ago

There is another method that is related to this. It however releases non-dynamic stale resources.

Remember that Diligent is a low-level library, so sometimes it requires more explicit control of what is going on as opposed to D3D11 or OpenGL.

BugzTroll commented 4 years ago

Yes I noticed that method , and this one is called at the end of the function SwapChainD3D12Impl::Present, so technically the stale resources should be released at the end of the render loop.

The problem I have at the moment is that in my application, if I use UpdateTexture to update the data of my texture, I'm getting memory leaks and I did not find a way to clear the memory that is allocated by that function. My workaround was to not use that function to update the texture and to set the data at the beginning instead. I think there might be an issue with this function so that's why I tried to recreate the problem using a sample in Diligent and updating the texture in a loop to see If I observe the same behavior.

TheMostDiligent commented 4 years ago

Like I tried to explain, the engine has to allocate the memory every time you call UpdateTexture because there is no other way to do this. If the command buffer has not been submitted, the GPU has not even started working on the commands, so all the update data has to be kept somewhere. After you flush the context, the command buffer goes to the GPU, but the memory is still locked until the GPU is done with the copy and the engine finds out it can recycle it. If you call UpdateTexture 20 times a frame, the engine has to allocate 20 copies, there is no way around. Normally, the GPU is 1-2 frames behind, so if you want to update 20 4K textures every frame, the engine will have to allocate enough memory to store 40 4k texture with 2-frame delay.

D3D11 probably does something similar under the hood, but likely starts to idle the GPU and forces the GPU-CPU synchronization at some point. Diligent does not do this for you because it tries to be explicit. Instead it gives you tools to avoid the situation and implement the solution that is preferable.

The memory is not leaking, it is the engine can't reuse it because how the application issues the commands.

BugzTroll commented 4 years ago

Oh I missed one of your comments earlier. Ok I think I understand.

I was not trying to update the 4K texture every frame, it was done only one time on initialization and after that, the memory was never released. I thought that once the GPU would be done with the commands, the resources allocated by the UpdateTexture function would be released since the function Present calls Flush and FinishFrame.

But from what I understand, I need to call Flush and then FinishFrame manually after each call to UpdateTexture to make sure that we are done with the resources and that they can be freed. Thanks a lot for your time, this was very helpful !

TheMostDiligent commented 4 years ago

Yes, the engine currently does not release dynamic memory. When you a create resource, the logic is basically the same as when you update it. If you don't want to waste a lot of memory when you initialize resource, you can call Flush and FinishFrame after creating every resource. To be absolutely sure, you can also WaitForIdle