RobertBeckebans / RBDOOM-3-BFG

Doom 3 BFG Edition source port with updated DX12 / Vulkan renderer and modern game engine features
https://www.moddb.com/mods/rbdoom-3-bfg
GNU General Public License v3.0
1.38k stars 247 forks source link

Fix a number of memory leaks #828

Closed SRSaunders closed 4 months ago

SRSaunders commented 6 months ago

This pull request fixes a number of memory leaks spread throughout the code. Most of the changes are applicable to all platforms, with only a few applicable to non-Windows and/or macOS only. The changes are:

  1. Fix a tricky leak with polygon and brush memory in the Collision Model, where insufficient memory was being allocated due to differences in on-disk vs. in-memory footprint of cm_polygon_t.material and cm_brush_t.material. The result was a bunch of singleton allocations that would run, but were orphaned and not deallocated on level changes.
  2. Fixed a leak inside RenderBink() where material data was not being freed before parsing.
  3. Fixed a leak in AddResourceFile() when the file was not found.
  4. Fixed a leak in idPolynomial() due to a missing destructor.
  5. Fixed a leak inside the BinkDecoder library where bundles were not being freed in the ~BinkDecoder() destructor.
  6. Fixed a leak inside LoadShader() where shader blob data was not being released (bad one).
  7. Fixed a leak in idSaveGameThread::Enumerate() where the file handle was not being deleted for Linux & macOS.
  8. Fixed a leak in Sys_InitNetworking() where a network interface address structure was not being released for macOS.
  9. Fixed a leak in Sys_DefaultSavePath() where save path string memory was not being released for macOS.
  10. Fixed a leak in idRenderWorldLocal::FreeWorld() where the lightGridPoints idList was not being released. This would happen across all platforms on any map change during normal gameplay or demo playback.
  11. Fixed a leak on completion of demo playback where the entity joints were not being released.
  12. UPDATED: Fixed a leak in SWF Scripts where refCounts for script object deletes could get confused due to multi-threading, solved using std::atomic<int> for refCounts.
  13. UPDATED: Fixed a leak in idTrigger_Touch() where the clipModel was not being released on level cleanup.
  14. UPDATED: Fixed a leak in idFont, where the glyph data was not being released on cleanup/quit.
  15. UPDATED: Fixed a leak in InitSaveProfile() by replacing MakeReadOnly() with TakeDataOwnership(). This makes the file read-only and enables automatic cleanup of file memory on save complete.
  16. UPDATED: Ran valgrind on linux and fixed leaks and memory access errors: a) free verts and indexes for basic surface triangles, b) free allocations for multiplayer networking, c) fix uninitialized variables used in decision logic, d) use correct array dimensions for float[4] renderparms, and e) properly shutdown SDL on exit via SDL_Quit().
  17. UPDATED: Allocate deviceManger instance in idRenderBackend::Init() vs. R_SetNewMode() to avoid repeated reallocations without delete when looping for viable display resolutions. Delete deviceManger instance during final cleanup in idRenderBackend::Shutdown()

With these changes I can run with zero leaks detected for game startup, level load, level exit, change games, optick profiling, demo playback and recording, and quit for macOS and Linux. I used Xcode Instruments and valgrind leak detection tools for my work, but changes 1-6, 10-15, 16 (items a to d), 17 are applicable to all platforms. Items 7-9, 16 (item e) are Linux and/or macOS-specific. I don't have expertise in running similar leak tools on Windows, so I can't guarantee zero leaks on that platform. However, I have tested the above changes on all platforms and they run correctly without error.

Tested on Windows 10, Linux Manjaro, and macOS Ventura (latest).

Update: I just looked at leaks for demo playback, and unfortunately there is something lurking there. I will try to add an additional fix to this PR if I can find it. Issue is now fixed (see items 10 and 11 above).

SRSaunders commented 5 months ago

@RobertBeckebans, valgrind detected a few more things:

  1. Buffer read beyond boundary problems with mapped uniform (joint) buffer updates due to the hard-coded value uniformBufferOffsetAlignment = 256 being also used for source buffer copy size calculations instead of just for buffer alignment. I am testing a fix for this which I will likely add to the larger push constant pull request due to dependencies. UPDATE: Fixed in this branch for all cache types (vertex, index, joint) in https://github.com/RobertBeckebans/RBDOOM-3-BFG/pull/828/commits/2b5efd9cae744a3726af64c77461051cc33fb45e
  2. UPDATE: Fixed a subtle problem with insufficient image memory being allocated for DXT1 and DXT5 compressed mip images. This would cause read beyond allocated memory boundary problems which valgrind picked up. This was tricky to isolate, but the solution is relatively straight forward in the end. Solved in idBinaryImage::LoadFromGeneratedFile() via https://github.com/RobertBeckebans/RBDOOM-3-BFG/pull/828/commits/a7e796a5957724ad2278250de2e609ca18128df1 and https://github.com/RobertBeckebans/RBDOOM-3-BFG/pull/828/commits/b3dada2668fe0d7ae43ac9f8ee6360d8d1a83810
  3. Problems with premature deallocation of material constantRegisters due to calls to EnsureNotPurged() at the start of idRenderBackend::DrawViewInternal(). Because every frame has a 3D pass followed by a 2D GUI pass, this ends up calling FreeData() before Parse() of the 2D materials which releases the static memory for constantRegisters associated with the 3D materials. It's not a memory leak, but a potential data integrity problem. I am not sure about the impact of this (suspect this was cause of intermittent black screen problems on linux), but it is a structural issue which I don't immediately know how to solve. I can make the problem go away by setting r_useConstantMaterials = 0, but I am not sure if this is the right thing to do. UPDATE: This turned out to be a general issue with the Decl Manager which was not setting the parsedOutsideLevelLoad flag properly. Even if in the parsed state, idDeclManagerLocal::FindType() would reset the flag for all decls on FindType() operations within a level. This was leading to problems with inappropriate purging of non-purgeable materials, ultimately leading to the constantRegisters memory issue above. In addition, the Decl Manager was initing all decls to the non-purgeable state which is incorrect. I am surprised it worked before. The problem is now resolved via https://github.com/RobertBeckebans/RBDOOM-3-BFG/pull/828/commits/05f908c0d20e0f6e1057ecc0e16b5c2cb4d54313, where I also added a tweak to the list\<Decl> console commands to show non-purgeable decls (using the ! symbol).

Here is the valgrind output showing no leaks or memory access errors (game start, level load and quit, timedemo, game quit):

[srs@manjaro]$ valgrind --show-mismatched-frees=no --leak-check=full --track-origins=yes ./RBDoom3BFG

==10468== Memcheck, a memory error detector
==10468== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==10468== Using Valgrind-3.21.0 and LibVEX; rerun with -h for copyright info
==10468== Command: ./RBDoom3BFG
==10468== 
==10468== 
==10468== HEAP SUMMARY:
==10468==     in use at exit: 403,232 bytes in 3,888 blocks
==10468==   total heap usage: 4,542,031 allocs, 4,538,143 frees, 4,443,189,539 bytes allocated
==10468== 
==10468== LEAK SUMMARY:
==10468==    definitely lost: 0 bytes in 0 blocks
==10468==    indirectly lost: 0 bytes in 0 blocks
==10468==      possibly lost: 0 bytes in 0 blocks
==10468==    still reachable: 401,216 bytes in 3,867 blocks
==10468==         suppressed: 0 bytes in 0 blocks
==10468== Reachable blocks (those to which a pointer was found) are not shown.
==10468== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==10468== 
==10468== For lists of detected and suppressed errors, rerun with: -s
==10468== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
RobertBeckebans commented 4 months ago

Thank you @SRSaunders and Merry Christmas! I reviewed your pull request and I think you also found some issues that would explain some bugs. I will let the people on Discord playtest this so this was rebased on merged into master.

SRSaunders commented 4 months ago

@RobertBeckebans You are very welcome, and Merry Christmas to you as well!

I have a few minor valgrind fixups that I was going to add. Now this branch is closed should I push those independently, or onto my push-constants branch? I'm not sure what you are planning with that one. Take care and enjoy the holidays.