RobertBeckebans / RBDOOM-3-BFG

Doom 3 BFG Edition source port with updated DX12 / Vulkan renderer and modern game engine features
https://www.moddb.com/mods/rbdoom-3-bfg
GNU General Public License v3.0
1.37k stars 244 forks source link

Multiplayer: Accessing memory after it has been freed #846

Closed reeFridge closed 3 months ago

reeFridge commented 3 months ago

Operating system and version: OS: Arch Linux x86_64 Host: HP EliteBook x360 1030 G4 Kernel: 6.7.0-arch3-1 CPU: Intel i7-8565U (8) @ 4.600GHz GPU: Intel WhiskeyLake-U GT2 [UHD Graphics 620]

Is this for single player or multiplayer? multiplayer

Description of the bug: commit 8867e865e0b40051977b2145c1129df61f98307f and more specific changes in sys_lobby.cpp lead to Segmentation fault when client connects to the running game session

Steps to reproduce:

on the server

  1. Checkout to the 8867e865e0b40051977b2145c1129df61f98307f
  2. Compile the executable
  3. Run the game as server ./RBDoom3BFG +set fs_basepath ~/games_data/doom3bfg
  4. Create private match with privacy 'open to all' and Deathmatch mode
  5. Start the game

on the client (I use the same host)

  1. Run the game as client and connect to the localhost ./RBDoom3BFG +set fs_basepath ~/games_data/doom3bfg +connect localhost

Result

Server will catch a segmentation fault and exit

What did you expect to happen instead?

Server continue to work normally after client has been connected

Additional info:

More details:

idLobby::Initialize call occurs only once at the game launch (for each instance of idLobby) and allocates lzwData + objMemory idLobby::Shutdown call occurs at Match start and frees lzwData + objMemory

then after client connection

idLobby::SubmitPendingSnap call occurs and use freed lzwData + objMemory at line 224 of neo/sys/sys_lobby_snapshot.cpp

Why is necessary to free memory of lzwData + objMemory at idLobby::Shutdown? (to prevent memory leak I think, but it seems like it must be freed in some other place)

@SRSaunders please help.

reeFridge commented 3 months ago

Is this a race condition related to the snapshot jobs?

UPD: No, it is not. It is just an access to the freed memory.

SRSaunders commented 3 months ago

Thanks for reporting this. The change was indeed for fixing a memory leak, but clearly I broke something unintended. I'll look into this today.

UPDATE: I looked at my changes and now realize I made a bad assumption re idLobby::Shutdown. I can fix this pretty easily, but I am concerned about Multiplayer in general. Even after applying the fix (or reverting my changes), when I connect a client to a non-dedicated server, the client performance is atrocious (<10 fps) with drawing artifacts and missing audio. What do you see on your side? Does it work properly for you with decent performance? If so what is your configuration? I must admit I have not really looked at Multiplayer and wonder if things have degraded there with all the nvrhi platform and architectural changes. I suspect there are networking perf/sync issues with client connections.

reeFridge commented 3 months ago

I haven't noticed anything critical on the side of perf/sync honestly. Sure, on my laptop, I have some issues with perf (fps) at the client side, but it is expected since not so much powerful hardware. I also have a desktop with more decent hardware and a dedicated GPU and have no problems with perf/sync while testing multiplayer on it.

reeFridge commented 3 months ago

So, what should we do next? Is the solution from my WIP PR okay, or do you have a better idea about where to put Mem_Free stuff?

SRSaunders commented 3 months ago

You solution will work fine, but I was just going to put the frees in a destructor for the idLobby class. That way the allocation and free occur in the same module, no other code is needed, and clean up is automatic when the game quits.

idLobby::~idLobby()
{
   Mem_Free( objMemory );
   objMemory = NULL;
   Mem_Free( lzwData );
   lzwData = NULL;
}

I have not submitted it yet since I wanted to add a few other items for MP (e.g. fix locked cvar issues if you want to bake lightgrids and/or environment probes for multiplayer maps, and fix an uninitialized variable in idRenderModelDecal revealed by the multiplayer in-game menu).

I was also trying to track down the slowdown issues I have been facing when connecting a client. Interesting that you don't face this. Can you tell me more about your configuration (e.g. machines and OSs for your server and client, specifics about your autoexec.cfg and/or D3BFGConfig.cfg files, WiFi or hardwired network, etc.)?

reeFridge commented 3 months ago

System: Host: archlinux Kernel: 6.6.2-arch1-1 arch: x86_64 bits: 64 Desktop: i3 v: 4.23 Distro: Arch Linux CPU: 6-core AMD Ryzen 5 2600 GPU: AMD Tahiti PRO [Radeon HD 7950/8950 OEM / R9 280] Network: wired

D3BFGConfig.cfg: https://gist.github.com/reeFridge/f57820660792d360b01257b2f7b92a68

RobertBeckebans commented 3 months ago

I close this because the PR #848 by @SRSaunders already fixes it and it has been merged into master.

SRSaunders commented 3 months ago

@reeFridge thanks for the additional system/config data but I think I found the networking problem as per my merged PR. I would be interested if you tested the latest master and found any differences in client rendering quality and/or performance in your setup. Please let me know.