QW-Group / mvdsv

MVDSV: a QuakeWorld server
GNU General Public License v2.0
59 stars 56 forks source link

BSP: Reduce memory during loadmap. #116

Closed dsvensson closed 1 year ago

dsvensson commented 1 year ago

Read lump by lump instead of allocating memory covering the whole map in the temporary region, and sometimes adding an additional malloc that duplicates the data if lumps are not aligned.

This has become more relevant with mappers including lighting in BSPX lump and higher resolution textures, which isn't relevant to the server, but is still temporarily loaded into the heap before this PR.

tcsabina commented 1 year ago

Hmm, heavy changeset... What shall we do @dsvensson? Are you convinced that this is working fine? Looking at the code I cannot judge.... Have you tested it thoroughly?

dsvensson commented 1 year ago

It's not that heavy, just read commit by commit and you'll see that it's a nice little story.

PR TLDR;

Some benchmarks across 50 loads per map. As this performs seeks back and forth it's slightly slower, but as the community only play three maps they're all in the kernel disk buffer and penalty is insignificant, and "if" next map is same as last, there's no loading. To get the numbers the CM_LoadMap call was surrounded with a for-loop, and the if-same-as-last shortcut was temporarily commented out.

name                   old             new    diff
e2m2          15.374660 ms    15.888823 ms  +3.34% 
dm3            7.957853 ms     8.121897 ms  +2.06%
q3wcp17exp3  189.704343 ms   190.504431 ms  +0.42%
mammoth       65.860233 ms    67.147875 ms  +1.96%

The relevant part is the memory reduction. The resolution is crap as testing was done by lowering -mem parameter as far as possible while still loading the map, and it's at MB resolution.

name            old     new  diff
e2m2          11 MB   10 MB  -10%
dm3           10 MB    9 MB  -10%
q3wcp17exp3   63 MB   29 MB  -54%
mammoth       23 MB   14 MB  -39%

The Id1-maps are really a lot smaller (well; all are), but there's the (non-map-related) delayed packets hunk allocation that sets a baseline of 8 MB hunk size or so.

And as noted, in addition to dramatically reducing the amount of memory needed to load a map, it also fixes a mem-leak in the ezquake embedded version by avoiding the Q_malloc call that used to be in CM_LoadMap.

It was developed by testing the server between each commit, and after finished it's the server I've used for local development since then. It hasn't been used for games, but the PR doesn't change anything other than loading so it's a very isolated functionality. The only thing I haven't tested is the physics normals from BSPX. I don't know what that is, but as I do use maps with BSPX lightmaps, the BSPX header has at least been parsed. If you have such a map I would be happy to give it a try.

dsvensson commented 1 year ago

Hmm, heavy changeset... What shall we do @dsvensson? Are you convinced that this is working fine? Looking at the code I cannot judge.... Have you tested it thoroughly?

Got some QPN (I guess Quake Physics Normals) files from Blood_Dog found here. Had missed seeking a bit in the BSPX path which has been fixed. The server now says:

[2023-02-05 15:13:42] Loading BSPX physics normals
[2023-02-05 15:13:42] Loading external physics normals

Wrote a small utility to embed arbitrary blobs as BSPX lumps that can be used to replicate the above.

I've tried with and without QPN on skull and monsoon without being able to clearly identify what it and pm_rampjump actually does, but there's no change between old mvdsv and new mvdsv that I could observe at least. As the maps on the servers don't contain the BSPX lump, it's probably fairly rare that this feature is used for whatever ramp nuance it adds - and I think both client and server needs the QPN to work, which is why BSPX is the better place for it.

qqshka commented 1 year ago

Overall PR looks fine, fix minor issues it has and it could be merged IMO.

tcsabina commented 1 year ago

@dsvensson @qqshka , is this done? ready to merge?

dsvensson commented 1 year ago

Tried about 10k maps, and none of them failed for any related reasons. ~200 had too many models or ridiculous memory requirements, so the added error handling didn't introduce any issues.