Closed ghost closed 9 months ago
As a side note, it would be nice to have real error messages; here, it would have been nice to be told "out of memory": I would have reported this long ago with such a message. It would also be nice not to display the full file path in the logs, and instead say once where the homepath and pakpath are, then use, dunno, $HOMEPATH and $PATHPAK as strings? It would make the logs less noisy and make it easier to remove info about how the filesystem is organised locally.
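A minimal sketch of the path-shortening idea, in Python for brevity rather than in the engine's own language. The function name `shorten_paths`, the directory layout, and the file names are all hypothetical and only illustrate the suggestion; this is not how the engine logs anything today.

```python
def shorten_paths(line, roots):
    """Replace known directory prefixes in a log line with placeholders.

    `roots` maps a placeholder string to the directory it stands for.
    """
    # Try the longest directories first, so that a directory nested inside
    # another one gets its own placeholder instead of its parent's.
    by_length = sorted(roots.items(), key=lambda kv: len(kv[1]), reverse=True)
    for placeholder, root in by_length:
        line = line.replace(root, placeholder)
    return line


# Hypothetical directories, for illustration only.
roots = {
    "$HOMEPATH": "/home/user/.local/share/game",
    "$PATHPAK": "/home/user/.local/share/game/pkg",
}

print(shorten_paths("loading /home/user/.local/share/game/pkg/map-x.dpk", roots))
```

The log would then state the two real directories once at startup, and every later line would only carry the placeholder. A plain substring replace like this one can also match inside an unrelated longer path, so a real implementation would want to match on path boundaries.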
As a side note, it would be nice to have real error messages, here.
What suffers from OOM is likely nacl_loader here, so unless we have a way to produce more logs from nacl_loader, there is not much we can do. Like, it's not our software that got the OOM. There are ways to get more logs from nacl_loader though, but I don't know if OOM is part of them.
By running daemon-tty under valgrind and switching between some maps in a row, it only reports less than a kilobyte wasted.
By doing the same with daemon, almost all the leakage it reports comes from the OpenGL driver on my end.
So, either I don't reproduce the bug, or we don't free something allocated by the OpenGL driver, if we have to do it ourselves.
I noticed that doing disconnect while being on a map also increases memory, like when loading a new map.
I also noticed that loading a new map from the main menu while the same map was loaded before (but left with /disconnect) doesn't increase memory.
Loading the same map in a row increases memory.
So it's like some loaded data from the map is kept and reused if loading the same map again, but quitting a map always loads some data (going from main menu, or going to the same map again).
Actually the size of the loaded images is similar to the leak, I don't know if that's related.
I don't know if the pic buffer passed to re.GenerateTexture is freed.
Some bisecting results when trying to hunt this:
auto-end: 1ff073fa2, leaks 39
auto-end: 25cb8c416, leaks 42
no-run: 6745fd943 43
no-run: 0d77c2135 55
no-run: 8086bc0f8 129
no-run: 8a51faa5e 131
no-run: 966b58afb 138
no-run: 681fb35c1 139
no-run: a60858529 152
python-fail: 73c5cd148 275
python-fail: 7bc60ee34 292
The number at the end is the distance relative to aa0df65ca, the last commit I have in common with you, IIRC. Note that it's not important, as I have not changed daemon's code, and the problem is clearly on daemon's side.
So, about the 1st part of those lines:
Those are all unvanquished commit references. I do not intend to work more on this regression for the foreseeable future.
I gave a shot today with my local "RelWithDebInfo" build, which includes ~10 months of changes plus some local personal ones (warning fixes mostly, nothing important):
% ps --no-headers -orss,vsz,comm $(pidof daemon )
249688 1512248 daemon <- cold boot
619316 2112704 daemon <- 1st /map plat23
745252 2226736 daemon <- 2nd /map plat23
783184 2257284 daemon <- 3rd /map plat23
779296 2252868 daemon <- 4th /map plat23
791592 2265984 daemon <- 5
789544 2264700 daemon <- 6
As the results show, the big leak of 150-200 megs is gone for me. Or rather, it will be once the release is done.
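To make the trend easier to read, here is a small sketch that computes the RSS change between consecutive samples. It assumes the `ps --no-headers -orss,vsz,comm` layout used above, where the first column is the resident set size in kilobytes; the function name `rss_deltas` is made up for this example, and the annotations after `<-` have been dropped from the input.

```python
def rss_deltas(ps_output):
    """Return the RSS change (in kilobytes) between consecutive ps samples."""
    # The first whitespace-separated column of each line is the RSS.
    rss = [int(line.split()[0]) for line in ps_output.splitlines() if line.strip()]
    # Pair each sample with the next one and subtract.
    return [after - before for before, after in zip(rss, rss[1:])]


# The samples quoted above: cold boot, then six consecutive /map plat23 loads.
samples = """\
249688 1512248 daemon
619316 2112704 daemon
745252 2226736 daemon
783184 2257284 daemon
779296 2252868 daemon
791592 2265984 daemon
789544 2264700 daemon
"""

print(rss_deltas(samples))
# prints [369628, 125936, 37932, -3888, 12296, -2048]
```

After the third load the change stays within roughly 12 MB in either direction, which is what one would expect if nothing large is being leaked per map load anymore.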
Since the release, the game no longer frees a lot of memory (more than 100MB on plat23) when a game ends, ultimately leading to crashes from an allocation failure.
On game's menu screen:
Loading thunder:
Re-loading thunder:
One more time:
Restarting game:
Loading anthill:
Re-loading it:
When trying with plat23, I had a similar RAM increase over time, so I do not think it's related to the map itself. Even test-doors increases RAM usage by a similar amount.
@cu-kai found this (I was only wondering why the game was more crashy than before, suspecting another driver issue or something along those lines).
When it runs for long enough to saturate memory, here's the error message one gets: