leezer3 / OpenBVE

OpenBVE- A free train simulator
http://www.openbve-project.net
275 stars 52 forks source link

Bug - Routeviewer Eating memory after reloading route a dozen or so times #824

Closed Dragon951 closed 1 year ago

Dragon951 commented 1 year ago

Description

Routeviewer Eating memory after reloading route a dozen or so times

Reproduction

IT Occurs in any route if you keep F5 Multiple times

Route

Any Route

Train

Any Train

Logs

No Crashes but if not monitored in Task Manager, routeviewer can eat all your system memory and after a while crash your pc

Related information

leezer3 commented 1 year ago

I've never been able to reproduce anything like this sort of memory leak :( s520 also looked, but never found anything either.

https://github.com/leezer3/OpenBVE/commit/9c067fd4e761792357870f65a7e6c099be97319a will solve a couple of very minor leaks, but nothing at all like you're seeing, more in the neighbourhood of a megabyte or two per reload (nothing at all interesting....) This suggests to me it's either not any route, or there's something system specific at work.

Have you got a link to a public route which reliably leaks memory on your system? Your graphics card would also be interesting, as it could well be driver related.

Kenny-Hui commented 1 year ago

The Route Viewer is leaking memory like that for quite a while already (#784), the memory doubles on every reload. (As if the memory from the previous scene did not get freed) Seems like I can reproduce it consistently regardless of route, GPU and OS as well

Interestingly I can only get the 64-bit one to memory leak on Windows (at the time), so I just continue using the 32-bit version But right now on Linux (Kubuntu to be specific) running it under mono, both 32-bit and 64-bit leaks. image

leezer3 commented 1 year ago

Only thing I can think of doubling would be the textures, everything else is comparative peanuts. Can't see why these would fail to unload either....

Nothing odd like content on network / USB drives was there? That feels filesystem related.

Kenny-Hui commented 1 year ago

It's just loaded from another HDD. I initially suspected it's something to do with the x64 build change (Since I spent most of the time testing in-game rather than Route Viewer) Although not sure whether that's the case.

The Route Viewer closes fine here though, and each memory increase is consistent (Unlike ginga's one)

Maybe I'll do some testing later to trace the specific commit when I have time.

leezer3 commented 1 year ago

That would be interesting.

What would also be interesting if you're running in a debugger which can do memory profiling (Visual Studio does this, not sure about MonoDevelop) is which objects are actually leaking. If you take a memory snapshot before and after the doubling, and then click on the objects diff number, this gives an ordered list of the number of additional objects.

leezer3 commented 1 year ago

Also, as noted in the other issue, if you're using the distribution provided Mono, please try the direct Mono repository version, this is somewhat newer, and there's at least one memory leak bug which got solved in newer versions.

Kenny-Hui commented 1 year ago

That would be interesting.

What would also be interesting if you're running in a debugger which can do memory profiling (Visual Studio does this, not sure about MonoDevelop) is which objects are actually leaking. If you take a memory snapshot before and after the doubling, and then click on the objects diff number, this gives an ordered list of the number of additional objects.

image It's not revealing anything special either, it's something else... (I can still confirm that the memory leak does not occur with 32-bit Route Viewer, only x64 on Windows)

What's more interesting is that I cannot get anything below v1.8.2.0 to launch Route Viewer with 64-bit, it always launches with 32-bit. Although I think that's irrelevant to the memory leak

The leak seems to start happening on 1.8.2.1, so probably something between 1.8.2.0 and 1.8.2.1 *Happened after the Animated GIF got merged, works fine before that so presumably it would be the animated gif PR Although I can't get those to compile on my machine so

leezer3 commented 1 year ago

I'd agree that shows nothing at all interesting, which strongly suggests we're hitting a .Net edge case somewhere.

https://github.com/leezer3/OpenBVE/commit/3f046db32aadea6c14d0a9b179459eda0231b77a

That might help a little, but I'd only expect this to matter at all if you're using large numbers of GIF textures.

leezer3 commented 1 year ago

Something I might have found: Please try this with a route containing no non-english characters in station names / messages (filenames should be ok) and see if it leaks.

Kenny-Hui commented 1 year ago

Still leaks on western route, I've even tried that on the Animated Object Demo Route (Though only leaks at around 10-20MB each) Odd. (It's probably not related to the animated gif decoder itself?)

ginga81 commented 1 year ago

I don't know what the message means, but the file name contains Japanese, but I changed the name of the station to English and read it 10 times. 6.5GB 6.56 6.59 6.59 7.35 8.39 9.36 10.3 11.2 12.1GB

leezer3 commented 1 year ago

Still not able to reproduce any significant leaks :( There's something funny happening with non-english charsets which seems to leak a small amount which I haven't got to the bottom of, but that obviously didn't help much.

Made some minor changes, which are I suspect unlikely to help (removed the original restricted pallette code, as this path is only used by PNGs now), and added a forced compaction of the large object heap on reload.

I'll try and see if I can find anything on Linux next week sometime, but as far as I can see at the minute, this isn't our bug,

Kenny-Hui commented 1 year ago

The memory graph might also be interesting to look at (Also 64bit seems to use double the memory 32bit is using)

32bit: image

64bit (The leaking one): image

There's suppose to be 2 big dip following the several seconds after reloading, but the 64bit one only did 1 small dip and 1 big dip Not sure how much relevancy it has to the leak though

leezer3 commented 1 year ago

I'm not seeing anywhere near double the memory usage, or a graph that looks anything like that :( Was that Linux or Windows?

64-bit will (usually) use more memory (as every address is stored as 8 bytes not 4; https://superuser.com/questions/306751/64bits-vs-32bits-processors-memory-consumption ), but the exact memory usage will depend a little on how the .Net garbage cleaner decides to invalidate objects after a reload.

As a secondary note, on Linux, both executables should execute in 64-bit regardless, and should be functionally identical. (the only change is a modification to the PE header) See this issue on the Mono repository: https://github.com/mono/mono/issues/21038

leezer3 commented 1 year ago

glitch

Hmm, that isn't good. (Can't explain how that's happening at the minute, but if you sit on the signal for the demo route, there's a duplicated second arm for a second as it rotates from the inital position after reload)

That smells related.

leezer3 commented 1 year ago

Can't reproduce that glitch again, which is a little worrying.....

However, I've got a way to make it leak now. Hold down F5, continually triggering a reload and it'll leak.

Observations:

ginga81 commented 1 year ago

I tested what the daily-build version happens. 20211117 not leak 20211120 leak So I guess I got up with the changes during this time.

Dragon951 commented 1 year ago

yea, im also using the daily builds and i feel the memory bug is still happening / same

leezer3 commented 1 year ago

If I'm reading the commit logs right, the only change then was this: https://github.com/leezer3/OpenBVE/pull/698

Only thing I can see that leaking with is lots of trains / TFOs, which doesn't really fit with what KennyHui is seeing with the demo route leaking.

ginga81 commented 1 year ago

I didn't only downloaded 20211117 and 20211120 at the time, and even if I try to download the daily build now, I can only download 2022.1.1 or later. 2121.11.18 or 19, I can't check if it was changed, but was it changed in the last two days?

Kenny-Hui commented 1 year ago

Yes I've checked several days ago that it is indeed unrelated to the GIF decoder itself, just haven't got around to reply yet. (All the testing I've done so far is on Windows) It seems to leak in virtually all routes which is very odd

Though interestingly it does not leak on my Windows 7 VirtualBox VM, and this is the only "machine" where it does not leak memory. (Even VBox on Win10 leaks) The memory graph is also completely different to my testing on other VM/Host OS image It also only uses ~100MB on my (well, unreleased routes), while on my other VM/host it uses up to 200MB on load. Maybe I'll setup another VM later with the same config as the Win7 one and see what caused it

leezer3 commented 1 year ago

There should have been no builds between the 18th and the 20th https://github.com/leezer3/OpenBVE/commits/master?after=6ac6489e6bc145d090d439dafdef0c3f71d3def6+209&branch=master&qualified_name=refs%2Fheads%2Fmaster

Trouble with keeping builds long-term is that there isn't unlimited disk space. (20gb total on the VPS) I normally clear them after 6-7 months or so.

Kenny-Hui commented 1 year ago

btw Is there a way to obtain the latest nightly build (Programmatically) other than to scrap the nightly build site? Ideally I would like to setup a script to auto-download the latest nightly build (Or maybe a batch script to setup a debugging environment that I can just run it on VM and get everything ready), but I can't seems to find a way yet

leezer3 commented 1 year ago

Nope, not at the minute, never thought of that :( Could probably setup a job on the server to update an XML whenever it's pushed & then use much the same as the main menu update button, but that would be a future 'plan'.....

Kenny-Hui commented 1 year ago

Nope, it just leaks on Win10, exact same configuration. Discovered something else though, the memory freed properly when the settings in the "Options" dialog is changed and need reloading.

Ended up with this: https://github.com/leezer3/OpenBVE/blob/6ac6489e6bc145d090d439dafdef0c3f71d3def6/source/RouteViewer/ProgramR.cs#L393

https://github.com/leezer3/OpenBVE/commit/1896d1e689ead4b3dc84155debaed15b7e2b472f (Apr 13) introduced this I think, and I opened an issue on Apr 28, the time frame seems to make sense

Replacing it with Renderer.Reset() just fixes it for me, I assume there won't be any side effect(?).

The memory graph is still different and uses more memory on my machine, maybe they changed the way garbage collection works in later version of .NET Framework? (IIRC I only setup 4.6.2 on my Win7, while Win10+ comes with 4.8), but either way I think the above code is the core issue, just some .NET version coincidentally handled the leak

Dragon951 commented 1 year ago

I'm on windows 11 and it's leaking

leezer3 commented 1 year ago

Replacing it with Renderer.Reset() just fixes it for me, I assume there won't be any side effect(?).

The memory graph is still different and uses more memory on my machine, maybe they changed the way garbage collection works in later version of .NET Framework? (IIRC I only setup 4.6.2 on my Win7, while Win10+ comes with 4.8), but either way I think the above code is the core issue, just some .NET version coincidentally handled the leak

Now there's an interesting observation. https://github.com/leezer3/OpenBVE/blob/master/source/LibRender2/BaseRenderer.cs#L476

The reset function first purges the object and texture caches, then calls the Initialize() function. That would suggest that it's the order the reset is done in that causes the glitch.

leezer3 commented 1 year ago

Yep, that helps quite a bit with the F5 reload leak I can actually reproduce.

Memory leak is reduced to about the same as 1.4.2 with that change. I think the remaining leak is probably the font thing I haven't got to the bottom of yet.

@ginga81 please see if the build from today helps.

leezer3 commented 1 year ago

Most recent build from today also adds a (hidden) option to check for nightly build updates.

For this, please manually edit your options.cfg file, and set dailybuildupdates (In the Interface section) to true. The check for updates function will then look at nightly builds as opposed to release builds.

Not sure where / how to put a GUI checkbox for this yet, we want something like an update channel concept (e.g. stable / nightly channels)

ginga81 commented 1 year ago

I checked the newest daily build. It is not happen leaks. I was reload 10 or more. But it seems not increase.

ginga81 commented 1 year ago

I wanted to be able to download all past daily builds so that I could find out when any of these changes caused problems with any of the daily builds. It will be useful for update check from now on, but what disappeared like this time has no choice but to build. At least if there is an address where the build source for that day was made from here, even I, who can compile for the time being without knowing github, will be able to compile and create the same thing and verify it. Or how about creating multiple free throwaway accounts on Googledrive and sorting them by number of years?

Kenny-Hui commented 1 year ago

Google Drive doesn't seems suitable for project like this though, the best solution is probably to just keep track of it locally.

Assuming you don't need macOS build, you can already store these builds twice as long as the VPS can with 20GB With 200GB of spaces I can probably store these builds for at least the next 31 years :P (Which is unnecessarily long anyway)

ginga81 commented 1 year ago

I tried working on the daily build after the leak fix on 11/4 for a day, but I don't think it leaked. Thanks for the fix! I think this is fine. I think it can be closed if the person who proposed it checks it and finds no problems.

Dragon951 commented 1 year ago

mine is also fixed, thank you