Open fredyshox opened 7 months ago
I investigated this and have some preliminary conclusions.
First, if onscreen/offscreen rendering is turned off, the large leak does not occur: only about 1 MB of memory leaks per close-reset. I then turned on a magic key called `debug_physics_world`, which bypasses all asset loading but still opens a window. The leakage remains about 1 MB per close-reset, which suggests the leak has nothing to do with Panda's rendering service; it must be that some assets fail to be destroyed. Finally, by ablating rendering for different objects, I found that the terrain is not destroyed and remains in memory even after `close` is called. Fixing this alleviates the problem a lot, but about 50 MB still leaks per close-reset...
My branch for fixing this is at: https://github.com/metadriverse/metadrive/tree/fix-memeory-leak
The test script is at: https://github.com/metadriverse/metadrive/blob/fix-memeory-leak/metadrive/tests/test_sensors/test_close_reset_for_3d_render.py
Memory usage (0): 2962.91796875 MB
Memory usage (1): 3140.48046875 MB
Memory usage (2): 3226.16796875 MB
Memory usage (3): 3302.96875 MB
Memory usage (4): 3380.62109375 MB
Memory usage (5): 3439.1796875 MB
Memory usage (6): 3481.4140625 MB
Memory usage (7): 3532.0390625 MB
Memory usage (8): 3574.7578125 MB
Memory usage (9): 3625.41796875 MB
Process finished with exit code 0
The result above is from the latest branch. The issue seems to be greatly alleviated, but something related to rendering is still left in memory.
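As a rough check of the per-cycle figure, the readings in the log above can be differenced directly. The sketch below only does arithmetic on the numbers already reported; the variable names are illustrative and not part of the test script.

```python
# Memory readings (MB) copied from the log above, one per close-reset cycle.
readings_mb = [
    2962.91796875, 3140.48046875, 3226.16796875, 3302.96875, 3380.62109375,
    3439.1796875, 3481.4140625, 3532.0390625, 3574.7578125, 3625.41796875,
]

# Growth between consecutive readings.
deltas = [later - earlier for earlier, later in zip(readings_mb, readings_mb[1:])]

mean_all = sum(deltas) / len(deltas)   # average over all nine cycles
mean_last5 = sum(deltas[-5:]) / 5      # average once the growth has settled

for i, d in enumerate(deltas, start=1):
    print(f"cycle {i}: +{d:.1f} MB")
print(f"mean over all cycles: {mean_all:.1f} MB")
print(f"mean over last 5 cycles: {mean_last5:.1f} MB")
```

The first cycle grows by about 178 MB, the average over all nine cycles is about 74 MB, and the last five cycles average about 49 MB each, which is in line with the roughly 50 MB per close-reset mentioned earlier.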
MetaDriveEnv seems to leak memory when it is repeatedly created and closed.
Memory usage grows with each instantiation of the environment and is not released after closing/deletion, which makes it difficult to run multiple rollouts with different setups within a single process.
Example:
It happens in both onscreen and offscreen modes. Despite calling `close` and explicitly deleting the environment, each instance leaks approximately 2 GB of memory:
It's possible that I'm doing something wrong here; is this the correct way to release the environment?
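For anyone trying to reproduce this, a create/reset/close loop can be wrapped in a small harness like the one below. This is a minimal sketch, not the script from this issue: `measure_close_reset`, `make_env` and `read_memory_mb` are hypothetical names, and the environment is only assumed to expose `reset()` and `close()`.

```python
import gc
from typing import Callable, List


def measure_close_reset(
    make_env: Callable[[], object],
    read_memory_mb: Callable[[], float],
    cycles: int = 10,
) -> List[float]:
    """Create, reset and close an environment `cycles` times.

    Records one memory reading (in MB) after each cycle and returns them all.
    """
    readings = []
    for i in range(cycles):
        env = make_env()
        env.reset()
        env.close()
        del env
        gc.collect()  # rule out Python garbage that is merely awaiting collection
        readings.append(read_memory_mb())
        print(f"Memory usage ({i}): {readings[-1]} MB")
    return readings


# Possible usage (assumptions: metadrive and psutil are installed):
#   import psutil
#   from metadrive import MetaDriveEnv
#   measure_close_reset(
#       lambda: MetaDriveEnv({"use_render": False}),
#       lambda: psutil.Process().memory_info().rss / 1024 ** 2,
#   )
```

Passing the memory reader in as a callable keeps the loop independent of how memory is measured, so the same harness works with process RSS or any other metric.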
A short investigation using `tracemalloc` suggests that the majority of the leaked memory is not allocated by Python code (Panda3D?).
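One way to make that comparison concrete is to measure how much the Python-tracked heap grows across repeated calls. The sketch below uses only the standard-library `tracemalloc` API; `python_heap_growth_mb` and `action` are hypothetical names, and `action` stands in for one create/close cycle.

```python
import tracemalloc
from typing import Callable


def python_heap_growth_mb(action: Callable[[], object], repeats: int = 5) -> float:
    """Return the growth (MB) of Python-tracked allocations over `repeats` calls."""
    tracemalloc.start()
    action()  # warm-up call so one-time caches are not counted as a leak
    before, _peak = tracemalloc.get_traced_memory()
    for _ in range(repeats):
        action()
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return (after - before) / 1024 ** 2
```

`tracemalloc` only sees allocations made through Python's memory allocators. If this figure stays near zero while the process's resident memory grows by gigabytes, the leak is most likely in native allocations, which would be consistent with the Panda3D guess above.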