cmbruns / pyopenxr

Unofficial python bindings for OpenXR access to VR and AR devices
Apache License 2.0

ContextObject.view_loop() has a memory leak #116

Closed DaFluffyPotato closed 1 month ago

DaFluffyPotato commented 2 months ago

Testing the pink_world.py example results in a memory leak (RAM not VRAM).

I added pympler to verify the source of the leak:

from pympler import tracker
from OpenGL import GL
import xr

# ContextObject is a high level pythonic class meant to keep simple cases simple.
with xr.ContextObject(
    instance_create_info=xr.InstanceCreateInfo(
        enabled_extension_names=[
            # A graphics extension is mandatory (without a headless extension)
            xr.KHR_OPENGL_ENABLE_EXTENSION_NAME,
        ],
    ),
) as context:
    tr = tracker.SummaryTracker()

    for frame_index, frame_state in enumerate(context.frame_loop()):
        for view in context.view_loop(frame_state):
            GL.glClearColor(1, 0.7, 0.7, 1)  # pink
            GL.glClear(GL.GL_COLOR_BUFFER_BIT)

        if frame_index % 300 == 0:
            tr.print_diff()

After initialization the output stabilizes, and every subsequent print_diff() call yields the following diff:

                                       types |   # objects |   total size
============================================ | =========== | ============
                                        dict |        6300 |      1.21 MB
                                  memoryview |        1200 |    215.62 KB
                         xr.typedefs.Rect2Di |        1200 |    150.00 KB
               xr.typedefs.SwapchainSubImage |        1200 |    150.00 KB
                               managedbuffer |        1200 |    150.00 KB
                  xr.typedefs.c_long_Array_2 |        1200 |    150.00 KB
  xr.typedefs.CompositionLayerProjectionView |         600 |     75.00 KB
                       xr.typedefs.Extent2Di |         600 |     75.00 KB
                       xr.typedefs.Offset2Di |         600 |     75.00 KB

I've watched the program go from ~100MB to several hundred MB of usage before killing it.
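For anyone who wants to cross-check the pympler numbers with only the standard library, here is a minimal tracemalloc sketch. It is not what was run for this issue; allocation_growth and the two step functions are hypothetical names, and the leak is simulated with bytearrays rather than OpenXR structs.

```python
import tracemalloc


def allocation_growth(step, iterations):
    """Run step() repeatedly and return the net bytes still allocated afterwards.

    Hypothetical helper: it starts and stops tracemalloc itself, so it would
    interfere with tracing that is already active elsewhere in the process.
    """
    tracemalloc.start()
    try:
        before = tracemalloc.take_snapshot()
        for _ in range(iterations):
            step()
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # Sum the per-line size differences between the two snapshots.
    return sum(stat.size_diff for stat in after.compare_to(before, "lineno"))


leaked = []


def leaky_step():
    # Simulated leak: every call keeps 1 KiB alive in a module-level list.
    leaked.append(bytearray(1024))


def clean_step():
    # The buffer is dropped as soon as the call returns.
    bytearray(1024)


leaky_growth = allocation_growth(leaky_step, 1000)
clean_growth = allocation_growth(clean_step, 1000)
print("leaky:", leaky_growth, "bytes; clean:", clean_growth, "bytes")
```

In the real example, step would be one pass of the frame loop body, and a growth figure that keeps rising with the iteration count would point to the same leak.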

Commenting out the following lines from pink_world.py completely removes the leak:

for view in context.view_loop(frame_state):
    GL.glClearColor(1, 0.7, 0.7, 1)  # pink
    GL.glClear(GL.GL_COLOR_BUFFER_BIT)

It seems that the contents of ContextObject.render_layers are not getting deallocated even though ContextObject.frame_loop() does a self.render_layers = [] prior to yielding each frame.
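A small sketch of why rebinding the list may not be enough on its own: self.render_layers = [] only drops the list's references, so an element is freed only if nothing else still points at it. Context and Layer below are hypothetical stand-ins, not the pyopenxr classes, and the extra list simulates whatever else might be holding the views.

```python
import weakref


class Layer:
    """Hypothetical stand-in for a CompositionLayerProjectionView."""


class Context:
    """Hypothetical stand-in for ContextObject; only the list matters here."""

    def __init__(self):
        self.render_layers = []


def element_alive_after_reset(keep_extra_reference):
    ctx = Context()
    layer = Layer()
    probe = weakref.ref(layer)  # lets us observe whether the layer was freed
    ctx.render_layers.append(layer)
    # Simulate some other object that still refers to the layer.
    extra = [layer] if keep_extra_reference else None
    del layer
    ctx.render_layers = []  # same reset that frame_loop() performs
    return probe() is not None


print(element_alive_after_reset(keep_extra_reference=False))
print(element_alive_after_reset(keep_extra_reference=True))
```

On CPython the first call should report the layer as freed and the second as still alive, which is consistent with something other than render_layers keeping the views around.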

I believe all the examples including hello_xr.py have leaks.

I might look more into the issue later if I have the time.

DaFluffyPotato commented 2 months ago

I looked a bit more. It looks like it's the CompositionLayerProjectionViews briefly stored in ContextObject.render_layers that don't get deallocated. The other xr.typedefs items appear to be children of CompositionLayerProjectionView.

Here's some useful ref tracing from pympler on CompositionLayerProjectionView:

<class 'xr.typedefs.CompositionLayerProjectionView'>(1987851081552)-+-<class 'xr.typedefs.SwapchainSubImage'>(1987851082576)--<class 'xr.typedefs.Rect2Di'>(1987851081168)--<class 'xr.typedefs.Offset2Di'>(1987851082960)--<class 'dict'>(1987825452224)({'_wrapper': xr.Offset2Di(x=0, y=0)})
                                                                    +-<class 'xr.typedefs.SwapchainSubImage'>(1987851083344)--<class 'xr.typedefs.Rect2Di'>(1987851083472)--<class 'xr.typedefs.Extent2Di'>(1987851083600)--<class 'dict'>(1987825452544)({'_wrapper': xr.Extent2Di(width=2064, height=2272)})

I had the dictionary print out its contents. I'm guessing it's related to this line in typedefs.py in the Extent2Di case.

Comparing id()s of the SwapchainSubImages indicates that the children of CompositionLayerProjectionView are separate from the SwapchainSubImages that reference CompositionLayerProjectionView.

cmbruns commented 2 months ago

Thank you for investigating this. It does look like those .as_numpy() methods might be creating a circular reference that inhibits garbage collection. I don't remember at the moment whether I created those methods just to allow Python sequence-like access (e.g. 'foo[key] = bar') for sequence-like OpenXR data types, or whether there was some other reason to allow persistent access to the result of as_numpy().
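To make the circular-reference idea concrete, here is a minimal sketch in plain Python. Extent, Wrapper, and Plain are hypothetical classes loosely modeled on the '_wrapper' entry in the pympler trace; the real code in typedefs.py may differ, and the actual leak may involve more than a simple cycle.

```python
import gc
import weakref


class Wrapper:
    """Hypothetical array-like view that keeps a reference to its owner."""

    def __init__(self, owner):
        self.owner = owner


class Extent:
    """Hypothetical struct that caches its wrapper: owner -> wrapper -> owner."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self._wrapper = Wrapper(self)


class Plain:
    """Same idea without the cached wrapper, so no cycle."""


def survives_del(make):
    """Return (alive after del, alive after gc.collect()) for a fresh object."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the automatic collector from running mid-experiment
    try:
        obj = make()
        probe = weakref.ref(obj)
        del obj
        alive_after_del = probe() is not None
        gc.collect()
        alive_after_collect = probe() is not None
    finally:
        if was_enabled:
            gc.enable()
    return alive_after_del, alive_after_collect


print(survives_del(lambda: Extent(2064, 2272)))
print(survives_del(Plain))
```

Reference counting alone cannot free the Extent because the wrapper still points back at it; only a cycle-collector pass can. Dropping the cached back-reference (or holding it through weakref) would let the object be freed as soon as the last outside reference goes away.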

DaFluffyPotato commented 2 months ago

The pympler output was growing exponentially as I increased the depth, but I got hold of an Extent2Di reference (one of the parent referrers of CompositionLayerProjectionView), ran another reference check on it, and got the following:

<class 'xr.typedefs.Extent2Di'>(1535035926096)-+-<class 'dict'>(1535013239104)({'_wrapper': xr.Extent2Di(width=2064, height=2272)})--<class 'xr.typedefs.c_long_Array_2'>(1535035926224)--<class 'managedbuffer'>(1535035926336)--<class 'memoryview'>(1535035883136)

Between this and the earlier reference check, every type that showed up in the leak diff is accounted for. Increasing the pympler depth from 4 to 5 on the Extent2Di reference segfaults. Directly checking the referrers of memoryview with pympler's node trees yields no referrers.
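A referrer check like this can also be approximated with the standard library's gc.get_referrers, which may be a useful cross-check when pympler's tree gets too deep. This is only a sketch: is_held_by is a hypothetical helper, and the dict and memoryview below are toy objects rather than the ones from the leak. gc.get_referrers only reports containers that the collector tracks, so an empty result does not prove that nothing holds the object.

```python
import gc


def is_held_by(target, candidate):
    """Return True if the collector reports `candidate` as referring to `target`."""
    return any(referrer is candidate for referrer in gc.get_referrers(target))


buffer = bytearray(16)
view = memoryview(buffer)

holder = {"_wrapper": view}     # refers to the memoryview
bystander = {"_wrapper": None}  # same shape, but no reference to it

print(is_held_by(view, holder))
print(is_held_by(view, bystander))
```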

DaFluffyPotato commented 2 months ago

The deletions I made in the PR fixed the issue for me. I've never had to do lifetime hacks for anything before, so my fix could be very wrong. lol

DaFluffyPotato commented 1 month ago

Closing this since #117 was merged.