jupyter-widgets / pythreejs

A Jupyter - Three.js bridge
https://pythreejs.readthedocs.io
Other
936 stars 187 forks source link

Garbage collection for embedded html output #217

Closed jpanetta closed 5 years ago

jpanetta commented 5 years ago

The output I get from ipywidgets.embed.embed_minimal_html includes buffers/meshes that are no longer referenced by anything, which in my application leads to files that are many times larger than necessary. Is there a way to force these "leaked" objects to be deleted?

Example code:

from pythreejs import *
from ipywidgets import embed

meshes = Group()
scene = Scene(children=[AmbientLight(intensity=0.5), meshes])

vertices = BufferAttribute(array=np.array(
        [[-1.0, -1.0,  0.0],
         [ 1.0, -1.0,  0.0],
         [ 1.0,  1.0,  0.0]], dtype=np.float32), normalized=False)
index = BufferAttribute(array=np.array(
        [[0, 1, 2]], dtype=np.uint16).ravel(), normalized=False)
geom = BufferGeometry(attributes={
        'position': vertices,
        'index': index,
    })

meshes.add(Mesh(geometry=geom, material=MeshLambertMaterial(color='red', side='DoubleSide')))

c = PerspectiveCamera(position=[0, 0, 3], up=[0, 1, 0])
controls = OrbitControls(controlling=c)
r = Renderer(camera=c, scene=scene, controls=[controls])
embed.embed_minimal_html('tri_embed.html', views=r)

attrib = geom.attributes
attrib['position'] = BufferAttribute(array=np.array(
        [[-1.0, -1.0,  0.0],
         [ 1.0, -1.0,  0.0],
         [ 1.0,  0.25,  0.0]], dtype=np.float32), normalized=False)
meshes.children = [Mesh(geometry=BufferGeometry(attributes=attrib), material=MeshLambertMaterial(color='red', side='DoubleSide'))]

embed.embed_minimal_html('tri_embed2.html', views=r)

This generates the two attached HTML files. The file tri_embed2.html still has a copy of the old MeshModel object c5cbf34f3e844d4996cfb7ae755bbf4b, which is no longer referenced by anything in the JSON state representation.

tri_embeds.zip

Update: I've tried running gc.collect() before the export and I'm still getting the old objects.

vidartf commented 5 years ago

There is an official way to resolve this problem using ipywidgets.embed. The default methods includes the state of all widgets that are available in the kernel (not closed with widget.close()).

What you can do is to call the embed function like this:

state = embed.dependency_state(r)
embed.embed_minimal_html('tri_embed2.html', views=r, state=state)

The dependency_state function will try to automatically determine the dependencies of r, and include them in the state. This is not done by default, since it is not completely fool-proof, but I think it should work well for any pythreejs widget structure.

jpanetta commented 5 years ago

Got it, thanks for pointing me to dependency_state. Unfortunately, it doesn't pull in the morphAttributes, dependencies, since those are stored as a dictionary entry of tuples of Widgets. This can be fixed by editing ipywidgets.embed._find_widget_refs_by_state:

def _find_widget_refs_by_state(widget, state):
    """Find references to other widgets in a widget's state"""
    # Copy keys to allow changes to state during iteration:
    keys = tuple(state.keys())
    for key in keys:
        value = getattr(widget, key)
        # Trivial case: Direct references to other widgets:
        if isinstance(value, Widget):
            yield value
        # Also check for buried references in known, JSON-able structures
        # Note: This might miss references buried in more esoteric structures
        elif isinstance(value, (list, tuple)):
            for item in value:
                if isinstance(item, Widget):
                    yield item
        elif isinstance(value, dict):
            for item in value.values():
                if isinstance(item, Widget):
                    yield item
                # Also support tuples of Widgets within dicts
                if isinstance(item, tuple):
                    for item2 in item:
                        if isinstance(item2, Widget):
                            yield item2

(I added the last 4 lines to the original implementation.)

Do you think this is a reasonable change to make upstream? The documentation for dependency_state explicitly warns that the algorithm does not find widget references in nested list/dict structures, so maybe they want to keep the implementation minimal.

vidartf commented 5 years ago

maybe they want to keep the implementation minimal.

It was mostly an issue of having the function avoid having to deal with recurrent references (e.g. two dicts containing a reference to each other). It is something that can be fixed by adding a memo of seen objects, similar to what deepcopy uses. Then it can be made recursive. It is not overly complicated, just something I did not have the time to include initially.