cdglabs / apparatus

A hybrid graphics editor and programming environment for creating interactive diagrams.
http://aprt.us/
MIT License
1.03k stars 58 forks source link

Added findUnnecessaryNodes #47

Closed joshuahhh closed 8 years ago

joshuahhh commented 8 years ago

Per #44, this PR adds a method to find "unnecessary nodes". It does not yet do anything with them.

It's fun to play with: if you clear out localStorage, restart Apparatus, and run this in the console, it returns an empty array. If you add some shapes to your drawing, it still returns an empty array. But if you delete some shapes, it will return their bits, reporting them as uncollected garbage. Even if you press "New", the garbage will still stick around.

We probably want to do something about this. We can make the deletion methods more intelligent about identifying when something is no longer in use and can be removed entirely. OR we can just regularly find and remove unnecessary nodes as a pre-serialization GC thing, using this method. I am partial to this latter approach, since it means we don't need to be careful to avoid leaks at every turn – we can just detect them in one fell swoop.

Please let me know what you think, @electronicwhisper.

🎄🎄🎄🎄🎄🎄🎄

electronicwhisper commented 8 years ago

Yes, this garbage collection problem is due to how Node works as I'm sure you've put together. The nodes keep references to their variants, even if those variants are removed from any tree. And because of how we serialize, these objects persist forever.

In the short term, I think this GC approach you've implemented will work. Running it before serialization to a file makes sense. Alternatively, maybe it could be run whenever you do Project.removeSelectedElement or Editor.createNewProject. Those are probably the largest culprits of garbage making.

Here's a little gotcha to be aware of: When you press "New", the garbage sticks around, but it doesn't get into the serialized project file (I think). The reason is that the garbage is referenced only from built in objects (Circle, Rectangle, etc.). These built in objects are not serialized but instead just get tagged as built in. This is a complexity that has to be kept in mind when thinking about the serialization/deserialization logic. It may have other implications, I'm not sure. :disappointed:

I would like to, in the longer term, investigate redoing the implementation of Node. I don't like how the implementation stores and manages redundant data (it is not "normalized"). I don't particularly like all the duplication that happens as well, how entire structures need to be copied. This becomes costly in the case of recursive shapes. We can discuss this later in January and perhaps also get Alex Warth's input, who has a lot of experience implementing object systems.

The existing Node system is complex and "leaky". I haven't thoroughly checked this code but it seems like these are good patches. However, if you think the complexity of the system and all the patches it will need is overwhelming, I am amenable to a redesign of the system. But it could also be a fool's errand. Let me know if you have any ideas and/or we can discuss in Jan.

joshuahhh commented 8 years ago

This PR (which contains only diagnostic tools) is useful for further diagnostic work, so I'm gonna land it. Merry Christmas!