ajvincent / es-membrane

An ECMAScript implementation of a Membrane, allowing users to dynamically hide, override, or extend objects in JavaScript with controlled effects on the original objects.
111 stars 13 forks source link

Single-use callback functions (filters, for example) can leak memory in a Membrane #141

Closed ajvincent closed 6 years ago

ajvincent commented 6 years ago

Example: https://developer.mozilla.org/en-US/docs/Web/API/document/createTreeWalker

let walker = document.createTreeWalker(root, NodeFilter.SHOW_ELEMENT, function(element) {
  // this is the dangerous part:  every node which is an element gets a proxy created...
  let classList = element.className.split(/\s+/g);
  return classList.includes("foo") ? NodeFilter.FILTER_ACCEPT : NodeFilter.FILTER_SKIP;
  // ... which the filter may store a reference to the element outside this function.
}, true);

while (walker.nextNode()) {
  // do something
}

Here, the intent of the API is that the filter is a "look once if you have to" pure function. The problem is, it creates a proxy object for each element that goes into the filter, which lives long beyond the filter invocation. In other words, the proxy effectively is a memory leak, until the element is no longer attached to the DOM - and then it's up to the garbage collector to find that and realize that.

There are no light-weight solutions I can see for this. A few solutions I have considered are:

const filter = originGraphObject.createObjectFilter();
{
  let element = filter.addArgument(); // returns proxy
  let className = element.className; // a lambda getter
  let classList = className.split(/\s+/g); // another lambda getter
  let ifCondition = filter.addCondition(classList.includes("foo"));
  filter.addReturn(ifCondition.true, NodeFilter.FILTER_ACCEPT);
  filter.addReturn(true, NodeFilter.FILTER_SKIP);
}

let walker = document.createTreeWalker(root, NodeFilter.SHOW_ELEMENT, filter, true);
while (walker.nextNode()) {
  // do something
}

Verifying purity might be really hard! In the example that started this issue, we refer to NodeFilter.FILTER_ACCEPT and NodeFilter.FILTER_SKIP. These are just numbers (1 and 3, respectively), but the pureness validation code must be aware of that.

The TreeWalker API does allow passing in an object with an acceptNode method, but it's a little unclear whether referencing "this" in a method violates purity in JavaScript. That becomes a problem because the filter object then could be referenced outside the TreeWalker... and that makes it unsafe for conversion.

const filter = document.createNodeFilter();
filter.acceptNode = function(element) {
  if (element.parentNode == filter.lastParent)
    return NodeFilter.FILTER_REJECT;
  filter.lastParent = element;
  return NodeFilter.FILTER_ACCEPT;
};

let walker = document.createTreeWalker(root, NodeFilter.SHOW_ELEMENT, filter, true);
while (walker.nextNode()) {
  // do something
}

Technically, the acceptNode method of the filter isn't "pure", because it refers to the filter object... but the filter comes from the same object graph as the element, so it should be safe. (See #65 for a similar idea.) Specifically, the filter here is a proxy to an unwrapped filter object, whose API the origin graph can safely define. Even if the origin graph provides a vanilla object ({}), the membrane takes care of all object wrapping and unwrapping, so again, it's safe.

If the user sets an unsafe property on the filter object, and the filter's acceptNode method relies on that property, then we're incurring the creation of additional proxies, coming back to the same problem. If the user sets a callback function that the acceptNode method calls, and the callback isn't pure, that brings back the original problem: forcing the membrane to create proxies that might never be used again.

erights commented 6 years ago

I don't understand the starting problem and scenario. Can you sketch a diagram showing the reference graph? Tom's graphical notation would be good here. Thanks.

ajvincent commented 6 years ago

My apologies - I've been too busy to do that diagram. Let me try again, in very short English:

The third argument to createTreeWalker is a callback function. Every time a DOM node meets the other criteria of the tree walker, a proxy for that node must be passed to the callback. Often that means the proxy must be created and maintained within memory until the DOM is dismissed. If the proxy is never referenced again by any code, that's effectively a memory leak.

The proxy can't be GC'd because:

  1. Other nodes in the DOM tree hold a reference to the underlying node
  2. The node holds a reference (by being a key in a WeakMap) to a ProxyMapping object
  3. The ProxyMapping object holds a reference (by object graph name) to the proxy.

Now, if the callback function could be verified as safe to run directly (that is, not wrapped) on the same object graph as the original DOM tree, then we wouldn't need to create all these proxies. Proxies, at least in my membrane's model, are created mostly as needed, so the less proxies created, the better. That's what this is all about.

ajvincent commented 6 years ago

I may be completely wrong about this logic, which is based on the following:

var a = new WeakMap();
var b = {};
var c = {};
a.set(b, c);
c = null;
// assume gc
a.get(b) // returns an object 

Why a.get(b) returns an object - and this is my assumption - is because:

If one of these two conditions for some reason becomes false, I believe the object which was earlier returned may be garbage-collected, if no other strong references to that object exist.

ajvincent commented 6 years ago

I've thought about this some more, and I've decided: this is a "first world problem", not a genuine one. If we create additional singleton proxies as a result of walking a tree, so be it.

mhofman commented 5 years ago

This is something that can be solved by WeakRef once they're available.

My un-informed take is that you could wrap the proxy (and possibly revoke) properties of the records in ProxyMapping.proxiedFields in a WeakRef.