Virtual DOM implementation

rgrempel commented 6 years ago

I've been working away on master at finishing the Virtual DOM implementation, and it finally occurred to me that a PR might be nice.

rgrempel commented 6 years ago

There are a couple of interesting corners in Elm's virtual DOM which become apparent when you dig into it.

Efficient DOM traversal

Elm's Virtual DOM tries to be efficient in its traversal of the real DOM by keeping a count of the descendants of each virtual node, and tagging patches with a global index into that count. At least, I think I'm characterizing this correctly -- in any event, the goal is to be able to know about regions of the DOM which don't need to be traversed, since there are no patches to perform there.

I'm doing a similar thing in (what I think is) a simpler way. I'm tagging each patch with a List Int which represents a series of offsets from the rootNode into its children (and then that node's children, to whatever level the patch takes place at). Then, instead of iterating the DOM, I iterate the patches, and use the indexes to calculate the most efficient way to traverse the DOM. Essentially, I've got two possible starting points ... the root node (which I keep a reference to), or the previous node (which I know the List Int for). Depending on where I was and where I'm going, it may be more efficient to move from the root node down again, or it may be more efficient to move from the previous node to the next node.

The goal is to apply the patches in a reasonably efficient way, but avoid the book-keeping that the Elm code is forced to do. (It's fiddly book-keeping and I didn't relish the thought of implementing it).

Avoid removing & re-applying listeners

When Elm's Virtual DOM applies listeners for events, it creates the listener function in an interesting way. Basically, the listener function needs to apply the decoder to the event, stop propagation or prevent default if those are the options selected, and then pass the result of the decoder up the chain to any taggers etc. So, every listener is exactly the same shape ... only the decoder and the options vary.

The difficulty is that Elm will sometimes be re-applying the same decoder on every update, because determining that decoders are equal is a bit tricky -- sometimes, it works fine, but it's not fully decidable, so there are cases where Elm would be constantly removing and adding listeners. This would probably be a performance problem, and could even be a behaviour problem if it occurs at just the wrong moment.

So, what Elm does is attach the decoder and the options to the listener itself -- that is, to the function object. This way, you can change the decoder and options without removing and adding the listener. This is a really nice optimization, so I'm in the middle of implementing it.

Mapping events

In order to implement map, for mapping messages, Elm maintains a chain of "event nodes" that contain the mapping function, and a reference to their parent. At the top of the chain, the top event node's mapping function actually sends the message into the app's event loop.

These event nodes get attached to DOM nodes, mainly so that when traversing the DOM to apply patches you can get a reference to the "current" event node -- that is, the one which will be the parent if you add a new at a certain point in the DOM. But this again sets up a tricky bit of book-keeping ... one needs to keep the internal parent-child structure consistent with what's stored in the DOM.

So, instead of implementing that, I'm planning a different approach. Each listener would apply its decoder, and then, if successful, re-dispatch a custom message with the results. That would bubble up the DOM in the ordinary way -- we'd literally use the decoder to map the event and re-dispatch it as a message. Then, if there is an Html.map somewhere above it, we'd insert a listener there which listens for this custom message, applies its mapping function, and then re-dispatches the results. At the top, we'd have a listener that listens for the custom message and feeds it into the app's event loop.

I think this will be conceptually clearer, and it lets the message-passing facilities of the browser work for us. There will still be a minor bit of book-keeping to do, because our Html.map virtual nodes don't correspond to a distinct DOM node (they would merely add a listener to the result of rendering a node), so that will still need to be taken into account when traversing & redrawing etc. But I think it will be nice approach -- worth trying, anyway.

rgrempel commented 6 years ago

Hmm. There is a bit of a puzzle in how to diff two Nodes when taking taggers into account.

The problem starts with the fact that we don't have entirely reliable equality checks for taggers. That is, when you supply an Html.map (\a -> something a), we keep a reference to the function, but we won't necessarily get a stable reference to the same function on the next round -- we'll only get a stable reference if you provide a stable reference. So, even if the two functions are produced using the same code, and are fundamentally equal, we may or may not be able to detect that -- we'll get false negatives that we can't really avoid. (Well, we could document how to avoid false negatives, but it won't always be feasible).

Now, the easy way to deal with this would be just to redraw when we hit a tagger that we can't conclude is equal to the last round. However, this isn't what the Elm code does, probably because it would be a significant performance hit -- you'd be redrawing on every round once you hit a false negative, which would be nasty.

So, what the Elm code does is mutate the "tagger" in-place ... essentially, this minimizes the cost of the false negative -- and then continue on to diff the children.

It's possible to implement a similar strategy in Purescript ... we're already mutating event listeners in place for similar reasons. The difficulty comes when we continue on to diff the children. If we can't determine that the taggers are equal, then we don't have evidence that the children are of the same type.

I guess that means that we'd need a version of diff that doesn't insist on the nodes being the same type. I took a look to see what we'd lose in that case, and it's actually surprisingly little. Essentially, it would just limit our ability to determine the equality of decoders -- we wouldn't be able to assume that the decoders were the same type. However, it turns out that Elm's code for determining the equality of decodes does not rely on their being of the same type -- I had noticed this before, but didn't appreciate its significance. So, we'd get (slightly) more false negatives for the listeners (because our equalDecoders code would be less powerful), but we have a strategy to minimize the cost of that. In any event, that's the trade-off the Elm code makes, and it probably makes sense to start off with a similar trade-off.

In theory, there are a couple of possible alternatives. We could complicate the Node type to make it recursively track the types of subnodes, giving the type itself a sort of recursive "tree" structure -- so that we'd "remember" the way in which the node is constructed. Though, I suppose that wouldn't necessarily help, because we couldn't insist that each of the nodes given to diff has the same internal structure ... the whole point is that they may or may not. So, it's more of a run-time thing than a type thing. I suppose we might do something to retain run-time type information about the msg type ... it could be a kind of manual Typeable (which we'd also find useful in effects managers). (Or, a real Typeable, which I worked on for a while two years ago, but I don't want to go down that rabbit hole again right now). But, I'd prefer to stick closer to the Elm types, at least at first, for ease of translating Elm apps down the road.

So, I think I'll experiment for the moment with a less capable version of equalDecoders, and then work on a more sophisticated solution later.

rgrempel commented 6 years ago

I've implemented a version of equalDecoders that doesn't insist on the decoders being the same type. This turns out to have been a very good idea -- it actually makes map2 through map8 work better with equalDecoders (since they cover cases where you want to continue even if you can't match types). So, next task is to make use of that here.

rgrempel commented 6 years ago

Well, we now have event listeners and taggers!

All that remains is:

keyed nodes ... which will be an interesting chunk of work, but there's a lot we can test without them!
actually hooking this into the program, that is, feed the event msgs back into the event loop etc.

And, then some examples & testing, of course!

rgrempel commented 6 years ago

Well, there should be just enough implementation now to try a full program with event handling & everything. But I shall bask in the glory of imagined accomplishment for a bit before the debugging begins!

rgrempel commented 6 years ago

Well, this is working well enough that I can start translating honest-to-goodness Elm programs, and they work! It's not finished ... there are some optimizations I haven't implemented yet (like integrating with requestAnimationFrame), and the keyed nodes aren't implemented yet. However, it probably makes sense to merge this and work on those things later. And, surely some bugs will emerge, but not so far.

pselm / signals