funkia / turbine

Purely functional frontend framework for building web applications
MIT License
687 stars 27 forks source link

Why Turbine doesn't use virtual DOM #32

Open paldepind opened 7 years ago

paldepind commented 7 years ago

@dmitriz

Don't use virtual DOM. Instead, the created view listens directly to streams and updates the DOM accordingly. This avoids the overhead of virtual DOM and allows for a very natural style where streams are inserted directly into the view.

Could you explain this? I've thought the point of virtual DOM was to avoid the overhead of the real one :)

In my opinion, virtual DOM solves a problem that we can completely avoid by using FRP.

Note: In this post, I'm using the words "stream" as a catch-all term for what different libraries call "stream" or "observable" and for what Turbine calls a "behavior".

Frameworks that use virtual DOM almost always represents their views as a function that takes plain state and returns virtual DOM. For instance, the render method on a React component is a function that returns a vnode tree from the components state. This is nice because it means that our view doesn't need to know which parts of our state changed and figure out what changes it then has to make to the DOM. It can just take the entire state and build a new virtual DOM. Then the virtual DOM library will find the differences and figure out how to precisely update the DOM.

When we use FRP the situation is quite different. We represent our state as streams. We can observe them and precisely know when pieces of our state changes. Based on this we can know exactly what changes to make to the DOM. We don't need a virtual DOM library to figure it out.

What I see almost all FRP/observable based frameworks do instead is something like the following: They start out with a bunch of different streams in their model, they then combine those into a single state stream and finally map their view function over the stream. So they go from state in many streams to state in a single stream to a stream of vnodes.

One very minor immediate downside to this approach is that the user has to write a bit of boilerplate code to merge the streams. Flimflam cleverly eliminates this boilerplate by automatically merging streams from an object.

The much bigger issue to the approach is that it throws information away that it then later has to get back by using virtual DOM diffing. Not only that, throwing the information away has a small overhead (the merging) and getting it back has an even bigger overhead (virtual DOM diffing).

To understand the problem let me give an example. Assume we have a model with three values, A, B and C. We express all of these as streams. We want a view with three p tags that each show one of the values.

<p>value of A here</p>
<p>value of B here</p>
<p>value of C here</p>

Whenever one of the streams changes we will have to modify the content of the p tag that it belongs to. What the above mentioned approach will do is something like the following.

const stateStream = combine(aStream, bStream, cStream);
const vnodeStream = stateStream.map(viewFunction);
// apply virtual DOM diffing to vnodeStream 

Note that when we have three streams we can observe each one of them individually and know when they change. But, when we combine them into a single stream we can only observe when any of them changes and don't know which one.

Therefore, when we apply our view function to the combined stream, it too doesn't know which value changed and it will calculate an entirely new vnode tree. Then the virtual DOM library will compare the new vnode tree with the last one it saw. It will then figure out if it was A, B or C that changed and update the view accordingly.

The virtual DOM library goes through the entire diffing process to figure out whether A, B, or C changed. But, we had that information originally before we combined the streams! If we hadn't merged the streams in the first place we wouldn't have needed that at all.

A smart enough view could simply know where each stream belongs in the DOM, observe that stream and update the DOM whenever that one changes. That is what Turbine does. In the above example, all three streams would be given to the view function and they would be inserted directly into the data structure describing the view. Then when the view is created Turbine will do something equivalent to the following

aStream.observe((a) => update first p element);
bStream.observe((a) => update second p element);
cStream.observe((a) => update third p element);

This updates the DOM just as efficiently as what a virtual DOM library would do. But it eliminates the overhead of virtual DOM diffing completely. And virtual DOM diffing does have an overhead. As an indication of that, consider React Fiber. It's a very complex solution to solve, among other things, the problem that virtual DOM diffing can in some cases be so expensive that it takes more than 16ms and causes frames to drop.

We don't have any benchmark yet. But we think the approach Turbine takes can give very good performance. In particular in cases where virtual DOM is ill-suited. I.e. animations and similar things that have to update frequently.

For instance, if you want to implement drag-and-drop in Turbine. We can represent the position of the mouse as a stream and hook that stream directly into the dragged element. Then whenever the mouse moves we will update the element position very efficiently.

To sum it up. Turbine doesn't use virtual DOM because it is unnecessary thanks to FRP. And since not using it leads to a concise way of expressing views where streams do not have to be combined but where they instead can be inserted directly into the view.

Please let me know if the above explanation makes sense and what you think about the rationale.

dmitriz commented 7 years ago

@paldepind

Thank you for taking your time to answer this.

I find it very well written and making a clear point that deserves to be published as blog. People will surely be interested how React is still far from being efficient. ;)

I have actually recently encountered a non-perf-related issue of using the virtual dom when trying to make the un to mount a component onto a virtual element instead of the real dom element. It did not work, surprisingly, with Mithril's renderer, and I understand, it would work with snabbdom that I'd like to support in the future, but I am currently using the Mitrhil's flavour for the silly reason of the shorter api :)

Even though it is a different problem than you mention, it was annoying to discover. And I really like the Turbine's design feature that you can mount your component by simply inlining them in your view. That is a feature I'd like to achieve in un, which can be perhaps seen as very much more simple minded and less powerful version of the Turbine, designwise, and, of course, at the much earlier stage. But my vision is to enable the ability to plug in the Turbine's element factory and component rendering, when more power is needed.

So coming back to the virtual vs real dom, I can see exactly how you can improve the performance by localising the data streams and that is, sorry to bring it up again, also the design feature where un deviates from Redux with its single global store model.

But then, why not use the best of both worlds by trying to remove the boundary between the virtual and the real dom? And why should the consumer even have to think about which one is where. Let me just deal with that one thing called the "element", and it is up the low level piping framework to figure which one is to be used when and where. That would be my idea of a framework :)

This is of course not so much of an issue when you have a single monolith framework, where you only see the one real root dom element. But it becomes the problem if you want to embed your "foreign" components into another framework using virtual dom, and as library writer, I don't really have a choice.

So if I understand your correctly, the localised reactive approach would improve the performance for both virtual and real dom, where the actual distinction feels more like the implementation detail, that is the matter of the layer between the dom and your api unaware of which dom is used.

Thanks again for your explanation, and I'd be curious to know what you think.

jayrbolton commented 7 years ago

:moneybag: I'm sold.

hville commented 7 years ago

FWIW, @adamhaile uses a similar approach in surplus by tying FRP signals directly to DOM nodes. Benchmarks show exceptional performance.

paldepind commented 7 years ago

@dmitriz

I find it very well written and making a clear point that deserves to be published as blog.

Thanks a lot for the encouragement. Once we have the benchmark to prove the merit of our approach I'd definitely like to turn the above into a proper blogpost.

So if I understand your correctly, the localised reactive approach would improve the performance for both virtual and real dom, where the actual distinction feels more like the implementation detail, that is the matter of the layer between the dom and your api unaware of which dom is used.

I don't think so. The approach I describe doesn't work at all when using a virtual DOM.

@hville

Interesting. Thank you for sharing. The link to the benchmark repository will definitely be useful for us. Would you happen to know of similar projects that compare frameworks like that?

dmitriz commented 7 years ago

@paldepind

I find it very well written and making a clear point that deserves to be published as blog.

Thanks a lot for the encouragement. Once we have the benchmark to prove the merit of our approach I'd definitely like to turn the above into a proper blogpost.

I personally find your theoretical arguments here more convincing than any benchmark :) A benchmark is never objective and can only run for a specific situation, which may not be how another user would run it. Plus the current engine, plus the browser, desktop or mobile, ... Too many variables.

Or simply bad architecture. For instance, people complained about Angular's performance when running ng-repeat over 2K+ entries. But why would you ever need to iterate over that many entries at the cost of your user's browser? Even to show more than 10 at once to the user, may not be a great UX. In that use case, how would you run a meaningful benchmark?

Of course, nothing against a nice benchmark demonstration as add-on, but I would not spend much time on it, as people may always conclude their case is different.

So if I understand your correctly, the localised reactive approach would improve the performance for both virtual and real dom, where the actual distinction feels more like the implementation detail, that is the matter of the layer between the dom and your api unaware of which dom is used.

I don't think so. The approach I describe doesn't work at all when using a virtual DOM.

If I understand your point correctly, with RFP you have a fine grained control of what elements have changed. If you pass that information to your VDOM builder that would have the map between the VDOM and DOM, then the builder would know precisely which DOM elements needs to be updated. This is what I mean.

But if the VDOM builder does not store any such map with DOM, and treats the whole new VDOM tree as a new raw string doing the full diffing on every tiny refresh, then yes, it wouldn't be useful. But is the React renderer really so inefficient?

donaldpipowitch commented 7 years ago

Without a VDOM it is not possible to do something like like different renderers (for native, WebGL, Canvas...) or am I wrong?

limemloh commented 7 years ago

@donaldpipowitch I see no reason why Turbine shouldn't be able to achieve the same, as a matter a fact, I have already done some experimenting with Turbine for developing mobile apps a while ago with great success. A Turbine Component doesn't actually have to be tied to DOM. To make a library with Turbine components for building an app in canvas should be possible as well.

trusktr commented 7 years ago

We need a virtual DOM library to figure it out.

Missing don't in OP

trusktr commented 7 years ago

@dmitriz

If you pass that information to your VDOM builder that would have the map between the VDOM and DOM, then the builder would know precisely which DOM elements needs to be updated. This is what I mean.

But there's still overhead in just creating that vdom for no good reason if it isn't for diffing. With Turbine's approach there's no overhead of even creating any vdom at all.

trusktr commented 7 years ago

@donaldpipowitch

To make a library with Turbine components for building an app in canvas should be possible as well.

One way to easily do this right now, while using Turbine as-is, is to just make Custom Elements that render to canvas, then we can manipulate them with Turbine.

f.e.:

element('x-transformable')({transform: [1,0,0,0,1,0,0,0,1]}, [
  element('x-line')({start: [1,2], end: [3,4]})
])
leeoniya commented 7 years ago

virtual dom is simply a means of caching prior dom state in a cheap structure so that the needed dom mutations can be computed and applied; no level of functional purity will mask the fact that the dom itself is mutable api that has state which is expensive to query and expensive to update.

whether a lib rebuilds a full vtree or uses another way of caching prior dom state elsewhere is just semantics, imo.

dmitriz commented 7 years ago

@donaldpipowitch

But there's still overhead in just creating that vdom for no good reason if it isn't for diffing. With Turbine's approach there's no overhead of even creating any vdom at all.

The reason to have the vdom layer for me is to decouple the logic from the implementation details of the DOM. And to make my components more universally reusable for a larger variety of display interfaces other than the DOM.

trusktr commented 7 years ago

@leeoniya

no level of functional purity will mask the fact that the dom itself is mutable api that has state which is expensive to query and expensive to update.

Yeah, but it still needs to be updated regardless if it is from vdom or from FRP. So those equal factors aside, this FRP approach is lighter because there's no vdom, just direct modification, which is what the vdom is ultimately going to do to the DOM too.

whether a lib rebuilds a full vtree or uses another way of caching prior dom state elsewhere is just semantics, imo.

Only if you don't care about the performance differences.

@dmitriz

And to make my components more universally reusable for a larger variety of display interfaces other than the DOM.

I would think that there's an FRP way to map components to any underlying interface.

leeoniya commented 7 years ago

So those equal factors aside, this FRP approach is lighter because there's no vdom

again, "vdom" is just a name for a cache. unless you're diffing the real dom (which is slow), any FRP solution will also have this cache either in lib-space or app-space. nothing stops vdom impls from using observables, streams, immutability, reactivity or other functional principles to control the granularity with which the vdom is diffed, regenerated and the dom mutated.

Only if you don't care about the performance differences.

the overhead of generating vdom structures can vary from very high to very low [1].

EDIT:

i guess it really depends on the template format. if your templates are runtime-declarative (hyperscript), then diffing is inevitable, but if they're parse-time declarative, and can be compiled down to imperative state transitions (FSM) from observables, then no diffing is needed. perhaps that's where the disconnect is. i'm too stupid to assess whether such a templating language would be too restrictive (vs full js), overly DSL-esque (angular v1++) or the compiler would be very complex.

i'd be delighted to see an FRP solution without cache/diffing that's pure FSM -> dom manip that blows away all vdom solutions and still provides a non-grotesque DX while minifying to anything comparable to vdom solutions in non-trivial apps.

[1] https://rawgit.com/krausest/js-framework-benchmark/master/webdriver-ts/table.html

trusktr commented 7 years ago

again, "vdom" is just a name for a cache.

vdom isn't just a cache, it is specifically a tree that represents the DOM structure that you wish to construct or to update DOM to match to.

unless you're diffing the real dom (which is slow), any FRP solution will also have this cache either in lib-space or app-space.

A cache? Sure. A vdom? Not necessarily. "Cache" is not strictly equal to "vdom".

@paldepind I can easily imagine how mapping streams to element attributes works, but I'm not sure I can imagine how dynamically generated arrays of children work without vdom yet. Can you describe how a Turbine component figures out how to add/remove DOM nodes based on dynamic arrays of Turbine component children?

For example, there's list() which is used for this, but how does that work considering that the components in the list can have deep tree structures? What's in place of vdom for this? (vdom starts to seem like merely the easy way of making it work internally)

paldepind commented 7 years ago

First of all. I think this discussion is very interesting. But, let me just say one thing: performance is not the primary goal of Turbine. The primary goal is to create a purely functional framework that is really nice to use. That I think our approach can be made efficient is just a nice side-effect.

But, although performance is not a primary goal, it's still an important goal because it helps us reach the primary goal. A low-cost abstraction is usually a better abstraction. It is an abstraction that one never has to "break out of" to get good performance.

For instance, we want to make it possible to express animations with continous FRP behaviors. In this case, it's important that one can update attributes at each frame without DOM diffing and without allocating garbage (which a vdom does).

paldepind commented 7 years ago

@trusktr

Thank you for the correction. A crucial word to include!

Yeah, but it still needs to be updated regardless if it is from vdom or from FRP. So those equal factors aside, this FRP approach is lighter because there's no vdom, just direct modification, which is what the vdom is ultimately going to do to the DOM too.

Yes. This is spot on. The necessary amount of work we have to do to the DOM is constant. The question is how much work we do to figure out what the necessary amount is.

@leeoniya

Thank you for chiming in. I think describing vdom as a cache is accurate. But it's not just a cache. The cache has a purpose and that purpose is diffing. What if you don't need diffing? Then you don't need a cache either.

nothing stops vdom impls from using observables, streams, immutability, reactivity or other functional principles to control the granularity with which the vdom is diffed, regenerated and the dom mutated.

What Turbine does is make this "granularity" so fine that diffing the vdom isn't necessary. And then the need for having a vdom in the first place goes away.

i guess it really depends on the template format. if your templates are runtime-declarative (hyperscript), then diffing is inevitable, but if they're parse-time declarative, and can be compiled down to imperative state transitions (FSM) from observables, then no diffing is needed. perhaps that's where the disconnect is. i'm too stupid to assess whether such a templating language would be too restrictive (vs full js), overly DSL-esque (angular v1++) or the compiler would be very complex.

Yes. I think this is where the disconnect is :smile: We are not compiling templates though. They are written in plain JS. IMO our templates are a natural extension of the FRP abstractions.

i'd be delighted to see an FRP solution without cache/diffing that's pure FSM -> dom manip that blows away all vdom solutions and still provides a non-grotesque DX while minifying to anything comparable to vdom solutions in non-trivial apps.

I don't understand how we'd do it with a finite state machine. I guess if you look at it abstractly you could think of it as such. But our implementation doesn't look like an FSM at all. Our templates take behaviors/observables as input.

paldepind commented 7 years ago

@trusktr

I can easily imagine how mapping streams to element attributes works, but I'm not sure I can imagine how dynamically generated arrays of children work without vdom yet. Can you describe how a Turbine component figures out how to add/remove DOM nodes based on dynamic arrays of Turbine component children?

For example, there's list() which is used for this, but how does that work considering that the components in the list can have deep tree structures? What's in place of vdom for this? (vdom starts to seem like merely the easy way of making it work internally)

That is a very good question. I asked myself the same thing when I had to implement the first dynamic list in Turbine :rofl:. You are completely right that it's easier to imagine how it works with attributes.

But the answer is that in this case, we do the very same thing that a virtual DOM library would do. In fact what I think any framework would have to do. We do what is called children reconciliation. Let me try and explain.

First, it's important to point out that virtual DOM libraries will only efficiently reorder a list of elements if the user supplies some sort of key property for each vdom node. For instance, if we diff this.

<ul><li>b</li><li>c</li></ul>

With this

<ul><li>a</li><li>b</li><li>c</li></ul>

One may think that a vdom library would realize that in this case it just has to insert an element in the beginning of the list. But I don't know of any vdom library that actually does that. React or Snabbdom will do the following naive set of modifications: 1/ modify the text in the first li from "b" to "a", modify the text in the second li from "c" to "b". 3/ insert a new li elment with text "c". Obviously, that is not the most efficient way to update the DOM.

To solve the problem users have to give their vdom library unique keys on each element. When keys are present the vdom library will essentially stop doing vdom diffing and instead do "key diffing". It will look at two arrays of keys and figure out what to reorder based on those keys. The list function in Turbine does the very same thing.

I think it's interesting to consider how poor virtual DOM libraries actually are at diffing. They pretty much only find the minimal amount of DOM manipulating when one is changing simple attributes and in the key-diffing case. But in practice, it works out just fine because that actually includes most of the changes that one makes to the DOM. Doing a more comprehensive diffing isn't worth it because such "deep-diffing" algorithms are very costly.

The cases where a vdom library is actually efficient with the DOM can both neatly be expressed in Turbine using behaviors as attributes and the list function (but do note that we haven't implemented a proper key-diffing algorithm in list yet).

dmitriz commented 7 years ago

@paldepind Thanks for explaining it so well.

It might be interesting to compare with Inferno that apparently "never needs to "diff" virtual DOM": https://github.com/infernojs/inferno/issues/21#issuecomment-165734790

Also the treatment of the key attribute is different and more permissive: https://github.com/infernojs/inferno/issues/406#issuecomment-249705534

Their vdom seems to keep the internal references to the dom nodes even without the key attribute.

I haven't read their code so cannot verify the details, that is just my own interpretation of what the Inferno creator says.

paldepind commented 7 years ago

@dmitriz You're very welcome. Inferno is very cool. I don't know much about it. But, my understanding is that they compile JSX so that they know exactly where data is being inserted. Then that can diff data instead of diffing DOM.

In Turbine we don't even need to diff data since behaviors tells us exactly when they change. Also the Turbine approach doesn't rely on JSX and compilation.

trusktr commented 7 years ago

Yeah, thanks for that explanation! 👌 So diffing of some sort is naturally required, just that it is minimally exclusive to arrays of children. If not using key diffing, does Turbine end up doing something similar to React (shifting content around and only appending new elements) or does it do something else?

Also the Turbine approach doesn't rely on JSX and compilation.

That would be an interesting option to have, though I can live without it.

dmitriz commented 7 years ago

@paldepind

@dmitriz You're very welcome. Inferno is very cool. I don't know much about it. But, my understanding is that they compile JSX so that they know exactly where data is being inserted. Then that can diff data instead of diffing DOM.

Yes, they seem to directly reference the DOM nodes from their vnodes. So when a vnode changes, they know exactly where to patch without diffing. They also provide their inferno-hyperscript, so no need for the JSX part :)

In a way, it might not be that different from Turbine. As every component is composed from the element functions, the component tree plays the role of a vdom tree, and both run functions put direct references from the components/nodes to their dom elements.