krausest / js-framework-benchmark

A comparison of the performance of a few popular javascript frameworks
https://krausest.github.io/js-framework-benchmark/
Apache License 2.0

should there be a metric that forces frameworks to do generalized list reconciliation? #1198

Open leeoniya opened 1 year ago

leeoniya commented 1 year ago

allow me to explain...

back in the day, when this benchmark only compared vdom-based frameworks, the "swap rows" metric was there specifically to trigger a pathological case that messed up inefficient list reconciliation, for example when diff algorithms only did rudimentary list prefix/suffix testing and fell back to mutating everything in the middle. this is why growing the swap distance in 2017 had a major impact on the standings.
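
for illustration, a minimal sketch (mine, not any framework's actual code) of the kind of prefix/suffix-only reconciler that swap was designed to punish:

// trim the common prefix and suffix of the old and new key arrays;
// whatever remains in the middle gets patched wholesale. swapping
// rows 2 and 998 leaves a ~997-item middle window for what is really
// a two-node move.
function trimCommon(oldKeys, newKeys) {
  let start = 0;
  let oldEnd = oldKeys.length - 1;
  let newEnd = newKeys.length - 1;
  while (start <= oldEnd && start <= newEnd && oldKeys[start] === newKeys[start]) start++;
  while (oldEnd >= start && newEnd >= start && oldKeys[oldEnd] === newKeys[newEnd]) {
    oldEnd--;
    newEnd--;
  }
  // [start..oldEnd] vs [start..newEnd] is the window a naive algorithm mutates
  return { start, oldEnd, newEnd };
}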

what we have today (i think) with the addition of fine-grained reactivity is a side-stepping of this intent, where the frameworks' store or proxy primitives map imperative mutations directly to DOM operations, bypassing the need to actually diff the whole list for changes, effectively behaving more like jQuery plus a thin layer of data<->dom binding in between. one such example:

https://github.com/krausest/js-framework-benchmark/blob/8dbeab1a814b9b3f714d7be245f6f42d3ee8d82c/frameworks/keyed/xania/src/table-store.ts#L126-L128

https://github.com/xania/view/blob/8bb5eb49641906710b0eb9c669072a8105986b64/packages/view/lib/core/list/list-source.ts#L188

remember when mikado had a dedicated swap() method? https://github.com/krausest/js-framework-benchmark/pull/654

in 99% of real applications, you're going to have a database and a backend with a json API that returns a list of objects or a structure that cannot be cheaply mapped by referential equality the way you can when everything just lives in JS memory. imagine an API call to the backend that returns a list where 80% of objects have the same id, 10% are missing, 5% are highlighted and 5% are re-ordered or new. you now need to run a generalized list diff algo to properly map and mutate the correct objects by some key/id. React and vdom frameworks handle this generically, and this is a case where you cannot directly write the pre-defined mutation logic into the app; it has to be inferred from the in-memory list and a new list of items from a `fetch('/list/query').then(r => r.json())` api call.
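
to make that concrete, a rough sketch of the generalized step this scenario forces (hypothetical shapes, illustration only):

// references from JSON.parse are always fresh, so the only way to reuse
// existing rows is to match old and new items by key/id
async function refresh(oldList) {
  const newList = await fetch('/list/query').then((r) => r.json());
  const oldById = new Map(oldList.map((item) => [item.id, item]));
  for (const item of newList) {
    const prev = oldById.get(item.id);
    if (prev === undefined) {
      // insertion: create and render a new row
    } else {
      // update-in-place: patch changed fields, keep the existing DOM node
      oldById.delete(item.id);
    }
  }
  // anything left in oldById was removed; newList's order dictates moves
  return newList;
}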

i'm not sure how many frameworks here do this imperative swapping; it's certainly not all. but maybe we need a metric that measures the above common case, where frameworks are forced to do full reconciliation of an unpredictable/complex set of mutations (re-ordering of keys, removal of keys, updating of items, and insertion of new items) from a plain js array received over a fetch/json API?

thoughts? @krausest @ryansolid @fabiospampinato

fabiospampinato commented 1 year ago

Maybe a flag could be added for when a framework just updates the values of the rows, skipping diffing the rows entirely? There might already be a check for this; I'm not sure.

FWIW both Voby and Solid explicitly create a new list that will be diffed at some point, which is actually cheaper than paying the price of making ids reactive too.

We are relying on referential equality, which speeds things up, but basically the same thing happens in React; the difference is that the "key" is explicit in React but implicit in Solid/Voby. A more costly kind of reconciliation of children basically doesn't exist in Solid/Voby; it's just that plus re-executing a bunch of computations when needed.
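
To illustrate the explicit-vs-implicit distinction (a sketch with placeholder names, not a benchmark implementation):

// React: the key is explicit
rows.map((row) => <Row key={row.id} row={row} />)

// Solid (Voby is analogous): the key is the object reference itself.
// <For> matches old and new arrays by identity, so putting the same row
// objects into a new array reuses their DOM nodes.
<For each={rows()}>{(row) => <Row row={row} />}</For>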

If you want to force frameworks to skip this optimization you can basically just append a row to the table, which is already tested for.

Potentially there's another case to consider, where row A is transformed into row B: keys shouldn't match but everything is still reused, which is maybe what you are alluding to. I think in order to do this in React one has to force keys to match anyway? Which maybe isn't always possible. In Solid/Voby this is possible too; you can basically get this behavior by swapping the For component for the ForValue component. This component isn't implemented in Solid, but it could be.

I don't know if adding a test for this would be practical; I guess the non-keyed part of the benchmark maybe allows for this already? In any case the reactive framework should probably still have the advantage here.

leeoniya commented 1 year ago

would you mind showing me what a Voby version of this would look like if you cannot touch the fetch functions which return JSON strings? in my view, this is what swap was really designed to test.

even if this can be accommodated, how will it behave for deeper/more complex data structures? how do those nested/deep observables get reconstituted?

https://codesandbox.io/s/interesting-kowalevski-qpmq39?file=/src/App.js

import { useState } from "react";

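// simulates the initial server response: a JSON string, as if fetched over the wire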
function fetchList() {
  return JSON.stringify(
    ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"].map((key) => ({
      key,
      text: key
    }))
  );
}

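// simulates a refetch: keys reordered, "e" and "h" removed, "z" inserted,
// and "a"/"b" get updated text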
function fetchListNew() {
  return JSON.stringify(
    ["c", "b", "d", "i", "f", "g", "z", "a", "j"].map((key) => ({
      key,
      text: key === "b" || key === "a" ? `${key}2` : key
    }))
  );
}

export default function App() {
  let selectedKey = "i";

  let [list, setList] = useState(() => {
    return JSON.parse(fetchList());
  });

  let updateList = () => {
    setList(JSON.parse(fetchListNew()));
  };

  return (
    <>
      <button type="button" onClick={updateList}>
        Fetch updated list!
      </button>
      <ul>
        {list.map((obj) => (
          <li
            key={obj.key}
            className={obj.key === selectedKey ? "selected" : ""}
          >
            {obj.text}
          </li>
        ))}
      </ul>
    </>
  );
}
localvoid commented 1 year ago

even if this can be accommodated, how will it behave for deeper/more complex data structures?

Since they are using a lossy reactive graph (signals lose all information about data mutations), diffing is the only solution for implementing incremental updates, and yet they market themselves as "no diffing" solutions :) In worst-case scenarios they are going to perform more diffing than solutions that use diffing algorithms at the UI layer.

Haven't looked at Voby, but usually such libraries provide some primitives for diffing data, or it can be implemented with a lot of memoized computations. E.g.:

function TableRow(state: Reactive<TableItemState>) {
  const id = computed(() => state().id);
  return (
    <tr class={computed(() => state().active ? "TableRow active" : "TableRow")} id={id}>
      <TableCell data={computed(() => "#" + id())} />
      <ListTrackByKeyAndWrapIntoReactiveValue data={computed(() => state().props)} keyFn={getCellId} render={TableCell} />
    </tr>
  );
}

In the example above, <TableRow> will receive a value wrapped in a reactive primitive that was created by diffing a list of TableItemState entries. Then it creates a lot of "lenses" that diff each value individually. This creates some sort of diffing pipeline, so it is possible to write from-scratch algos when working with complex data structures. Obviously it would be way easier and more efficient to just diff the data, but it is definitely possible to encapsulate this diffing logic in "components".

But what actually happens in practice is that a lot of inexperienced developers end up writing code that recreates entire DOM subtree on every change.

But to be fair, f(state) => UI libraries that promote the idea that everything should be optimized with Immutable Data Structures end up with even worse solutions, since it is no longer a simple f(state) => UI from-scratch algo: we need to "incrementally update" immutable data structures so that we can share immutable values.

There is just no one-size-fits-all solution in this problem space, and pretty much every complex component requires specialized algorithms: if I am going to build a spreadsheet component, the algorithms usually used for "signals" will be quite expensive; if I am going to build a rich text editor component, it would be better served by a diffing algo highly tuned for RTE constraints.

From my point of view, comparing f(state) => UI libraries with libraries like Voby in the same benchmark just doesn't make any sense; the tradeoffs are completely different.

fabiospampinato commented 1 year ago

would you mind showing me what a Voby version of this would look like if you cannot touch the fetch functions which return JSON strings? in my view, this is what swap was really designed to test.

@leeoniya Two possible implementations here. There's kind of a lot to say about this.

So the situation is fairly complicated. Lots of other details to consider also:

even if this can be accommodated, how will it behave for deeper/more complex data structures?

It would work basically the same way: granular updates with a store (which has many use cases unrelated to diffing, by the way) plus a reconcile function, preferably with special handling for these "key" properties if you are dealing with a lot of keyed data.

Since they are using a lossy reactive graph (signals lose all information about data mutations), diffing is the only solution for implementing incremental updates, and yet they market themselves as "no diffing" solutions :)

Well, if the problem inherently requires deep diffing then you must do deep diffing or you just can't solve it, but this requirement for deep diffing doesn't exist inside the framework itself, which is the crucial point here.

Like, this kind of diffing is just how React works, you can't separate the two. But you can simply not do this kind of diffing in Voby or Solid. The $.store.reconcile function is provided under the same package just for convenience, but it could be deleted and the framework would stay usable; you can't delete this deep diffing from something like React.

"Proof in case" my reconcile function was actually buggy with top-level arrays. In the app I work on, where the interface is written in Voby, I call $.store.reconcile a grand total of zero times, because it's useless for my use case.

But what actually happens in practice is that a lot of inexperienced developers end up writing code that recreates entire DOM subtree on every change.

I've said this a few times, and I think most people don't generally agree with it, but React is basically optimized to execute garbage code, and things like Solid or Voby are optimized to execute good code. If the code to execute is garbage, it's not hard to shoot yourself in the foot with reactivity, imo. It's much harder to completely mess up performance in React: the performance ceiling is lower, but to its credit the floor is higher.

I try to put in the care needed to not write garbage code, so I care more about having a higher ceiling than a higher floor.

There is just no one-size-fits-all solution

Sure. Other people may care more about a higher floor, or they may have to deal with immutable data, which is more naturally suited for a React-style approach, or something else.

if I am going to build a spreadsheet component, the algorithms usually used for "signals" will be quite expensive

I don't know about that; it'd be interesting to benchmark it. Update performance is usually where signals shine, but of course if the data you are working with is kind of adversarial, that's not the best-case scenario. Immutability and signal-based reactivity just don't go well together: it's a usable combination, but not optimal. Symmetrically, mutability and React-style reactivity don't go well together either.

fabiospampinato commented 1 year ago

It'd be interesting to have a "Create from JSON" test though, for this and other use cases. Since the benchmark would be able to provide the framework with many different kinds of JSON strings, it would also be easy to extend to a large degree without requiring changes in the individual implementations.

localvoid commented 1 year ago

but this requirement for deep diffing doesn't exist inside the framework itself, which is the crucial point here.

It doesn't exist when working with the vanilla DOM API either. Voby/Solid just map reactive values onto DOM properties, with the extra bonus of list diffing that can handle basic list mutations, but they don't provide an easy-to-use general-purpose solution for incremental DOM updates; that would've worked if the DOM tree were stateless. It is almost the same as writing incremental algorithms in vanilla JS that mutate the DOM tree directly, except that instead of mutating DOM nodes we mutate a reactive tree that is mapped onto the DOM tree. If that is enough for your applications, I believe you, but there are a lot of use cases, like data binning for rendering charts, or even more complex data transformations, that are much easier to solve with diffing.

fabiospampinato commented 1 year ago

It doesn't exist when working with the vanilla DOM API either. Voby/Solid just map reactive values onto DOM properties, with the extra bonus of list diffing that can handle basic list mutations, but they don't provide an easy-to-use general-purpose solution for incremental DOM updates

IMO that's a little dismissive. These libraries are way more convenient to use than writing some vanilla JS: they allow you to write your code in a declarative manner that's impossible to achieve with just basic vanilla code (without writing your own reactivity system, that is), with great performance out of the box, especially when signals are fully embraced, and with some kind of amazing optimizations that are applied for you for free. (Try doing what Solid's transform does by hand, your code would become unreadable; or try implementing ForValue, or Voby's template function. This stuff isn't easy to get right or make efficient even if you try.)

If that is enough for your applications, I believe you, but there are a lot of use cases, like data binning for rendering charts, or even more complex data transformations, that are much easier to solve with diffing.

Sure, I guess. I mean I explicitly say in Voby's readme that I basically don't care for most use cases, like building websites, SSR, hydration etc. I surely don't think Voby is the best option for everything. It's the best option for me.

leeoniya commented 1 year ago

Sure, I guess. I mean I explicitly say in Voby's readme that I basically don't care for most use cases, like building websites, SSR, hydration etc. I surely don't think Voby is the best option for everything. It's the best option for me.

even so, fetching JSON data over io, deserializing it and updating the UI from that is so fundamental to any app worth building in JS that omitting it from a benchmark about JS framework performance is 🤯. it's not some edge case -- it's the norm.

localvoid commented 1 year ago

they allow you to write your code in a declarative manner

Depends on what you mean by that. With Voby/Solid I can't just take app state that is implemented with signal primitives and write something similar to a "GROUP BY" SQL statement in UI components, unless we don't care about internal DOM state and it is ok to reinstantiate DOM nodes or map data onto different ones.
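
For concreteness, a sketch of the problem in Solid-flavored pseudocode (names assumed):

// every recomputation builds fresh group tuples, so group-level rendering
// sees brand-new references; an item that moves between groups cannot carry
// its DOM node (focus, selection, animation state) with it
const groups = createMemo(() => {
  const byCategory = new Map();
  for (const item of items()) {
    if (!byCategory.has(item.category)) byCategory.set(item.category, []);
    byCategory.get(item.category).push(item);
  }
  return [...byCategory.entries()]; // new [category, items[]] tuples every time
});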

fabiospampinato commented 1 year ago

Sure, I guess. I mean I explicitly say in Voby's readme that I basically don't care for most use cases, like building websites, SSR, hydration etc. I surely don't think Voby is the best option for everything. It's the best option for me.

even so, fetching JSON data over io, deserializing it and updating the UI from that is so fundamental to any app worth building in JS that omitting it from a benchmark about JS framework performance is 🤯. it's not some edge case -- it's the norm.

What's the best JS app you can think of? VSCode maybe? Do they need this? If they don't: how is that possible, given how supposedly fundamental this is? If the argument is that VSCode shouldn't have been written in JS to begin with, then I question that argument, given how VSCode, written in JS, basically destroyed Sublime, written in super-optimized C++ with a custom UI toolkit.

And to be clear, it's not like this is unusable or extremely slow in Voby or Solid; if it is, it's a bug that should be fixed. They might still be faster than React at this workload for all I know. I mean, it's not like there's some huge overhead here if you write correct code, but that correct code is harder to write (assuming one doesn't have to deal with the awkwardness of hooks), in this case, sure.

they allow you to write your code in a declarative manner

Depends on what you mean by that. With Voby/Solid I can't just take app state that is implemented with signal primitives and write something similar to a "GROUP BY" SQL statement in UI components, unless we don't care about internal DOM state and it is ok to reinstantiate DOM nodes or map data onto different ones.

I'm not sure what you mean exactly, but if reusing DOM nodes is what you are after, that should be something Voby/Solid, and probably other signal-based libraries, are good at.

If you have an actual little benchmark at hand I'd be happy to make an implementation for Voby so that we can take some actual measurements and see what the code looks like.

ryansolid commented 1 year ago

The avoiding-diffing-via-direct-change thing I think is only in a couple of libraries: Mikado, Xania, maybe Sinuous... I think there was one other. For the most part the reactive libraries do a shallow list diff here. Things like Swap, Append, Remove are no different from their VDOM counterparts. What is different is nested updates like Update Every 10th Row, as they don't re-run list reconciliation but use a different means to apply updates. Mind you, I'm not sure that will always be the case. There has been more research into passing along change information with the Signals, so diffing could be skipped even with more data-driven approaches.

Which brings me to my biggest gripe with the Mikado example: it wasn't that it did array operations but that it did so via a 1-for-1 proxy. Basically the benefit fell away as soon as the data served multiple things. Like, you couldn't have a data store that you were showing in two locations; each would have its own proxy that you were mutating. At that point the data isn't derivable, there is no flow; you might as well be moving the DOM elements yourself.

Admittedly this is an ergonomics/abstraction question more than anything else, because that is something perfectly valid to do. The Template became the data. Any reservations I had about this are purely from the perspective of a desire to keep those less coupled.

Now, my perspective on Swap Rows is not to test data snapshots. I picture something like a drag and drop, or even a header getting clicked to re-sort the rows: some sort of manual sort operation that involves changing the list. I'm not going to my database to swap rows. Historically a ton of benchmarks have been built around these data snapshots, everything from UIBench to DBMon. Deep diffing is the only way you will handle deep data from the server efficiently. I doubt we will get that in this benchmark, mind you, because as soon as a test like this appears, the solution changes to a shallow key-based diff (we have a component for that) and not much changes.

If you want to show this off you need data deeper than this benchmark has. At this point the recommendation for reactive libraries is similar to Solid's createStore + reconcile: you diff on set. Works pretty well. That's the approach I used in those other benchmarks. You know where the data is coming from, and you diff only for those specific cases, leaving the JSX, all the other setters, etc. untouched. The approach here, even with data stores, is that you don't want to lose information on mutation: mutate as close to the change as possible; failing that, diff. But you know you need to diff at that call site, because it's the one setting the whole blob instead of just sending in the specific changes. That way you can maintain granularity of change for all other cases.
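
For reference, the pattern being described looks roughly like this in Solid (the endpoint is hypothetical):

import { createStore, reconcile } from "solid-js/store";

const [state, setState] = createStore({ rows: [] });

async function refetch() {
  const rows = await fetch("/list/query").then((r) => r.json());
  // diff on set: reconcile() diffs the incoming snapshot against the current
  // store contents (matching items by "id") and only notifies the
  // fine-grained subscribers whose values actually changed
  setState("rows", reconcile(rows, { key: "id" }));
}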

ClassicOldSong commented 1 year ago

One opinion: forcing whole-list regeneration in the swap test will create the need for a new test that measures moving nodes/node trees around. Draggable programming environments like Scratch are a very good example of this kind of use case.

xania commented 1 year ago

@leeoniya I agree on the notion of moving the ownership of the data in the bench to an external source (file, JSON string, or a callback method like SCORM does it). It would be interesting to see how xania would perform with generalized list diffing (in the swap case).

However, incremental update is not some edge case. The fact that only xania and mikado (and maybe Sinuous) have this approach is because they can and the benchmark allows for it. Personally I don't like that other frameworks force you into generalized list diffing. If I know I am appending or even inserting a new row then I don't want the UI framework to fall back into unnecessary diffing when there is no need for that. Therefore xania provides a number of ListMutation(s) that you can dispatch to perform the corresponding update on the UI. I like to view this as a declarative approach; it only seems like imperative code because click events are an inevitable imperative starting point. Currently I am looking into cycle.js to see if we can adapt their approach to get a fully declarative/reactive pipeline, from event to updating the dom.

but again, a UI library should be able to handle the (in my opinion less common) use case where the data is externally updated after initial load.

leeoniya commented 1 year ago

in my opinion less common

e.g. at Grafana this is the norm, where we get saved dashboards/schemas as JSON stored in a sqlite DB, followed by data fetches which can result in a "repeat" of various panels from different datasource endpoints/plugins.

Personally I don't like that other frameworks force you into generalized list diffing. If I know I am appending or even inserting a new row then I don't want the UI framework to fall back into unnecessary diffing when there is no need for that.

i think these imperative-type APIs are mostly crutches that easily devolve into jQuery-type code, which defeats much of the purpose of the framework. @localvoid's work and results with ivi v3.0 prove (imo) that all that extra API surface and complexity only needs to exist if the framework itself is incapable of doing this efficiently. also, with simple/explicit sCU, ivi can actually go even faster: https://github.com/localvoid/ivi/issues/54#issuecomment-1500902768. but the current code is beautifully straightforward and generic, though i'd like to submit a PR that switches to ivi's more html-like tagged templates: https://github.com/localvoid/ivi#html-template-language

xania commented 1 year ago

e.g. at Grafana this is the norm

I can see why 'refetching' in Grafana fits its use case best. I don't remember ever having to build a dashboard; probably I never did in my >20y of programming. Maybe it's just me.

i think these imperative-type APIs

Xania is really 100% declarative. Let's remove all the noise and look at what xania does when e.g. adding new rows:

function append<T>(items: T[]) {
  return {
    type: "append",
    items
  };
}

So, this code defines the operation declaratively: the intent is never lost, and it is up to the runtime to decide how to perform it in the most efficient way, as Xania does.
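
A hypothetical consumer, just to make "up to the runtime" concrete (not xania's actual internals; appendRow is a placeholder):

// the runtime applies each precise mutation directly, no diffing needed
function applyMutation(listView, mutation) {
  switch (mutation.type) {
    case "append":
      for (const item of mutation.items) listView.appendRow(item);
      break;
    // "insert", "swap", "remove", ... each map to a targeted DOM operation
  }
}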

The discussion here is really about precise vs generic.

The pattern I like to use most of the time is: load data once, then dispatch update actions (directly or indirectly) to the server (with optimistic UI updates). The server then responds with clear messages that describe precisely what happened there, given the context of the initial request, to drive whether a revert/retry is needed. Actions make this overall easy to work with.

By contrast, "Something changed, good luck finding out what I changed" only makes things more complicated.

In Xania I am NOT looking for minimalistic abstractions; instead, I am looking for precise abstractions.

leeoniya commented 1 year ago

what you're describing can only be precise in simplistic cases, or only when the UI initiates the explicit user-directed intent, like drag/drop.

once you involve a server (not an edge case) that responds with json to various requests like "filter cars", then "filter red cars < $50k, sort by price asc", you end up without any precise instructions about dom re-arrangement and you have to do generalized diffing.

my point being that precise/observables only get you so far, while generalized gets you everywhere. my preference is to use a uniform path in every case if the performance is there. React and vdom are popular for this exact reason, since you don't need custom jquery code for append vs delete vs reverse. but react is slow, so people started making a case for fine-grained observables, but these have their own cognitive load and footguns (e.g. Solid replaces rules of hooks with rules of spreads, proxy wrapping, custom stores, etc.) and purpose-specific code structures (this discussion about the deserialized JSON cases requiring completely different approaches and worse performance characteristics).

anyways, it's my personal opinion/preference, not some kind of fundamental truth. different folks will prefer different tradeoffs.

all i can tell you with some level of certainty is that the swap metric was never about the swap itself. it was about triggering a slow vdom path in many vdom frameworks. so any bench implementation or framework that treats "swap" as some kind of explicit user-directed intent or store action is missing the point. that swap metric should effectively be treated as "randomize" or "arbitrary reorder".

xania commented 1 year ago

once you involve a server (not an edge case) that responds with json to various requests like "filter cars", then "filter red cars < $50k, sort by price asc", you end up without any precise instructions about dom re-arrangement and you have to do generalized diffing.

Also my opinion, but I value declaring the intent like "filter cars" explicitly in my domain instead of relying on generic solutions.

This is where I believe the confusion comes from. "filter cars" by itself is a precise instruction, in contrast to "well, I filtered the list, I know precisely how and I could tell you how, but I won't, go figure it out yourself given the total list".

xania commented 1 year ago

all i can tell you with some level of certainty is that the swap metric was never about the swap itself. it was about triggering a slow vdom path in many vdom frameworks. so any bench implementation or framework that treats "swap" as some kind of explicit user-directed intent or store action is missing the point. that swap metric should effectively be treated as "randomize" or "arbitrary reorder"

Totally agree on this point. At some point I will change my submission to treat swap as "arbitrary reorder". I may change other parts too, to comply better with the intent of the benchmark, not because it is wrong to be precise, generally speaking.

leeoniya commented 1 year ago

Also my opinion, but I value declaring the intent like "filter cars" explicitly in my domain instead of relying on generic solutions.

how would this work with arbitrary user-supplied filter combos and unpredictable server responses (due to constant inventory churn)? think of a kayak.com search built as an SPA that relies on json responses for constructing and sorting results. what precise action maps to "render search results while reusing/updating existing rendered results"?

xania commented 1 year ago

think of a kayak.com search built as an SPA that relies on json responses for constructing and sorting results. what precise action maps to "render search results while reusing/updating existing rendered results"

The question is: if you have a new list to render but the precise action that led to this new list is not known, then what do you do?

Well, then just reconcile the new list with the old list. My point about being precise includes being able to refetch and reconcile when that's precisely what you need to do. There are many levels of abstraction and perspective; none is correct when considered all by itself.