krausest / js-framework-benchmark

A comparison of the performance of a few popular javascript frameworks
https://krausest.github.io/js-framework-benchmark/
Apache License 2.0
6.62k stars 820 forks

unfair benchmarks vs real life tests #22

Closed boussou closed 7 years ago

boussou commented 8 years ago

I spent a couple of hours going over the latest additions to the benchmarks.

I noticed one thing that may be important: some frameworks have extra overhead the first time you interact with the DOM. You can observe this when you open the console and click the "Create 1,000 rows" button several times.

Some frameworks speed up by a factor of 5 on subsequent calls (the first call is very slow). Among them:

Others always show the same performance:

I am still wondering whether this is unfair or not. I think the current way of doing this test (inserting a lot of rows on a blank page) is close to a real-life test. Do you agree?

leeoniya commented 8 years ago

according to [1], the actual benchmarks are run after 5 warmup rounds (and don't rely on the console output but use the actual timeline numbers), which should be fair with respect to any JIT optimizations that need to hit hotspot call thresholds.

that being said, i think there's a lot more that can be done to ensure that all implementations adhere to the same rules to make the benchmarks more representative.

[1] https://github.com/krausest/js-framework-benchmark#about-the-benchmarks
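
for illustration only, a warmup-then-measure loop looks roughly like this (hypothetical bench helper, not the benchmark's actual driver):

function bench(run, warmup = 5, samples = 10) {
  // discard warmup rounds so JIT-compiled hot paths are measured,
  // not cold-start compilation cost
  for (let i = 0; i < warmup; i++) run();
  const times = [];
  for (let i = 0; i < samples; i++) {
    const t0 = performance.now();
    run();
    times.push(performance.now() - t0); // keep only the measured samples
  }
  return times;
}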

boussou commented 8 years ago

not sure what's behind the so-called warmup.

krausest commented 8 years ago

I hope the benchmark is not unfair. At least as far as I'm concerned, I'm not biased towards any framework. There are some benchmarks that measure hot performance and some that measure cold performance. The create benchmark just loads the page and measures the first click to create 1,000 rows; the replace all benchmark does the same but for the 5th click. Both numbers are equally interesting. If any framework is currently disadvantaged it's React v15, because of the addition of the "clear rows a 2nd time" benchmark, which happens to be a very bad edge case for React.

boussou commented 8 years ago

We all know it is very difficult to do fair benchmarks; that was the point of my remark.

Looking at your code, anyone can see you're not biased at all.

I tried to improve the Ractive version because I like that framework. It was a failed attempt, but that's when I observed those differences. Now I want to see a version with domvm.

Your idea (this benchmark) is IMHO very useful, closer to real-life usage than the dbmon tests (http://vuejs.github.io/js-repaint-perfs/).

krausest commented 8 years ago

I just found an issue for Ractive: https://github.com/ractivejs/ractive/issues/2366. If you want to try their suggestions and make a PR, you're very welcome!

boussou commented 8 years ago

Saw it. It informed me about the problem when removing a row: it's awful, sometimes it takes 10x as long as the slowest ;-(

That becomes obvious when you generate 10k elements: selecting an element is quite fast, but the delete takes ages.

Vue 2 and domvm perform far better in that respect (and better than the React flavors).

I tried their suggestions on my machine:

Doing both has an impact on performance (time divided by 3), but it is still very slow (and you lose the color of the selected row).

There is a PR that seems to solve this; let's wait for it.

https://github.com/ractivejs/ractive/pull/2372/commits

localvoid commented 7 years ago

I've looked at this benchmark. The main bottleneck for the fastest libraries is bootstrap.css; this CSS library is quite bad for dynamic web sites. Because of this, implementations that don't preserve internal state when they perform updates are faster, since they avoid triggering a long-running recalculate style in some cases. For example, when domvm removes a row, it actually removes the last one and updates the others :) Or in the swap rows use case, some libraries just update DOM nodes and don't swap them; if we had material buttons with ripple animations inside these rows, implementations that don't swap rows would be broken.
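
A rough sketch of the two removal strategies (hypothetical helpers and row layout, not domvm's actual code):

// keyed removal: detach exactly the <tr> that belongs to the removed item;
// with a heavy stylesheet like bootstrap.css this can trigger the
// long-running recalculate style described above
function removeRowKeyed(tbody, index) {
  tbody.removeChild(tbody.children[index]);
}

// non-keyed removal: rewrite every following row in place, then drop the
// last <tr>; each surviving node now shows a different item's data
function removeRowNonKeyed(tbody, data, index) {
  data.splice(index, 1);
  for (let i = index; i < data.length; i++) {
    // hypothetical row layout: the item's label lives in the first cell
    tbody.children[i].firstChild.textContent = data[i].label;
  }
  tbody.removeChild(tbody.lastElementChild);
}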

leeoniya commented 7 years ago

@localvoid to be fair, i specifically asked about what was required of libs [1], including row swapping (since i saw a lot of differences in impls), and was surprised by how much leeway there was. all the things you point out are valid, but are fair game for this bench, including apparently custom store implementations.

it seems that only the resulting dom matters and theoretical "if there were animations" cases do not. so feel free to ignore untested cases and only optimize for the final dom shape. in domvm's case it was faster to update the nodes in place for swap rows rather than doing a by-index lookup. domvm's impl doesn't track keys at all, it just stupidly reuses/mutates the nodes in place...which is why remove row does the massive update followed by removing the tail.

I think the requirements should be spelled out in the main readme.

[1] https://github.com/krausest/js-framework-benchmark/issues/9#issuecomment-232443783

trueadm commented 7 years ago

@krausest @leeoniya I feel that there needs to be a certain standard set for these benchmarks. Otherwise you get implementations that are built specifically for the benchmark – which is hardly an accurate representation of how libraries do in the real world. Maybe I'm alone with this opinion!

localvoid commented 7 years ago

to be fair, i specifically asked about what was required of libs

If you hadn't implemented it this way, I'd probably never have opened dev tools to investigate it a bit further :)

so feel free to ignore untested cases and only optimize for the final dom shape.

If we remove the unoptimized bootstrap.css, the "optimized" implementations will be significantly slower on use cases like "remove" :)

leeoniya commented 7 years ago

@localvoid yeah, the css should be as simple as possible

@trueadm no argument there, i said this from the get-go.

that being said, "only eventual dom shape" is a fair thing to expect in "real life cases" the vast majority of DOM manip does not need to account for animations or even carrying over transient dom state when "swapping rows". likewise, dbmon rows have a fixed layout and dont need to be tracked by keys, etc. so also work optimally with dumb in-place updates.

i do need to optimize domvm's internal by-key lookups though, maybe i'll steal you guys' [nicely commented] impls ;)...however in domvm the keys can be objects/models, so it would need Map/WeakMap support, which is lacking in IE9 (but can be polyfilled somewhat)
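
a minimal sketch of such a by-key index, assuming Map support (or a polyfill on IE9):

// index old vnodes by key so reconciliation can find the match for any
// new vnode in O(1). keys may be objects/models, hence Map rather than
// a plain object with string keys
function indexByKey(oldVNodes) {
  const byKey = new Map();
  for (const vnode of oldVNodes) {
    if (vnode.key != null) byKey.set(vnode.key, vnode);
  }
  return byKey;
}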

boussou commented 7 years ago

As @trueadm said, we need demos & benchmarks that keep the spirit of the frameworks – close to a representation of how libraries perform in the real world. From that point of view, I think we might consider reducing the number of tests?

The dbmon test contains implementations designed for the bench, so it has become useless IMHO. For example, the Marionette version is faster than the Backbone one, which is nonsense. If you look at the code, you'll discover that the former uses a requestAnimationFrame optimisation, something nobody uses when they need to maximize browser compat.

Here I am not happy with the vanilla versions; they should be the fastest. And there should be a jQuery version too IMHO (v3 with event delegation); I may find time to write it in August.

leeoniya commented 7 years ago

@boussou you end up with two types of benchmarks. one which keeps the spirit of the frameworks (descriptive) and allows each to create an optimal implementation given some spec for the behavior. and the other one that keeps the spirit of the benchmark (prescriptive) and insists that elements must be tracked by keys, row swapping needs to detach/move actual dom nodes, etc.

in one case the benchmark loses value in that frameworks can "cheat" and have optimized impls. in the other case you're forcing frameworks into sub-optimal patterns they're not geared for. one way is better for a pure benchmark/apples-to-apples comparison and the other way is better for "real life"/actual code a dev would write for a specific case. what is "fair"? is it a balance? is it either of the extremes?

for example, is "adding 1k rows to an already rendered 10k rows" a reasonable thing to test? i think not. certainly most people would paginate, infinite scroll/occlusion cull. this to me is not real life and being slow at it isnt a representative case but purely synthetic. synthetic tests are far from meaningless (efficiency matters), but tens of thousands of devs use React every day without perf issues :/

you could even take the position that unreasonable test cases justify unreasonable optimizations/cheating.

/philosophy

localvoid commented 7 years ago

In real life, users may also have browser extensions, which can have a huge impact on performance for some libraries, depending on the way they manipulate the DOM. Also, in real life there are many different browsers; for example, kivi uses some weird stuff to make it faster in webkit/blink (I only care about perf on popular mobile browsers), but this weird stuff makes everything a little bit slower in gecko :)

krausest commented 7 years ago

I'm open to suggestions if I understand their value. Replacing the CSS would certainly be viable if it helped to differentiate the frameworks, but I'd like to keep the table nicely formatted (e.g. alternating colors for rows). As for other suggestions, please make concrete proposals and I'll see what I can add to the spec.

leeoniya commented 7 years ago

@krausest you can keep the styling but throw away all unused styles; bootstrap is huge.

the dbmon bench was stripped down in this commit: https://github.com/mathieuancelin/js-repaint-perfs/commit/b5f53f05f6fd5101c7b6a4cb14c9710b21b6ad5a

EDIT: whether this should be done here is not clear since most devs will just include Bootstrap wholesale in 99% of real world cases.

localvoid commented 7 years ago

I think a comment that explains all these details would be enough; maybe someone who knows English much better than me can make a pull request and update the readme :)

Here are the huge differences between the "vanilla"/some other implementations and "Inferno"/"kivi"/etc.:

In all these cases, libraries that do "the right thing" trigger a long-running recalculate style; that is why they are significantly slower. If you try to do it the other way in React/kivi, there will be a warning message in the console in debug mode, because losing internal state like this is one of the most common errors.

Inferno/kivi perform diff/lifecycle calls in less than 5ms on 1k rows, and kivi performs roughly the minimum number of DOM operations (in the last patch I even implemented full-replace detection, removing all nodes with textContent = ""), so if there is a huge difference in results, it is probably because other libraries just perform completely different DOM operations.
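
The full-replace shortcut amounts to something like this (a simplified sketch, not kivi's actual source):

// when the diff detects that nothing from the old row list survives,
// wipe the container in one operation instead of issuing one
// removeChild call per row
function clearAllRows(tbody) {
  tbody.textContent = "";
}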

leeoniya commented 7 years ago

@localvoid

losing internal state like this is one of the most common errors.

you're assuming there is some internal state that must be maintained here. there isn't. and you cannot fault any libs for doing what they need to achieve the goal, which is a prescribed final dom state. the "right thing" in this case is ignoring state if it means doing things faster.

had this bench included transient state that had to be maintained (partially filled text inputs, css transitions, externally added classes, etc), then there would be no choice but for libs to invoke more expensive lookup/transplant logic.

this bench appears to only care about the final ui state. if kivi or inferno can be adjusted to do whatever is cheapest, no one is stopping you. the exact dom ops performed are irrelevant, as prescribing them would cause non-idiomatic code and possibly slower perf in other impls for no reason WRT this bench's goals.
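
to illustrate the kind of transient state meant here, a console sketch (assuming a row contains an <input> the user has typed into):

// transient dom state lives on the node instance. replacing a node
// instead of moving it drops that state:
const input = document.querySelector("tbody input");
input.focus();
input.value = "half-typed text";          // uncommitted user input
input.replaceWith(input.cloneNode(true)); // the value is cloned, but...
document.activeElement === document.body; // ...focus and caret are gone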

localvoid commented 7 years ago

@leeoniya

It has nothing to do with this benchmark; it's just one of the most common errors made in react-like libraries by average web developers, and that is why libraries like React force users to do "the right thing".

I agree that if your library supports such "optimizations", and if you are in a situation where you are forced to use a shitty CSS file in your project, then you can do any crazy stuff that makes your users happy. In kivi I can also disable this warning with a special method, but non-keyed updates are so boring; there is nothing to optimize :)

EDIT:

more expensive lookup/transplant logic.

The problem is that the "remove" case is only more expensive because of the shitty CSS; if we remove this CSS, the non-keyed variant will be waaay slower than those that do "the right thing".

EDIT2: And if we replace "swap rows" with a much more common use case, like moving one row from somewhere near the bottom to somewhere near the top or vice versa, the non-keyed variant will also be waaay slower than those that do "the right thing".

krausest commented 7 years ago

Once I had a version with a "delete next row" and "delete previous row" button in each row. I wanted to assert that the focus stays in the current row. Back then all frameworks did just that, so I didn't commit it. So kivi would lose the focus? Maybe I should add something like that again and add a rule to the benchmark...

leeoniya commented 7 years ago

in the current domvm impl, the focus would remain in the same physical DOM slot (which may now represent a different item). If you "delete previous", then it would be incorrect; if you "delete next" it should stay correct.

any impls that track by keys (likely most of them including kivi, i dunno) would stay correct in both cases. if you change the requirements to include this, then i'll update the domvm impl to work properly (it will be slower, which is why i chose the fastest impl that meets the requirements). again, anything not explicitly specified is fair game to optimize as authors see fit - this is why we're all asking for clearly defined rules.

localvoid commented 7 years ago

So kivi would lose the focus?

No.

Maybe I should add something like that again and add a rule to the benchmark...

There is no need to add something like that; it is easy to check in the dev tools profiler whether someone is actually moving nodes or just updating them. As @leeoniya said, right now this benchmark is biased towards implementations that don't "track by keys". I think it is possible to disable this feature in most of the implementations; some libraries just implicitly track by object identity. Implementations that swap rows in ~50ms are using track by keys.
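
For example, a quick console check (no profiler needed), assuming the benchmark's table markup:

// mark the first row, trigger "swap rows", then see whether the same
// element moved away or merely had its contents rewritten in place
const first = document.querySelector("tbody > tr");
first.dataset.probe = "1";
// ...click "swap rows" in the page, then:
document.querySelector("tbody > tr").dataset.probe;
// still "1"  -> contents rewritten in place (non-keyed)
// undefined  -> the node itself was moved (track by keys)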

leeoniya commented 7 years ago

@krausest have you given this any further consideration?

the perf table would be unrecognizable if it were to compare apples-to-apples. Have you tried disabling track-by-keys in the "slow" impls and vice versa?

krausest commented 7 years ago

@leeoniya Let's try to work it out. As a first step I'd like to simulate what a "track by keys" vanillajs implementation would look like. I think there are only two benchmarks that are slower for "track by key": swap rows and remove row. If the vanilla.js implementation actually moved the DOM nodes for both cases, I guess it would simulate "track by key". I've committed such a vanillajs-keyed implementation (swap rows and remove row). Results are here. The columns for vanillajs-keyed and vanillajs are placed a bit surprisingly. As expected, both benchmarks are slower for vanillajs-keyed, but swap rows is still sub 16 msecs and remove row is ~30% slower. Maybe you could confirm that this is a valid baseline for a track by keys implementation.
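
For reference, physically swapping two <tr> nodes takes two insertBefore calls; a minimal sketch of the idea (not necessarily the committed vanillajs-keyed code):

// swap rows a and b by actually moving the dom nodes
function swapNodes(a, b) {
  const parent = a.parentNode;
  // if a immediately precedes b, re-inserting a before b is a no-op,
  // so use a itself as the anchor for b
  const anchor = a.nextSibling === b ? a : a.nextSibling;
  parent.insertBefore(a, b);      // move a directly in front of b
  parent.insertBefore(b, anchor); // move b to where a used to be
}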

As the next step I'd suggest:

leeoniya commented 7 years ago

Maybe you could confirm that this is a valid baseline for a track by keys implementation.

Looking at the code, it looks ok; nice and imperative.

try (at least for the big frameworks) to run without track by key

this should be easy

decide if we find a good functional requirement to require track by key

usually this is important for maintaining external dom state that's not handled by the vdom lib, or for maintaining css animations/transitions by avoiding detaching. some examples:

[1] https://developer.mozilla.org/en-US/docs/Web/CSS/Pseudo-classes

trueadm commented 7 years ago

Actually, I'm going to suggest something extreme here: we build a better benchmark, one that we all design beforehand as a real-world scenario. @krausest you've done awesome work on this, but I'm also somewhat bored of having yet another table-cell rendering benchmark. It's probably the most unrealistic real-world use case – I mean, who renders 10k rows on a page?

May I suggest we in fact create a benchmark that generates a social network/news feed website. You could have a search box in the header that auto-completes as you type, a list row view for posts/news, sidebar entries, etc. Plus there's proper routing between pages (say 2-3 pages for the benchmark).

We could use react-hn as a reference point here.

Furthermore, the exploits/cheats from before would be useless in some areas here. A social network/news feed site is going to have things that need state or they literally blow up – anything with an animation, for instance; even images can completely break if you try to update them non-keyed.

The point is: if we build a benchmark that is literally a great real-world use case, hacks/cheats will be actual real benefits for once, rather than something to make the benchmark faster. Why? Well, because if it's making a real website faster, you're on to something. It's obvious.

There's no reason we can't all chip in on a project like this. I feel the current benchmark is already maxed out, and I think we can all contribute to a project that can really help push a decent benchmark that is used for many years to come.

Thoughts? @localvoid @leeoniya @krausest @boussou @gaearon

leeoniya commented 7 years ago

@trueadm

Something kinda like this already exists (minus a robust benchmark): look at ThreaditJS [1]. It's still mostly lists but covers a lot more (including server error handling, forcing you not to block the UI thread, ajax, talking to an API, markdown/escaping/raw html, etc).

It may be worth building on top of this to add some extra real-world features, like auto-complete, embedding external content, and interfacing with third-party libs like TinyMCE or markdown composers.

This is a worthwhile end goal but is high effort / high reward. As with most benchmarks, there's a set amount of time lib authors will spend to do a full implementation, and i think ThreaditJS is fairly close to that limit if you're careful to handle all the "real-world" cases it throws at you. Once a fast impl of an involved bench is made, there will be little incentive (other than lib tutorials) to use it as a benchmark. I will also point out that the differences between very fast and very very fast libs in "real world" cases will be minimal, since most "heavy" pages are < 5000 dom nodes (try document.querySelectorAll("*").length on your favorite bloated page).

We can of course recreate a much faster Facebook UI, and claim it's "real world", but I think that would discourage authors with limited resources/communities/contributors from dedicating a huge amount of time to show they can compete in this already crowded space.

I'm not necessarily against it, but getting this benchmark to be more apples-to-apples is both low effort and high reward.

cc @koglerjs

[1] http://threaditjs.com/

trueadm commented 7 years ago

@leeoniya I really don't want to benchmark lists though. We really need to get out of that frame of mind. I'd rather benchmark a mostly static website that looks like the BBC's than an AJAX/WebSocket/InfiniteList blah blah page of rows. There is literally no website I view, almost ever, that is composed only of rows of single text nodes – even Hacker News and Reddit are way more complex than that!

If it's the case that people can't be bothered to collaborate (with me) on creating a benchmark like this – it shows that there's a real lack of drive to create realistic benchmarks in the community. It's not even about time (this project may take a long time), but some enthusiasm would be good.

leeoniya commented 7 years ago

Here's the crux of the issue. There's "benchmarks", which intentionally stress-test even the fastest frameworks. And there's "ergonomics" or real-world demos. Given the performance we now have with the latest versions of many libs, "real world" demos will no longer sufficiently stress any of them. You may get situations of 5ms vs 15ms but no one cares if the 3x perf difference is indistinguishable to the user.

If you want something that's both real-world and a stress test, you have to really try hard. Recreating Google Sheets would do it because [to no one's surprise] it creates a fuckload of nodes. Rewriting CodeMirror or some other doc editor that has 30k nodes for syntax coloring/styling would also be a real-world stress test. But again, most of this is just adding complexity to a benchmark that can achieve the same without it, and it makes it very very hard to implement on a whim.

I really cannot think of something that is not a new idea or intentionally artificial that will act as a benchmark which provides more benefit than what exists today. In addition, "real world" these days needs to account for mobile rendering, so no matter how fast your framework is, you're gonna have to limit your dom complexity, minimize payload and reduce content in all cases where you don't want to fall over for non-framework reasons.

trueadm commented 7 years ago

@leeoniya I disagree, it's not about the number of DOM nodes at all. It's more to do with the implementations and optimisations. For example, we built an app with React; trying it with Inferno gave 40% faster page loads. We tried it with Preact and it was a little above 10% faster compared to React. We even tried Vue 2 and it was worse than React.

Furthermore, we refactored the codebase to use no classes, purely functional components, and a single state tree, and startup performance was ~500ms on an Android Nexus 5 compared to ~1100ms with classes and state in each component. I could preach about this all day long – I've been doing this stuff for such a long time and it's fun :) but it gets to a point where you say "huh?"

Also, the biggest performance win: monomorphism. It has such a big impact on Android, though not so much on iOS. DOM recycling on Safari and Chrome is immensely fast though – you just need to make sure you give each VNode a key; without keys everything can break very quickly. Unfortunately, DOM recycling can easily get you into trouble with iframe, audio, and custom elements. Inferno can thankfully deal with input/select/textarea elements properly, as they are controlled.

On that note: I'm surprised that other libraries don't try to control input/select/textarea – good luck when you get around to building real apps :)

leeoniya commented 7 years ago

@trueadm you basically just described the exact effect of switching this benchmark from one framework to another (or one impl to another) [1], but have added a bunch of extra complexity to prove the same point. (SSR concerns aside, but there's no reason SSR cannot also be tested by this bench.)

[1] https://rawgit.com/krausest/js-framework-benchmark/master/webdriver-ts/table.html

trueadm commented 7 years ago

@leeoniya I described what happens when a huge company like Tesco (which I work for) does real-world benchmarks with many teams involved. I already explained our findings in my last comment. If I were to tell you the financial loss from our page simply loading 1s slower on mobile devices, you'd be absolutely gobsmacked.

I'm suggesting we take such a scenario, where we are benching our actual application, and make a better "benchmark" from it. Authors will benefit massively from it. Users of the frameworks will benefit from it.

leeoniya commented 7 years ago

What you described is what was already known from this benchmark and a bunch of others, except you added a lot of extra [enterprisey] words to say the same thing. I understand that if you want to sell your vision/idea, this extra marketing fluff and "real world" corporate synergy are necessary. But they are quite unnecessary to demonstrate performance, and certainly unnecessary for devs to sweat over to prove their worth.

I agree that lib consumers and corporate entities need more than a todo list demo to be convinced to switch from one core technology to another. But it does not supplant the current [boring] benchmarks which are both easy to write and very effective. I have yet to see a lib that is fast on a broad range of boring benchmarks but inexplicably slow in real life.

What I don't want is for a fully-featured Facebook UI to become a necessary perf standard that "only" takes 2 weeks to MVP. Yes, such a thing is good to have to show off framework ergonomics, but nothing you've said makes me believe that it would be objectively more representative of performance than what we already have.

trueadm commented 7 years ago

@leeoniya We're on different wavelengths here. I think you lost me a few messages back haha. Maybe I'll just summarise my ideal benchmark:

- A benchmark that has more than 5 nested components building its UI.
- Using more than "click" events (such as onmousemove), properly using "aria" attributes, and ensuring tabindex is applied correctly as per accessibility guidelines.
- Ensuring stateful elements are dealt with properly, i.e. CSS effects, iframes, inputs.
- Ensuring components can render components with immediately nested component children (if you read @localvoid's tweets, only React supported this properly; all other vdom libraries failed, including domvm).
- Ensuring that event updates are handled in microtasks, etc. (see the sketch below).
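
On that last point, the intended behaviour is roughly the following (a minimal sketch; scheduleUpdate and rerender are hypothetical names, not any particular library's API):

let renderScheduled = false;
const pending = [];

function scheduleUpdate(update) {
  pending.push(update);
  if (renderScheduled) return;
  renderScheduled = true;
  // a microtask runs after the current event handler finishes,
  // but before the browser paints or handles the next task
  Promise.resolve().then(() => {
    renderScheduled = false;
    pending.splice(0).forEach(fn => fn()); // apply all queued state updates
    rerender(); // hypothetical: a single render for the whole batch
  });
}

So several synchronous scheduleUpdate calls from one click produce one re-render, not several.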

leeoniya commented 7 years ago

Ensuring components can render components with immediately nested component children (if you read @localvoid's tweets, only React supported this properly; all other vdom libraries failed, including domvm). Ensuring that event updates are handled in microtasks, etc.

These are implementation details and enforcing them on all frameworks is disingenuous. For instance, there are numerous useful features in domvm that are not supported cleanly (or at all) by other libs and framing the other libs as deficient because of it would be unfair. Since domvm is plain javascript, you can accomplish pretty much whatever you need with minimal additional contortions. It has just gone through a full internal rewrite and some of these concerns can be addressed in the near future if they become necessary.

Again, there are lib ergonomics (which matter to devs) and there is performance (which matters to users). Performance is objective and can be measured, but the means of getting that performance can vary by framework, impl architecture, and paradigm. We need to give authors enough freedom to write implementations as they see fit to achieve a specific spec. Catering a demo app to the unique features of a specific lib is not honest. How the internals are structured to get clean-room spec conformance is an orthogonal [and subjective] metric.

dfilatov commented 7 years ago

@trueadm

Ensuring components can render components with immediately nested component children (if you read @localvoid's tweets, only React supported this properly; all other vdom libraries failed, including domvm).

Not only React; check out Vidom, it handles this case easily: https://jsfiddle.net/dfilatov/0mpb6jpg/6/

leeoniya commented 7 years ago

@dfilatov i'm not sure what your example is demonstrating.

I think that statement means returning some form of fragment from a "wrapper" component (either of vnodes or sub-components), which means it itself has no dom representation. Something like this, where A, B, C are wrapping components with no dom roots of their own. @trueadm can clarify perhaps.

A
  B
     D
        <div>foo</div>
     E
        <div>bar</div>
  C
     F
        <div>baz</div>
     G
        <div>moo</div>

dfilatov commented 7 years ago

@leeoniya

i'm not sure what your example is demonstrating.

I've forked the preact fiddle (https://jsfiddle.net/localvoid/h84zogmc/), which demonstrates the broken behaviour, replaced preact with vidom, and everything worked the same as with the basic react fiddle (http://codepen.io/localvoid/pen/WojeKN).

Could you point out what I missed?

leeoniya commented 7 years ago

Could you point out what I missed?

This component only returns another component, while each of your components returns a dom root node.

class B extends React.Component {
    shouldComponentUpdate() {
        return false;
    }

    render() {
        return <A />;
    }
}
dfilatov commented 7 years ago

@leeoniya

https://jsfiddle.net/dfilatov/0mpb6jpg/6/ ok?

leeoniya commented 7 years ago

yeah, that looks legit (to my untrained eye) 👍

bonus points if you can return fragments from render(): [node('span'), node('div')] or [node(X), node(Y)]...and if those themselves can return fragments.

dfilatov commented 7 years ago

bonus points if you can return fragments from render(): [node('span'), node('div')] or [node(X), node(Y)]...and if those themselves can return fragments.

Vidom has a special fragment node (which doesn't produce an additional DOM element) for this purpose:

onRender() {
  return node('fragment').children([node('span'), node(X)]);
}

or with jsx:

onRender() {
  return (
    <fragment>
      <span/>
      <X/>
    </fragment>
  );
}

leeoniya commented 7 years ago

nice, i have some similar work pending [1]. anyways, let's get back on topic: benchmarks.

[1] https://github.com/leeoniya/domvm/tree/defineFragment

krausest commented 7 years ago

Let's do that. I created new versions for Angular 1, Angular 2, and React: one without track by or a key prop (let's call that nokey) and one that uses the index as the key (indexkey). Here's the code for indexkey:

Angular 1:

item in homeController.data track by $index

Angular 2:

*ngFor="let item of data; trackBy: itemById"
...
itemById(index: number, item: Data) {
    return index;
}

React:

let rows = this.state.store.data.map((d, i) => {
    return <Row key={i} data={d} onClick={this.select} onDelete={this.delete} styleClass={d.id === this.state.store.selected ? 'danger' : ''}></Row>;
});

Results are here. For Angular, nokey has no impact on performance (it then tracks by identity); for React, nokey and indexkey are equally fast.
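
For comparison, keying by identity instead of index in the React version would only change the key (a sketch, assuming the same Row component and data shape as above):

let rows = this.state.store.data.map((d) => {
    // key by the stable item id: rows move with their data when swapped
    return <Row key={d.id} data={d} onClick={this.select} onDelete={this.delete} styleClass={d.id === this.state.store.selected ? 'danger' : ''}></Row>;
});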

Some things are surprising:

Any comments so far?

annothernother commented 7 years ago

Thanks @leeoniya for tagging me, this is an interesting conversation.

A few thoughts:

There is literally no website I view, almost ever, that is composed only of rows of single text nodes

And there is literally no website that is not a tree, so there isn't really anything more "real world" than a comment thread.

Using more than "click" events (such as onmousemove), properly using "aria" attributes, and ensuring tabindex is applied correctly as per accessibility guidelines. Ensuring stateful elements are dealt with properly, i.e. CSS effects, iframes, inputs.

These are interesting ideas that I'll keep in mind.

As an example of what I'd love to have for ThreaditJS, something that could incorporate some of these checks: a script that times its runs against implementations and must succeed in order to 'pass' an implementation and move it to the official list.

If it's the case that people can't be bothered to collaborate (with me) on creating a benchmark like this – it shows that there's a real lack of drive to create realistic benchmarks in the community.

I'd suggest that your conception of "realistic" is in fact very particular.

From my bird's-eye view of VDOM magic, the benchmarks are just one part of what makes a good framework or library. I don't actually consider ThreaditJS a benchmark; its purpose is to provide a side-by-side comparison of SPA tech, of which benchmarks are just one component. My unsolved problems right now are in the direction of measuring developer overhead: the weight of scaffolding and build tools.

I was enthusiastic about ThreaditJS about a year ago, but benchmarks don't pay the bills. I plan on revisiting the project in January with a second round of frameworks and some assorted improvements, including to the limited benchmarking that currently exists. How much time will I have? I don't know, but while I appreciate your 1s gobsmacking concerns, I'm more concerned with numbers that demonstrate the clear difference between, say, Inferno and Ember.

(Which is another way of saying, from my point of view, one fast VDOM library is as good as any other.)

trueadm commented 7 years ago

@koglerjs it sounds like you're going on the defensive regarding my brainstorming (that's literally what it was). I wasn't being critical of ThreaditJS; in fact, I wasn't really talking about your work at all. I was talking about the state of benchmarks in JS land vs, say, GPUs. In GPU benchmarks they replay a battle/scene from the actual game. That's the sort of thing we should be aiming for.

This is a topic for debate over hangouts or in person, not really for a GitHub thread. There are so many interesting points people have raised - it would be good to flesh them out in a meetup.

annothernother commented 7 years ago

Oh I didn't take it as criticism of ThreaditJS! My thoughts on ThreaditJS got mixed in with my thoughts on benchmarks in general. People would be easily forgiven for thinking ThreaditJS was a dead project, so any criticism of it is almost moot anyway ;)

krausest commented 7 years ago

While I'd really love to see a realistic benchmark (and if it existed, I'd probably have spent my time on some other endeavor), I don't see any realistic chance of evolving this benchmark into one. On the other hand, I think this benchmark has value (not by looking at who's fastest, but by looking at who's really slow and who manages to improve), thus I'd like to make an incremental improvement. I created issue #89 for such a discussion and will leave this issue here for general discussion. @leeoniya, @trueadm, @localvoid: I'd love to hear your feedback.

solkimicreb commented 7 years ago

Hi, I am the author of NX. I am new to this community, but here are my 2 cents.

I think the frameworks should adapt to the use case and not the other way around. In this example that means supporting both tracked and non-tracked modes. Frameworks supporting non-tracked mode should naturally use it to perform better (I am surprised they haven't done so so far).

On the other hand, changing things now and then would be a pretty good thing. This repo helped me find and fix performance issues in my framework, and I guess others have had the same experience (based on the huge score improvements). I am sure that this change would reveal some other issues and would help me fix them. Occasional changes would also reveal which frameworks are speedy in general and whose authors put effort into fixing performance issues as they occur.

TL;DR: I think the current benchmark is not biased, but changes are always welcome to shake up the community a bit.

awerlang commented 7 years ago

There are two things I'd like to see in this benchmark:

I guess these two would help spot inefficiencies where performance really matters, and would help remove bias (the React sample uses a nested component with shouldComponentUpdate(); the Angular ones don't). Other than that, it could keep its simplicity.