krausest / js-framework-benchmark

A comparison of the performance of a few popular javascript frameworks
https://krausest.github.io/js-framework-benchmark/
Apache License 2.0

make vanillajs great again #403

Closed leeoniya closed 6 years ago

leeoniya commented 6 years ago

1.00 (besides rAF clearing)

ryansolid commented 6 years ago

I agree. Having a clear target isn't just good in itself; it helps library writers and test implementers see, at the core level, the techniques that best deal with the DOM. Admittedly some of the stuff is Chrome specific (and I'm not just talking about rAF), as each browser performs differently, but realistically all you can do is optimize for the platforms the tests run on.

One thing I will mention is that I develop on a lower-power machine than the benchmarks run on, and VanillaJS is consistently better there than any of the frameworks. It's even more noticeable when the CPU is being throttled in power-saving modes. The more powerful the computer, the closer to Vanilla I've found the libraries get. But I guess I wonder if the frameworks are getting so optimized that it's the tests themselves that need looking at again.

Looking purely at the numbers I'm guessing partialUpdate, replaceRows and swapRows are the areas Vanilla can improve the most. Surplus has partialUpdate pretty convincingly, ivi and inferno seem to have an advantage on replaceRows and attodom has swapRows dialed. I don't have anything to help swapRows but perhaps there are a few optimizations I can suggest to help the others.

I did come across a few things when working on Solid that I either JSPerf'd or found other libraries doing that seemed to help, and that VanillaJS doesn't do. Keep in mind I tested these independently and didn't do enough comparison tests in the context of the benchmark to say conclusively that they are better in all cases.

I hope that helps. (But maybe there's a reason these haven't been implemented already.)

leeoniya commented 6 years ago

But I guess I wonder if the frameworks are getting so optimized that it's the tests themselves that need looking at again.

i've wanted to toy with the idea of measuring JS execution time only and throwing away layout and paint times. this has the effect of showing a much wider, more realistic gulf than the current numbers show (where most of the time is dominated by layout).
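
As a rough illustration of the "JS only" idea (a hypothetical sketch, not the harness's actual code), the measurement could wrap the benchmark action with `performance.now()` so that the layout/paint work the browser does afterwards is excluded:

```js
// Hypothetical sketch: measure only the synchronous script time for an action,
// ignoring the layout/paint the browser performs after the handler returns.
// `createRows` stands in for whatever the implementation under test does.
function measureScriptOnly(createRows) {
  const t0 = performance.now();
  createRows(1000);              // synchronous JS + DOM mutation work only
  return performance.now() - t0; // layout and paint are not included
}
```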

The more powerful the computer, the closer to Vanilla I've found the libraries get.

i've noticed this too, working on a slower i5 lenovo thinkpad with HD4000 and Windows 10 x64.

at the end of the day, my goal is to have a single target to build, bench and compare against instead of spending a bunch of time and resources assembling the fastest metrics.

I hope that helps

thanks for itemizing these, feel free to PR them, too ;)

ryansolid commented 6 years ago

I think we are at a bit of a crossroads here. In one view, the libraries have gotten to a point where they are not terribly distinguishable from VanillaJS in the practical case. This is sort of a good thing. This benchmark has succeeded in outlining the areas where libraries can improve.

I think that leaves a few options.

  1. Leave it as is, since it's serving its purpose.

  2. Look for opportunities to increase the number of rows or the load on some tests. My concern there is it's not indicative of real world scenarios anymore. Certain optimizations work differently at different scales, and it gets really obvious when you move to, say, 30,000 rows. It might be enough to move some more cases off the 1,000 rows to a higher number.

I know it's near impossible to measure accurately, but continuous actions are one place where Vanilla can still destroy libraries. Ever seen the old Circles test? (Looks to be broken now.) Much copied and not terribly accurately measured (it only tracks the main-thread work). I made it time the whole loop instead (a rough sketch of that kind of loop timing follows this list), optimized the VanillaJS implementation, and bumped the circles to 300, and none of the frameworks could come particularly close; the ones that perform well there don't necessarily reflect the top partial-update performers here, although it helps. I actually built my library optimizing that case above all others. It doesn't really fit with the nature of these benchmarks, though, I think.

  3. Right now VanillaJS is written almost the way a data-driven library would be written; if it cut corners and just tried to do the task, it might be able to get a little more performance. I'm not sure, I'm just guessing. I like the readability of the current implementation and I'm not sure we should go too crazy here.
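
For the "time the whole loop" idea mentioned above, a hypothetical sketch (not the actual Circles code) could use requestAnimationFrame timestamps, so each cycle's layout/paint is included in the deltas rather than just the main-thread work:

```js
// Hypothetical sketch: time a continuous-update loop end to end. Each rAF
// timestamp includes the previous frame's layout/paint, so the deltas
// approximate the real cost per cycle. `updateCircles` is an assumed
// app-specific update function.
function runLoop(updateCircles, cycles = 300) {
  const times = [];
  let last;
  function frame(now) {
    if (last !== undefined) times.push(now - last);
    last = now;
    updateCircles();
    if (times.length < cycles) requestAnimationFrame(frame);
    else console.log('avg ms/cycle:', times.reduce((a, b) => a + b) / times.length);
  }
  requestAnimationFrame(frame);
}
```
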
leeoniya commented 6 years ago

Certain optimizations work differently at different scales, and it gets really obvious when you move to, say, 30,000 rows

The issue is, the more DOM you add, the more you drown out framework differences in browser time spent on rendering/layout/GC, etc. Framework differences are already statistically visible at 1,000 rows if looking at JS timings only. The clamping done by this benchmark doesn't do us any favors here, and i've argued for its removal or effect minimization in https://github.com/krausest/js-framework-benchmark/pull/335

I think it's silly to introduce additional load that skyrockets the bench runtime by impacting browser perf disproportionately. I would rather use the sum of 10 rounds of a cheap 1k bench than the average of 10 rounds of a 10k bench. I don't believe simply throwing more DOM at these problems is the way to go. For instance, I get pretty consistent/expected results with UIbench [1], though it doesn't test event binding or simulate event interactions.

[1] https://localvoid.github.io/uibench/

localvoid commented 6 years ago

Certain optimizations work differently at different scales, and it gets really obvious when you move to, say, 30,000 rows

Why would anyone want to optimize a library for >100k DOM nodes per page? Everything would be so slow just because there's an insane number of DOM nodes on the page that it wouldn't matter what framework is used; the application would be completely unusable.

I've tried to optimize ivi for real use cases where the ratio of DOM nodes per component is low and there are ~2k DOM nodes for a complex desktop app, plus another ~2k SVG nodes if there are also many charts. I don't see any value in optimizing performance for benchmarks like this.

localvoid commented 6 years ago

My concern there is it's not indicative of real world scenarios anymore.

What do you think about increasing the number of data bindings to 1 per DOM element instead of 3 bindings per 8 DOM elements? I just looked at the source code of some small web applications, and on average there are even more than 2 bindings per DOM element :)

ryansolid commented 6 years ago

Sorry, I meant that if we were to add more elements, my concern is that it wouldn't be indicative of real world scenarios. I don't think I was clear, as both of you took me to be saying the opposite.

I think the idea of bindings per element is interesting. I've found that bindings per element have been pretty variable depending on other concerns around layout. I will say that as CSS has gotten more flexible we've definitely seen a drop in elements. What hasn't really changed that much, though, I imagine, is bindings per row in a scenario like this. We are definitely a binding or two light per row. We're definitely missing hover states, which usually accompany every interactable affordance. There is something to be said about animations too, although I deal a lot more with image grids/layouts than data tables, and I don't imagine one usually fades rows in one by one, but maybe they do.

How to incorporate this sort of thing without adding too much distraction to the benchmarks, especially for the ones that interact with the rows directly, is interesting. Especially since handling the rapid succession of these small changes would benefit some approaches but ultimately muddle the results of the specific test. I think the libraries that would be most impacted in the creation step by adding these sorts of bindings would also be the best at handling them in the update, but as I mentioned before, the smoothness of this sort of continuous interaction might be hard to measure.
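
To make the "one more binding per row" idea concrete, here's a hypothetical sketch of what a hover-state binding could look like in a vanilla implementation (the `tbody` id and the `hovered` class are assumptions, not the benchmark's actual markup):

```js
// Hypothetical extra per-row binding: a delegated hover state. Two listeners
// on the table body cover every row, no matter how many rows exist.
const tbody = document.getElementById('tbody'); // assumed container id
tbody.addEventListener('mouseover', (e) => {
  const row = e.target.closest('tr');
  if (row) row.classList.add('hovered');
});
tbody.addEventListener('mouseout', (e) => {
  const row = e.target.closest('tr');
  if (row) row.classList.remove('hovered');
});
```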

============================================

I think it's pretty clear that swap rows is one of the most variable tests and the one VanillaJS struggles with. I'm not sure what it is exactly, but it does seem to consistently score lower than the libraries here. I think if we can address this, VanillaJS will at least be sitting out ahead mostly.

Freak613 commented 6 years ago

1) If you take a look at the performance tab for the Swap Rows test for any framework, from time to time you can see a gap between code execution and browser reflow. A lot of libraries do swap rows really fast but suffer from this random gap. Sometimes there is no gap, which I think means the code happened to hit some browser timing; it's unintended and uncontrollable behaviour. So what we see in the results for this test is mostly randomness rather than a measurement of something meaningful. I don't know what exactly Chrome is doing during that time, but triggering reflow manually (by reading some CSS props, for example) eliminates this gap, cuts the time for this test by as much as half, and produces stable results (see the sketch after point 2). It feels like a dirty hack, but otherwise this test can't be used to measure framework performance, for the above reasons.

2) To squeeze out some ms, disable logging for the tests; it shouldn't exist during testing and was only there for developer convenience in manual testing. Surplus, domc and maybe others have already done this.
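
A minimal sketch of the forced-reflow trick from point 1: reading a layout property such as `offsetHeight` makes Chrome perform the reflow synchronously, inside the measured handler, instead of at some later, uncontrolled point. (The swap logic here is only illustrative, not the benchmark's actual implementation.)

```js
// Hypothetical sketch: after mutating the DOM for swap rows, read a layout
// property to force a synchronous reflow rather than waiting for the browser
// to schedule it on its own.
function swapRows(tbody) {
  const rows = tbody.children;
  const a = rows[1];            // illustrative: swap the 2nd and 999th rows
  const b = rows[998];
  const afterB = b.nextSibling;
  tbody.insertBefore(b, a);
  tbody.insertBefore(a, afterB);
  void tbody.offsetHeight;      // force the reflow now (the "dirty hack")
}
```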

[screenshot, 2018-08-02 12:55]

Freak613 commented 6 years ago

A similar situation occurs with the Select Row test. For the vanilla implementation, the actual update with reflow takes ~3ms, while the test suite measures ~10ms on average (because the GPU waits for the next frame to render).

What happens here, I think, is that the browser delays reflow until the next animation frame. Therefore it's hard to get results under 16ms (because Chrome renders at 60fps max). Why does Select Row show 10ms? Because its update, including reflow, fits in one frame. But for Swap Rows, the reflow takes significant time and doesn't fit in one frame, so it takes two frames most of the time.

Only ivi performs the update in one frame because of batching, and it's not a trivial implementation to get into vanilla. And while adding batching or rAF to Swap Rows can sometimes win a few ms, it will add up to 16ms to every other test case.
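
For reference, a hypothetical sketch of the kind of rAF batching being described here: queue DOM work and flush it once per frame, so the update plus its reflow land in a single frame (with the caveat above that this can push other cases toward a full 16ms frame).

```js
// Hypothetical sketch of rAF batching: collect mutations and apply them all
// in one animation frame callback.
const queue = [];
let scheduled = false;

function schedule(mutation) {
  queue.push(mutation);
  if (!scheduled) {
    scheduled = true;
    requestAnimationFrame(() => {
      scheduled = false;
      const work = queue.splice(0, queue.length);
      for (const fn of work) fn(); // all DOM writes happen together
    });
  }
}

// usage (illustrative): schedule(() => tbody.insertBefore(rowB, rowA));
```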

So, sadly, it looks like optimizing such small cases is unprofitable until browsers have an unlocked framerate. It's strange that we have a locked 60fps limit. It's like playing a console game port while the hardware is able to deliver many more frames. We're trying to be faster than the browser supports.

Freak613 commented 6 years ago

Chrome has a --disable-gpu-vsync option, and running with it delivers >60 frames. But it behaves strangely in another way: calling the handler itself takes a crazy amount of time. Looking at the terminal, it generates some errors there, so I think that's the cause of the slowness.
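
For anyone who wants to experiment, a sketch of passing that flag through selenium-webdriver's Chrome options (assuming the usual Node selenium-webdriver setup; this is not necessarily how the benchmark driver is configured):

```js
// Hypothetical sketch: launch Chrome with vsync disabled for a test run.
const { Builder } = require('selenium-webdriver');
const chrome = require('selenium-webdriver/chrome');

const options = new chrome.Options().addArguments('--disable-gpu-vsync');

const driver = new Builder()
  .forBrowser('chrome')
  .setChromeOptions(options)
  .build();
```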

Maybe I wasn't right about the hardware, because even my MacBook has a 60Hz display. But for the sake of performance measuring it could be a viable option, if not for this handler slowness.

[screenshot, 2018-08-02 21:27]

leeoniya commented 6 years ago

ok, here's a question i asked a few days ago (no replies yet).

https://groups.google.com/forum/m/#!topic/chromedriver-users/kOpU-OXkHPk

as you can see, i'm trying to figure out how to extract the cumulative "summary" from the perf logs rather than simply trying to detect events and measure the delta in ms. this means that we can perform 10x iterations and simply collect the sums of rendering, layout, painting, js and gc.

if this was possible, i think it would be ideal, since we don't have to worry about weird gaps introduced by chromedriver overhead or Chrome's internal scheduler/coalescing. we can then ditch clamping and other oddities.
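
for what it's worth, a rough sketch of the kind of aggregation being suggested, built on chromedriver's performance log (a sketch under assumptions about the trace event structure, not the benchmark's actual parsing code):

```js
// Hypothetical sketch: sum trace-event durations by name instead of
// measuring deltas between individual events.
const { logging } = require('selenium-webdriver');

async function summarize(driver) {
  const entries = await driver.manage().logs().get(logging.Type.PERFORMANCE);
  const sums = {};
  for (const entry of entries) {
    const msg = JSON.parse(entry.message).message;
    if (msg.method !== 'Tracing.dataCollected') continue;
    const e = msg.params;                                  // one trace event
    if (typeof e.dur === 'number') {
      sums[e.name] = (sums[e.name] || 0) + e.dur / 1000;   // µs -> ms
    }
  }
  return sums; // e.g. sums for Layout, Paint, FunctionCall, GC events, ...
}
```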

i'll try summoning @paulirish again ¯\_(ツ)_/¯

krausest commented 6 years ago

This benchmark has always taken a user-centric view when measuring performance, i.e. wall time from the click event until repainting finished. That was motivated by an old aurelia version that would sometimes sit seemingly idle for a long stretch after a single click before repainting.

leeoniya commented 6 years ago

This benchmark has always taken a user-centric view when measuring performance

i can't help but see similarities between this argument and one often made about benchmarks in general: "many frameworks are fast in benchmarks but that's not representative of 'real world'". my followup has always been that frameworks which perform well on a sufficiently diverse set of microbenchmarks would also do well on whatever "real world" metrics were being implied. no one has really come out to prove otherwise. the opposite may very well be true (good in the real world, poor in benchmarks), but that's not too relevant here.

as for that odd aurelia scenario, do we know whether there are still cases when a framework/impl just sits there for 100ms seemingly idle, doing nothing? even if so, there could be a way to account for this too in a "summary" type of aggregate analysis which also considers total idle time.

imo, frameworks that perform well in summary would be the same ones that come out ahead with the current strategy.

LifeIsStrange commented 6 years ago

@Freak613 It should be noted that disabling vsync creates big tearing effects on screens, except on most modern screens (they support FreeSync), but currently only AMD hardware supports FreeSync, though Intel intends to support it too (and supports it on Kaby Lake G).

Also, most modern screens have a 75Hz refresh rate, and I really hope the Chrome guys are not dumb enough to use a fixed vsync instead of the screen refresh rate.

Mozilla ran an experiment but absurdly nipped the "feature" in the bud. Source: https://bugzilla.mozilla.org/show_bug.cgi?id=1370253

leeoniya commented 6 years ago

i consider this fixed by #472

LifeIsStrange commented 6 years ago

Optional request: on the web I've seen many claims that 'use strict' allows significant optimisations, and others claiming the contrary. It would be nice to just test with and without 'use strict' and show by what % performance is affected.
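
A hypothetical way to sanity-check that claim would be a micro-benchmark comparing the same function body in sloppy and strict mode (a sketch with all the usual micro-benchmark caveats; run it in a non-module script so the first function really is sloppy mode):

```js
// Hypothetical micro-benchmark: identical work, one function opts into strict mode.
function sumSloppy(n) {
  let total = 0;
  for (let i = 0; i < n; i++) total += i;
  return total;
}

function sumStrict(n) {
  'use strict';
  let total = 0;
  for (let i = 0; i < n; i++) total += i;
  return total;
}

function time(fn) {
  const t0 = performance.now();
  for (let i = 0; i < 1000; i++) fn(1e5);
  return (performance.now() - t0).toFixed(1) + ' ms';
}

console.log('sloppy:', time(sumSloppy), '| strict:', time(sumStrict));
```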

LifeIsStrange commented 6 years ago

Also, there could be an asm.js/emscripten version.

kurtextrem commented 6 years ago

@LifeIsStrange https://developer.mozilla.org/de/docs/Web/JavaScript/Reference/Strict_mode there is no need to claim stuff - it is a fact. When 'use strict' is missing, JavaScript engines can't make certain assumptions and have to turn off optimizations that aren't possible because of weird edge cases.

leeoniya commented 6 years ago

in my non-thorough, many-moons-ago testing, it made no difference. i tend to agree with this answer [1]:

Because Strict Mode is primarily a parse time feature of JavaScript, your favorite browser isn't going to show much of a performance decrease when Strict Mode is enabled for some website (e.g., SunSpider). That is, the performance degrade occurs before code is executed meaning it could be perceptible to end-users but is largely immeasurable using the Date object to measure block execution time

at the end of the day, the JIT will do what it does regardless of mode. it's just that strict mode prevents the author from writing things that the JIT will likely de-opt. but i think the same exact code will run the same with or without 'use strict'.

[1] https://stackoverflow.com/questions/3145966/is-strict-mode-more-performant/6103493#6103493