Chrome 100 results are ... interesting

krausest commented 2 years ago

Results for Chrome 100 look pretty different. The fastest strategy for clear rows has changed and results for create rows are significantly slower. I think I have to take a closer look before publishing those results as official. If anyone finds the trick for clear rows please post it here ;-)

https://krausest.github.io/js-framework-benchmark/2022/table_chrome_100_preliminary.html

krausest commented 2 years ago

Looks like https://github.com/krausest/js-framework-benchmark/issues/166 Within a RAF call vanillajs clears in ~ 50 msecs.

krausest commented 2 years ago

Vanillajs can be improved significantly by wrapping create, replace and clear in RAF (left with RAF, right without). Not sure if I should commit this, as it'll certainly spread into benchmark implementations.

fabiospampinato commented 2 years ago

Personally I only see a bug in Chrome here. Simply wrapping clearing in a RAF is probably even a bug: if you clear and append fast enough, you won't see the appended rows. I don't see any reason to push frameworks toward this workaround for something that should probably be fixed in Chrome.
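The ordering hazard described above can be sketched without a browser. This is only an illustration: `fakeRaf` and `flushFrame` are hypothetical stand-ins for `requestAnimationFrame` and the browser's frame callbacks, and a plain array stands in for the `tbody`.

```javascript
// Hypothetical stand-in for requestAnimationFrame: callbacks are queued
// and only run when flushFrame() simulates the next animation frame.
const frameQueue = [];
const fakeRaf = (cb) => { frameQueue.push(cb); };
const flushFrame = () => {
  const callbacks = frameQueue.splice(0);
  for (const cb of callbacks) cb();
};

// A plain array stands in for the rows of the table body.
function makeTable() {
  const rows = [];
  return {
    rows,
    append(n) { for (let i = 0; i < n; i++) rows.push(`row ${rows.length + 1}`); },
    clearSync() { rows.length = 0; },
    // Deferred clear: the actual removal is postponed to the next frame.
    clearInRaf() { fakeRaf(() => { rows.length = 0; }); },
  };
}

// Synchronous clear followed by an append keeps the new rows.
const syncTable = makeTable();
syncTable.append(3);
syncTable.clearSync();
syncTable.append(2);
flushFrame();
const rowsAfterSyncClear = syncTable.rows.length;

// Deferred clear followed by an append: the clear runs later and also
// wipes the rows that were appended after the clear was requested.
const rafTable = makeTable();
rafTable.append(3);
rafTable.clearInRaf();
rafTable.append(2);
flushFrame();
const rowsAfterRafClear = rafTable.rows.length;

console.log({ rowsAfterSyncClear, rowsAfterRafClear });
```

In this simulation the synchronous variant ends with the two appended rows, while the deferred variant ends with none, which is the behaviour the comment warns about.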

ryansolid commented 2 years ago

Yeah RAF has always been slightly faster and it has varied at different times. We made a mark for it because it clearly was gaming it. That being said there are a couple of frameworks that have lower scores on clear and I'm not sure in all cases it is just RAF. I looked at doohtml and didn't see an obvious RAF call.

Also worth noting is slight hit on replace rows. Most frameworks have taken a slight hit. sifr interestingly seems to be good on both these so I might try to see what the difference is.

krausest commented 2 years ago

Well, doohtml's implementation is particularly clear: https://github.com/krausest/js-framework-benchmark/blob/1e01af18b564bbd2df7386d40cea4d9dbe6db4c3/frameworks/keyed/doohtml/js/Main.class.js#L121 It's actually what vanillajs does https://github.com/krausest/js-framework-benchmark/blob/1e01af18b564bbd2df7386d40cea4d9dbe6db4c3/frameworks/keyed/vanillajs/src/Main.js#L228 How come doohtml is 24 msecs faster? I'll try to profile this weekend...

krausest commented 2 years ago

I actually have doubts that the webdriver measurement is correct. First I removed the CPU slowdown. The webdriver results still show that doohtml is faster than vanillajs:

result vanillajs-keyed_09_clear1k_x8.json min 8.897 max 11.889 mean 10.4755 median 10.78 stddev 0.9645715514039265
result doohtml-keyed_09_clear1k_x8.json min 6.334 max 7.056 mean 6.6656 median 6.6165 stddev 0.3000385901105982

I prepared a branch for puppeteer. Here's how the results look there (the measurement isn't strictly comparable since it closes the browser after every benchmark loop):

result vanillajs-keyed_09_clear1k_x8.json min 7.516 max 10.225 mean 8.3141 median 8.1485 stddev 0.84825709546104
result doohtml-keyed_09_clear1k_x8.json min 6.162 max 11.239 mean 7.912800000000002 median 6.6165 stddev 2.207555299420606

doohtml is still a bit faster, but way closer than with webdriver.

If I profile with chrome I can't see any difference between both. I chose a 4x CPU slowdown: vanillajs: Screenshot from 2022-04-02 13-09-56 doohtml

And I think that's how it should be. Both clear tbody.textContent. I changed vanillajs to clear textContent only and skip all other logic, but it didn't change the results.

krausest commented 2 years ago

This was nastier than I thought. I couldn't find the reason for the numbers reported by chromedriver and chrome's profiler showed a different picture.

Some time ago I prepared an alternative benchmark runner that uses puppeteer, but I didn't finalize the switch. But I'm planning to do it now. The biggest advantage is that it produces traces that can be loaded in the chrome profiler. This makes it much easier to check the results.

The first candidate can be seen here: puppeteer results

And here's an excerpt (left new puppeteer results, right old webdriver results): Screenshot from 2022-04-04 20-46-34

One large difference can be seen for doohtml and the create benchmark. Let's take a look at a trace from puppeteer: Screenshot from 2022-04-04 21-13-09 This trace shows (at least together with the others) that a median of 86 msecs appears to be plausible.

But there was another important change to get there. Some frameworks got worse after switching to puppeteer. The traces looked like that: Screenshot from 2022-04-04 21-19-50 Here we see that after performing the create operation a timer fires which causes a redraw. The old logic was to compute the duration from the start of the click event to the last paint event. And this would result in a value of 180 msecs for doohtml! And I guess a similar issue happened for the old results.

How can we measure the duration better? I propose the following: For all benchmarks except select row we compute the duration like that: from the start of the click event to the end of the first paint after the last layout event (usually there's just one layout event). Cases with multiple layout events happen e.g. for aurelia (replace rows): Screenshot from 2022-04-04 21-45-00 There's a first layout event after the click event and a second after the actual dom update happened. Or for ember (create rows): There's a first layout event right after the click before a timer event that appears to cause the real redraw.
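The proposed rule can be sketched as a small function. This is a rough illustration, not the benchmark driver's code: the event shape `{ type, ts, dur }` (microseconds), the function name and the sample numbers are all assumptions made up for this example, and "after the last layout" is interpreted here as "starting at or after the end of the last layout event".

```javascript
// Simplified trace entries: { type, ts, dur }, with ts and dur in microseconds.
// Returns the time from the start of the click to the end of the first
// paint that follows the last layout event, in milliseconds.
function durationAfterLastLayout(events) {
  const sorted = [...events].sort((a, b) => a.ts - b.ts);
  const click = sorted.find((e) => e.type === 'click');
  if (!click) throw new Error('no click event in trace');

  const layouts = sorted.filter((e) => e.type === 'layout' && e.ts >= click.ts);
  if (layouts.length === 0) throw new Error('no layout event after click');
  const lastLayout = layouts[layouts.length - 1];
  const layoutEnd = lastLayout.ts + lastLayout.dur;

  const paint = sorted.find((e) => e.type === 'paint' && e.ts >= layoutEnd);
  if (!paint) throw new Error('no paint after the last layout');
  return (paint.ts + paint.dur - click.ts) / 1000;
}

// Made-up trace with two layout events, similar in shape to the
// aurelia and ember cases: the paint between the layouts is skipped.
const exampleEvents = [
  { type: 'click', ts: 1000, dur: 500 },
  { type: 'layout', ts: 2000, dur: 300 },
  { type: 'paint', ts: 3000, dur: 200 },
  { type: 'layout', ts: 60000, dur: 4000 },
  { type: 'paint', ts: 65000, dur: 1000 },
];

console.log(durationAfterLastLayout(exampleEvents));
```

With a single layout event the function simply returns the time to the first paint after it, so a later, unrelated paint no longer inflates the result.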

For select row no layout event should happen. There are cases with multiple paint events, e.g. for fre: Screenshot from 2022-04-04 22-06-24 The first paint event happens about 10 msecs after the click but the second one (after 417 msecs) is the correct paint event for our duration. Or glimmer: Or marko - which is interesting, since it happens only sometimes for marko:

There are even frameworks that happen to have a layout event (and three paint events) for select rows like reflex-dom: Screenshot from 2022-04-04 22-13-43

So for select rows I'm staying with the old logic. The duration is from the start of the click event to the end of the last paint event. Please note that this leaves the issue open that we solved above for the other benchmarks, but I have no other idea how to measure the duration better if there's no layout event.

@hman61 Sorry, I'm afraid to say that it currently seems like I measured doohtml incorrectly! Do you have an idea what's causing the second paint in traces like these? Is it the test driver or are there any timers fired in doohtml? I think I've never seen them in the chrome profile so it may be the test driver. Screenshot from 2022-04-04 23-10-40

krausest commented 2 years ago

I ported the logic back to the webdriver to get a better understanding of the differences between webdriver and puppeteer results.

A few questions arise when comparing puppeteer and webdriver.

1. Why is doohtml much faster for create rows with puppeteer?
2. Why is svelte much faster for replace rows with puppeteer?
3. What's the reason for most frameworks to be faster for replace rows with puppeteer (vanilla, solid, vue)?
4. Is puppeteer slower for select row for react?
5. Why are vanillajs and solid much faster for clear rows with puppeteer?

krausest commented 2 years ago

1. Why is doohtml much faster for create rows with puppeteer?

For svelte the new logic returns the same result for create rows, since there's just a single paint event.

For doohtml there's a difference:

more than one paint event found
paints event {
  type: 'paint',
  ts: 181465296692,
  dur: 629,
  end: 181465297321,
  evt: '{"method":"Tracing.dataCollected","params":{"args":{"data":{"clip":[-16777215,-16777215,16777216,-16777215,16777216,16777216,-16777215,16777216],"frame":"F318F3FB0BBBC36CD591D24FEF975201","layerId":0,"nodeId":212}},"cat":"devtools.timeline,rail","dur":629,"name":"Paint","ph":"X","pid":1938853,"tdur":622,"tid":1,"ts":181465296692,"tts":880680}}'
} 88.744
paints event {
  type: 'paint',
  ts: 181465333324,
  dur: 2633,
  end: 181465335957,
  evt: '{"method":"Tracing.dataCollected","params":{"args":{"data":{"clip":[-16777215,-16777215,16777216,-16777215,16777216,16777216,-16777215,16777216],"frame":"F318F3FB0BBBC36CD591D24FEF975201","layerId":0,"nodeId":212}},"cat":"devtools.timeline,rail","dur":2633,"name":"Paint","ph":"X","pid":1938853,"tdur":2622,"tid":1,"ts":181465333324,"tts":907592}}'
} 127.38
WARNING: New and old measurement code return different values. New  88.744 old 127.38

The mean for doohtml with the new logic is 86.45, i.e. comparable to the puppeteer result. Seems like the old idea "measure from click till last paint" didn't work right for this framework, since there are two paint events.
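The two values in the warning can be reproduced from the paint events in the log. A small sketch under one assumption: the click timestamp is not printed in the log, so it is back-computed here from the reported 88.744 msecs and is therefore derived, not measured.

```javascript
// Paint events copied from the doohtml log (timestamps in microseconds).
const paints = [
  { ts: 181465296692, dur: 629 },
  { ts: 181465333324, dur: 2633 },
];

// Not printed in the log: derived as end of the first paint minus 88.744 ms.
const clickStart = 181465297321 - 88744;

// Time from the start of the click to the end of a paint, in milliseconds.
const msSinceClick = (paint) => (paint.ts + paint.dur - clickStart) / 1000;

const newValue = msSinceClick(paints[0]);                 // first paint
const oldValue = msSinceClick(paints[paints.length - 1]); // last paint

// The gap between both results depends only on the two paint end times.
const gapMs = (paints[1].ts + paints[1].dur - (paints[0].ts + paints[0].dur)) / 1000;

console.log({ newValue, oldValue, gapMs });
```

The gap between the end of the first and the end of the second paint is about 38.6 msecs, which is exactly the difference between the new and the old value.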

ryansolid commented 2 years ago

Yeah it's interesting DooHTML does very well in the webdriver in areas you wouldn't expect. On average Svelte is a tiny bit better on puppeteer in relation to others but most things do stay in line so to speak other than those 2 (and React's slow select row).

As for 5. I think other than DooHTML which is a bit of an enigma, VanillaJS and SolidJS are the only 2 doing el.textContent = "" to clear. Everything stays in line on clear but I think the difference is more emphasized in puppeteer.

krausest commented 2 years ago

2. Why is svelte much faster for replace rows with puppeteer?

That's somehow different. The old and the new logic return the same result for svelte for replace rows since there's just a single paint event and that event takes about 124 msecs.

The puppeteer trace also looks completely unsuspicious: Screenshot from 2022-04-05 20-47-54

I have currently no idea where that large difference comes from.

When I perform replace rows manually and profile the 5th create rows I'm getting that profile in chrome: Screenshot from 2022-04-05 20-53-32 Those numbers look closer to the webdriver results.

So I came back to the last resort: Adding some client side hack for measuring the duration ( https://github.com/krausest/js-framework-benchmark/commit/89b546fe1b772e0f9f0b94da81b83513b6759e4b ). The result is [105, 99.30, 94.30, 97.60, 95.40, 115.10, 111.7, 102.10, 94.80, 87.10, 94.60, 97.9] for puppeteer and [115.80, 145.10, 132, 141.10, 145.30, 133.60, 132.9, 131.9, 138.40, 134.20, 119.70, 139.30] for webdriver. This confirms that the two benchmark drivers observe different performance. Running manually in chrome gives me numbers close to webdriver.
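To put a number on the difference between the two lists, here is a small summary helper. It is only a sketch: `summarize` is a name made up for this example, and the second webdriver value is read as 145.10.

```javascript
// Basic summary statistics for a list of durations in milliseconds.
function summarize(samples) {
  const sorted = [...samples].sort((a, b) => a - b);
  const n = sorted.length;
  const mean = sorted.reduce((sum, x) => sum + x, 0) / n;
  const mid = Math.floor(n / 2);
  const median = n % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  return { n, min: sorted[0], max: sorted[n - 1], mean, median };
}

// Client side measurements quoted above.
const puppeteerRuns = [105, 99.3, 94.3, 97.6, 95.4, 115.1, 111.7, 102.1, 94.8, 87.1, 94.6, 97.9];
const webdriverRuns = [115.8, 145.1, 132, 141.1, 145.3, 133.6, 132.9, 131.9, 138.4, 134.2, 119.7, 139.3];

const puppeteerStats = summarize(puppeteerRuns);
const webdriverStats = summarize(webdriverRuns);

console.log('puppeteer', puppeteerStats);
console.log('webdriver', webdriverStats);
```

For these 12 samples each, the mean comes out at roughly 99.6 msecs for puppeteer and 134.1 msecs for webdriver, and the slowest puppeteer run (115.1) is still faster than the fastest webdriver run (115.8).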

[Updated 4/9/22] I'm getting the fast results with a very small list of categories like 'devtools.timeline,disabled-by-default-devtools.timeline'. Those categories are enough to compute the duration from the timeline and show event names. If I enable full tracing with many categories (and it seems like disabled-by-default-devtools.timeline.stack is the one that causes the slowdown) I'm getting much slower results, closer to webdriver-ts (but this is not an explanation, since the webdriver benchmark driver itself also uses just the small category list). What's interesting is what the client hack reveals:

BROWSER: JSHandle:run took 94.79999999701977
BROWSER: JSHandle:run took 96.70000000298023
BROWSER: JSHandle:run took 106.59999999403954
BROWSER: JSHandle:run took 84.90000000596046
BROWSER: JSHandle:run took 93.30000001192093
after initialized  02_replace1k 0 svelte
runBenchmark
BROWSER: JSHandle:run took 111.6000000089407

The replace row operation invokes create rows and then 5 warm up create rows (which replace the content) before the 6th create row is called. Only the duration of this 6th operation is counted. The 5 warm up invocations are all faster than the final one, and I suppose this is due to chrome's profiler, which is active only for that last call. With the small list of enabled categories the last run is in the same range as the others:

BROWSER: JSHandle:run took 95
BROWSER: JSHandle:run took 99.5
BROWSER: JSHandle:run took 93
BROWSER: JSHandle:run took 95.30000001192093
BROWSER: JSHandle:run took 99.29999999701977
after initialized  02_replace1k 0 svelte
runBenchmark
BROWSER: JSHandle:run took 97.20000000298023
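One way to quantify the effect is to compare the traced final run with the mean of the runs before it. A small sketch using the svelte numbers from the two logs (rounded to one decimal); the helper name `tracedVsWarmup` is made up for this example.

```javascript
const mean = (xs) => xs.reduce((sum, x) => sum + x, 0) / xs.length;

// A ratio above 1 means the traced final run was slower than the warm up runs.
const tracedVsWarmup = ({ warmups, traced }) => traced / mean(warmups);

// svelte, 02_replace1k: first log (full tracing), second log (minimal categories).
const fullTracing = { warmups: [94.8, 96.7, 106.6, 84.9, 93.3], traced: 111.6 };
const minimalTracing = { warmups: [95, 99.5, 93, 95.3, 99.3], traced: 97.2 };

const fullRatio = tracedVsWarmup(fullTracing);
const minimalRatio = tracedVsWarmup(minimalTracing);

console.log(fullRatio.toFixed(2), minimalRatio.toFixed(2));
```

With these numbers the traced run is about 17% slower than the warm up runs under full tracing, but only about 1% slower with the minimal category list.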

It seems like the impact of full profiling is much less for the other frameworks. Here's vanillajs and there's hardly a difference for the final run:

BROWSER: JSHandle:run took 85.40000000596046
BROWSER: JSHandle:run took 87.79999999701977
BROWSER: JSHandle:run took 79.59999999403954
BROWSER: JSHandle:run took 96.79999999701977
BROWSER: JSHandle:run took 79.90000000596046
after initialized  02_replace1k 0 vanillajs
runBenchmark
BROWSER: JSHandle:run took 85.5

or react hooks:

BROWSER: JSHandle:RUN took 107.6000000089407
BROWSER: JSHandle:RUN took 133
BROWSER: JSHandle:RUN took 93.20000000298023
BROWSER: JSHandle:RUN took 96.20000000298023
BROWSER: JSHandle:RUN took 99.29999999701977
after initialized  02_replace1k 0 react-hooks
runBenchmark
BROWSER: JSHandle:RUN took 95.09999999403954

=> So if using puppeteer we should use the smallest list of enabled trace categories.
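A minimal sketch of what restricting the trace categories could look like with Puppeteer's `page.tracing` API. The function name, selector and trace path are placeholders for illustration, and this is not the benchmark driver's actual code; a real driver would also have to wait until the resulting paint has happened before stopping the trace.

```javascript
// Category list taken from the comment above.
const MINIMAL_TRACE_CATEGORIES = [
  'devtools.timeline',
  'disabled-by-default-devtools.timeline',
];

// `page` is expected to be a Puppeteer Page. Passing `categories`
// replaces Puppeteer's default category list for this trace.
async function traceClick(page, selector, tracePath) {
  await page.tracing.start({
    path: tracePath,
    categories: MINIMAL_TRACE_CATEGORIES,
  });
  await page.click(selector);
  await page.tracing.stop();
}
```

It could be called as `await traceClick(page, '#run', 'trace.json')`, and the resulting file can then be loaded into the chrome profiler.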

krausest commented 2 years ago

[Updated 4/9/22]

3. What's the reason for most frameworks to be faster for replace rows with puppeteer (vanilla, solid, vue)?

When I first looked into it I ran into the usual trap: I used my default chrome with extensions (1password, ublock) enabled. I really should remember to always use incognito mode for all profiling!

Vanillajs then gives me results that look close to the puppeteer results: Screenshot from 2022-04-09 10-09-18 Most replace rows tasks were close to 83 msecs.

=> Seems like puppeteer results are ok.

krausest commented 2 years ago

[Updated 4/9/22] 4. Is puppeteer slower for select row for react?

It seems like this was also caused by enabling many tracing categories. With the minimal trace category list it looks pretty similar.

webdriver react-hooks
mean 54.8273 median 55.2395
mean 53.99490000000001 median 53.8345

puppeteer react-hooks
mean 52.32250000000001 median 52.4035 stddev 2.6654604459434195
mean 53.0296 median 52.593500000000006 stddev 2.7626347810257763

krausest commented 2 years ago

5. Why are vanillajs and solid much faster for clear rows with puppeteer?

As for clear rows there is just 1 paint event for vanillajs and solid, so the new logic can't help. The client side hack also reports numbers in the 70 msecs range.

krausest commented 2 years ago

I applied the new logic (duration = time span between start of click and last paint for select row benchmark, time span between start of click and first paint after a layout event for all benchmarks but select row) to the webdriver client.

Results differ for create 1k, create 10k, remove and select for a few frameworks.

First I took a look at select rows benchmark. Is the above logic (click until last paint) reasonable?

Most frameworks have only a single paint event and thus are fine.

The following frameworks have more than one paint event: fre, glimmer, glimmer-2, maquette, marko, miso, react-recoil, reflex-dom.

Let's take a closer look (CPU slowdown is 1 for the screenshots): Most frameworks report the first paint after about 1 msec and the second paint event much later. Like glimmer: (first zoomed in to show the first paint event) Screenshot from 2022-04-06 21-36-08 glimmer-2: For those frameworks it would be wrong to take the first paint! So the logic looks fine.

Maquette is different. The first paint is about 12 msecs after the click and the second about 18 msecs: Screenshot from 2022-04-06 21-40-42

And reflex-dom is different too. It has three paints. One screenshot is zoomed in to show the first paint. Screenshot from 2022-04-06 21-43-43 I also consider the logic to be correct for this case. The whole operation ends with the third paint event.

=> It seems fine to compute the duration for select row as the time span from the click event to the last paint event.

krausest commented 2 years ago

The second benchmark that differs is create rows. But create 1k rows and create 10k rows only differ for doohtml. And we've seen above that it was important for doohtml to use the first paint after the layout event. No other frameworks were affected for those benchmarks.

hman61 commented 2 years ago

Unfortunately I have not had any time to look into what potentially causes some of the variations you are seeing in DooHTML, although I do have a few high level observations that I'd like to share.

1. I think measuring the last paint still might not be the best way to capture the true user experience when simulating a user using the benchmark app. Reporting on true user experience measurements would be much more valuable than relying on any tool's measurements IMO. For example, if you put in some internal timers that collect the start and stop times of each event in JavaScript memory, the report can simply be saved to local storage after the test runs are completed. This approach could be accomplished transparently by leveraging a tool like Cypress.io, where before each test event the keyed timers are injected into each button. This is something I currently do for my preliminary performance tests, and in those tests I have also looked at what happens when you add a blur() event to each button that was pressed. When you do that, you will notice that there is a significant increase in time, because that successful blur() event is essentially a signal that the application is "settled" and a real user would be able to start interacting with the app again. I can send you some samples of these differences if you are interested, or you should be able to just do a rudimentary test on the run 10000 event with some simple dateTime() timers so that you can see the time difference with and without a blur().
2. This is minor, but in order to get less deviation in the rendering of the rows, the rows should always have the same number of characters IMO. "big red car" theoretically should take less time to paint than "handsome yellow sandwich". My suggestion is to make all components of the data set the same length, e.g. "bigxxxxx redxxx carxxxxx". The difference might still be so minute that such a change might be unnecessary.
3. At some point I think there should be a benchmark that renders 3 instances of the template with 333 and 3333 rows instead of the current 1000 and 10000. I think we would get a much more valuable metric with this kind of scenario, because my hunch is that frameworks using Mutation Observers would be "multithreading" over 3 observers at the same time (NOTE: the sum of rendered rows is still the same). So in that scenario, I expect that the results would be significantly slower compared to the single instance that we are currently testing. This would also be a much more realistic scenario for a SPA that renders multiple components. Ideally there should also be 3 different templates: Table Based, Simple Div/Span Based and a Flex grid, but I would of course start with 3 instances of the existing template.
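The equal-length idea from the second point could look like this. A minimal sketch: `padToLongest` is a hypothetical helper, and only two words per list are used to keep the example short.

```javascript
// Pads every word in a list to the length of the longest word in that list.
function padToLongest(words, filler = 'x') {
  const width = Math.max(...words.map((w) => w.length));
  return words.map((w) => w.padEnd(width, filler));
}

const adjectives = padToLongest(['big', 'handsome']);
const colours = padToLongest(['red', 'yellow']);
const nouns = padToLongest(['car', 'sandwich']);

// Every label built from the padded lists has the same number of characters.
const shortest = `${adjectives[0]} ${colours[0]} ${nouns[0]}`;
const longest = `${adjectives[1]} ${colours[1]} ${nouns[1]}`;

console.log(shortest, '|', longest);
```

Padding each list to its own longest entry keeps the total label length constant regardless of which words are picked.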

If my measuring idea is feasible, then a lot of measuring complexity and unknowns could potentially be eliminated for the core time measurements and you can still of course continue to leverage any of the memory reports from the chosen measuring tools.
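The in-memory timer idea could be sketched like this. Everything here is an assumption for illustration: the class name, the storage key and the injected `now` and `storage` parameters are invented, and the Cypress injection itself is not shown. `performance.now()` is used as the default clock because it has a finer resolution than `Date`.

```javascript
// Collects start/stop timestamps per named operation in memory.
class ClickTimer {
  constructor(now = () => performance.now()) {
    this.now = now;
    this.pending = new Map(); // name -> start timestamp
    this.samples = new Map(); // name -> array of durations in ms
  }

  start(name) {
    this.pending.set(name, this.now());
  }

  // Returns the duration in ms, or undefined if start(name) was never called.
  stop(name) {
    if (!this.pending.has(name)) return undefined;
    const duration = this.now() - this.pending.get(name);
    this.pending.delete(name);
    if (!this.samples.has(name)) this.samples.set(name, []);
    this.samples.get(name).push(duration);
    return duration;
  }

  // `storage` is anything with setItem(key, value), e.g. window.localStorage.
  save(storage, key = 'benchmark-timings') {
    storage.setItem(key, JSON.stringify(Object.fromEntries(this.samples)));
  }
}
```

In a click handler this would be used as `timer.start('run')` before the DOM update and `timer.stop('run')` after the `blur()` call, followed by `timer.save(localStorage)` once all runs are done. Whether a completed `blur()` really marks the page as settled is the claim from the comment above and is not verified by this sketch.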

Just something to think about for the future.

Henrik Javen


krausest commented 2 years ago

The third is remove row benchmark.

The following frameworks cause more than one paint: elm, endorphin, etch, hyperapp, imba, maquette, miso, reflex-dom, skruv, ui5-webcomponents, vidom, mithril, reagent.

Most of them look very similar, like endorphin Screenshot from 2022-04-06 22-10-52 imba

Mithril and reagent can look a bit different, like that: Screenshot from 2022-04-06 22-18-07 (there's sometimes a gap between the first and the second paint)

=> For remove rows we should continue with the old logic, i.e. duration to the last paint.

krausest commented 2 years ago

@hman61 I'm not completely sure that I fully got the idea (I'd have to add a client onblur handler and measure to this event, if I got the idea right). If you have some example code I'd like to take a look at it. Would be interesting to see how it compares with the chrome profile. After taking a look at those many profiles I currently have the feeling that rendering duration varies between efficient and not so efficient frameworks and thus should indeed be counted.

hman61 commented 2 years ago

Enclosed is a version of vanillajs with some timers. In this version I'm logging in the tab title so that you can easily see the render time for each click.

So when you run the 10,000 it appears that once the code hits the blur() event, all rows have been drawn, although I can't verify if the last row has been drawn or not. That can easily be verified by adding a querySelector looking for the last row to see if it exists before the blur() event. (I will add that tomorrow and see if there is a variation in the times.)

At the moment, it is not easy for me to compare because what is on your report has the 4x slow down and I don't know how to do that on my local machine, but if you try this version of Main.js you get the idea and can probably do a measurement comparison without the slow down.

Let me know if you got the attachment

-Henrik


'use strict';

function _random(max) { return Math.round(Math.random()*1000)%max; }

class Timer {
static time = new Object()

static start(name) {

    if (!Timer.time[name]) {
        Timer.time[name] = [ new Date().getTime()]    
    }    
};
static stop(name) {
    if (Timer.time[name]) {
        Timer.time[name].push(new Date().getTime())
        if (name==='clear') {
            document.title = 'c:' + (Timer.time[name][1] - Timer.time[name][0])  + ' |' + document.title
            Timer.time[name] = undefined
        }

        if (name==='build') {
            document.title = 'b:' + (Timer.time[name][1] - Timer.time[name][0])  + ' ' + document.title
            Timer.time[name] = undefined
        }
        if (name==='tot') {
            document.title = '**|t:' + (Timer.time[name][1] - Timer.time[name][0])  + ' ' + document.title
            Timer.time[name] = undefined
        }
        if (name==='add') {
            document.title = 'a:' + (Timer.time[name][1] - Timer.time[name][0])  + ' ' + document.title
            Timer.time[name] = undefined
        } 
    }
};

}

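// Row markup reconstructed (assumption): the pasted file shows an empty string here,
// but createRow() and the '.lbl' / '.remove' click handlers below need these cells.
const rowTemplate = document.createElement("tr");
rowTemplate.innerHTML = "<td class='col-md-1'></td><td class='col-md-4'><a class='lbl'></a></td><td class='col-md-1'><a class='remove'><span class='remove glyphicon glyphicon-remove' aria-hidden='true'></span></a></td><td class='col-md-6'></td>";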

class Store {
constructor() {
    this.data = [];
    this.backup = null;
    this.selected = null;
    this.id = 1;
}
buildData(count = 1000) {
    Timer.start('build')

    var adjectives = ["pretty", "large", "big", "small", "tall", "short", "long", "handsome", "plain", "quaint", "clean", "elegant", "easy", "angry", "crazy", "helpful", "mushy", "odd", "unsightly", "adorable", "important", "inexpensive", "cheap", "expensive", "fancy"];
    var colours = ["red", "yellow", "blue", "green", "pink", "brown", "purple", "brown", "white", "black", "orange"];
    var nouns = ["table", "chair", "house", "bbq", "desk", "car", "pony", "cookie", "sandwich", "burger", "pizza", "mouse", "keyboard"];
    var data = [];
    for (var i = 0; i < count; i++)
        data.push({id: this.id++, label: adjectives[_random(adjectives.length)] + " " + colours[_random(colours.length)] + " " + nouns[_random(nouns.length)] });
    Timer.stop('build')

    return data;
}
updateData(mod = 10) {
    for (let i=0;i<this.data.length;i+=10) {
        this.data[i].label += ' !!!';
        // this.data[i] = Object.assign({}, this.data[i], {label: this.data[i].label +' !!!'});
    }
}
delete(id) {
    const idx = this.data.findIndex(d => d.id==id);
    this.data = this.data.filter((e,i) => i!=idx);
    return this;
}
run() {
    this.data = this.buildData();
    this.selected = null;
}
add() {
    this.data = this.data.concat(this.buildData(1000));
    this.selected = null;
}
update() {
    this.updateData();
    this.selected = null;
}
select(id) {
    this.selected = id;
}
hideAll() {
    this.backup = this.data;
    this.data = [];
    this.selected = null;
}
showAll() {
    this.data = this.backup;
    this.backup = null;
    this.selected = null;
}
runLots() {
    this.data = this.buildData(10000);
    this.selected = null;
}
clear() {
    this.data = [];
    this.selected = null;
}
swapRows() {
    if(this.data.length > 998) {
        var a = this.data[1];
        this.data[1] = this.data[998];
        this.data[998] = a;
    }
}

}

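// NOTE: getParentId, the Main class header and the start of its constructor
// were missing from the pasted file. They are reconstructed here (assumption)
// from how the methods below use this.store, this.rows, this.data and data_id.
function getParentId(elem) {
    while (elem) {
        if (elem.tagName === "TR") return elem.data_id;
        elem = elem.parentNode;
    }
    return undefined;
}

class Main {
constructor() {
    this.store = new Store();
    this.rows = [];
    this.data = [];
    this.selectedRow = undefined;

    document.getElementById("main").addEventListener('click', e => {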
        //console.log("listener",e);
        if (e.target.matches('#add')) {
            e.preventDefault();
            //console.log("add");
            this.add();
        }
        else if (e.target.matches('#run')) {
            e.preventDefault();
            //console.log("run");
            this.run(e);
        }
        else if (e.target.matches('#update')) {
            e.preventDefault();
            //console.log("update");
            this.update();
        }
        else if (e.target.matches('#hideall')) {
            e.preventDefault();
            //console.log("hideAll");
            this.hideAll();
        }
        else if (e.target.matches('#showall')) {
            e.preventDefault();
            //console.log("showAll");
            this.showAll();
        }
        else if (e.target.matches('#runlots')) {
            e.preventDefault();
            //console.log("runLots");
            this.runLots(e);
        }
        else if (e.target.matches('#clear')) {
            e.preventDefault();
            //console.log("clear");
            this.clear(e);
        }
        else if (e.target.matches('#swaprows')) {
            e.preventDefault();
            //console.log("swapRows");
            this.swapRows();
        }
        else if (e.target.matches('.remove')) {
            e.preventDefault();
            let id = getParentId(e.target);
            let idx = this.findIdx(id);
            //console.log("delete",idx);
            this.delete(idx);
        }
        else if (e.target.matches('.lbl')) {
            e.preventDefault();
            let id = getParentId(e.target);
            let idx = this.findIdx(id);
            //console.log("select",idx);
            this.select(idx);
        }
    });
    this.tbody = document.getElementById("tbody");
}
findIdx(id) {
    for (let i=0;i<this.data.length;i++){
        if (this.data[i].id === id) return i;
    }
    return undefined;
}
run(e) {
    Timer.start('tot')
    this.removeAllRows();
    this.store.clear();
    this.rows = [];
    this.data = [];
    this.store.run();
    this.appendRows();
    this.unselect();
    e.target.blur()
    Timer.stop('tot')
}
add() {
    this.store.add();
    this.appendRows();
}
update() {
    this.store.update();
    for (let i=0;i<this.data.length;i+=10) {
        this.rows[i].childNodes[1].childNodes[0].innerText = this.store.data[i].label;
    }
}
unselect() {
    if (this.selectedRow !== undefined) {
        this.selectedRow.className = "";
        this.selectedRow = undefined;
    }
}
select(idx) {
    this.unselect();
    this.store.select(this.data[idx].id);
    this.selectedRow = this.rows[idx];
    this.selectedRow.className = "danger";
}
recreateSelection() {
    let old_selection = this.store.selected;
    let sel_idx = this.store.data.findIndex(d => d.id === old_selection);
    if (sel_idx >= 0) {
        this.store.select(this.data[sel_idx].id);
        this.selectedRow = this.rows[sel_idx];
        this.selectedRow.className = "danger";
    }
}
delete(idx) {
    // Remove that row from the DOM
    this.store.delete(this.data[idx].id);
    this.rows[idx].remove();
    this.rows.splice(idx, 1);
    this.data.splice(idx, 1);
    this.unselect();
    this.recreateSelection();
}
removeAllRows() {
    Timer.start('clear')
    // ~258 msecs
    // for(let i=this.rows.length-1;i>=0;i--) {
    //     tbody.removeChild(this.rows[i]);
    // }
    // ~251 msecs
    // for(let i=0;i<this.rows.length;i++) {
    //     tbody.removeChild(this.rows[i]);
    // }
    // ~216 msecs
    // var cNode = tbody.cloneNode(false);
    // tbody.parentNode.replaceChild(cNode ,tbody);
    // ~212 msecs
    this.tbody.textContent = "";
    Timer.stop('clear')

    // ~236 msecs
    // var rangeObj = new Range();
    // rangeObj.selectNodeContents(tbody);
    // rangeObj.deleteContents();
    // ~260 msecs
    // var last;
    // while (last = tbody.lastChild) tbody.removeChild(last);
}
runLots(e) {
    Timer.start('tot')

    this.removeAllRows();
    this.store.clear();
    this.rows = [];
    this.data = [];
    this.store.runLots();
    this.appendRows();
    this.unselect();
    e.target.blur()
    Timer.stop('tot')

}
clear() {
    this.store.clear();
    this.rows = [];
    this.data = [];
    // This is actually a bit faster, but close to cheating
    // requestAnimationFrame(() => {
        this.removeAllRows();
        this.unselect();
    // });
}
swapRows() {
    if (this.data.length>10) {
        Timer.start('tot')
        this.store.swapRows();
        this.data[1] = this.store.data[1];
        this.data[998] = this.store.data[998];

        this.tbody.insertBefore(this.rows[998], this.rows[2])
        this.tbody.insertBefore(this.rows[1], this.rows[999])

        let tmp = this.rows[998];
        this.rows[998] = this.rows[1];
        this.rows[1] = tmp;
        Timer.stop('tot')

    }

    // let old_selection = this.store.selected;
    // this.store.swapRows();
    // this.updateRows();
    // this.unselect();
    // if (old_selection>=0) {
    //     let idx = this.store.data.findIndex(d => d.id === old_selection);
    //     if (idx > 0) {
    //         this.store.select(this.data[idx].id);
    //         this.selectedRow = this.rows[idx];
    //         this.selectedRow.className = "danger";
    //     }
    // }
}
appendRows() {
    // Using a document fragment is slower...
    // var docfrag = document.createDocumentFragment();
    // for(let i=this.rows.length;i<this.store.data.length; i++) {
    //     let tr = this.createRow(this.store.data[i]);
    //     this.rows[i] = tr;
    //     this.data[i] = this.store.data[i];
    //     docfrag.appendChild(tr);
    // }
    // this.tbody.appendChild(docfrag);

    // ... than adding directly
    Timer.start('add')
    var rows = this.rows, s_data = this.store.data, data = this.data, tbody = this.tbody;
    for(let i=rows.length;i<s_data.length; i++) {
        let tr = this.createRow(s_data[i]);
        rows[i] = tr;
        data[i] = s_data[i];
        tbody.appendChild(tr);
    }
    Timer.stop('add')

}
createRow(data) {
    const tr = rowTemplate.cloneNode(true),
        td1 = tr.firstChild,
        a2 = td1.nextSibling.firstChild;
    tr.data_id = data.id;
    td1.textContent = data.id;
    a2.textContent = data.label;
    return tr;
}

}

new Main();

krausest commented 2 years ago

Here's my conclusion.

The old measuring logic discriminated against doohtml for create 1k and 10k rows. For some reason there's a second, late and probably unrelated paint event that caused overly slow results for doohtml. No other frameworks were affected. The measurement logic for create rows has been adjusted.
Switching to puppeteer for the whole benchmark looks promising. The biggest advantage is that, thanks to the trace files it writes, it's very simple to justify our measurements and discuss potential bugs in this benchmark.

I'll run the benchmark with puppeteer (minimal trace categories) and we'll take a look at the results.

@hman61 Thanks - I'll take a look at the client side measurement hopefully soon.

hman61 commented 2 years ago

Submitted a PR, trying to fix the Puppeteer discrepancy.

https://github.com/krausest/js-framework-benchmark/pull/1026


krausest commented 2 years ago

Another night, another run. Left new puppeteer, right chromedriver: [Screenshot from 2022-04-10 09-46-09]

All results are now from puppeteer traces. create 1/10k rows for doohtml looks fine and clear rows makes more sense.

krausest commented 2 years ago

One (hopefully) last finding. Some frameworks like ui5-webcomponents and vidom show bad results with puppeteer, e.g. for swap rows. ui5-webcomponents yields between 55 msecs and 120 msecs for swap rows. It turns out that there's often a very long delay waiting for the RAF to fire:

4x slowdown: [Screenshot from 2022-04-10 20-43-34]
2x slowdown: [screenshot]
1x slowdown: [screenshot]

It seems like the delay depends on the CPU slowdown.

Webdriver does not show this behaviour. Results for ui5-webcomponents are all in the 55 msecs range for webdriver.

Just to give an impression: Results for puppeteer: 117.377, 119.684, 116.923, 123.66, 54.985, 125.074, 116.245, 115.509, 117.372, 119.572, 119.117, 54.576

Results for chromedriver: 51.164, 51.94, 52.187, 54.211, 58.01, 52.564, 54.324, 55.684, 53.655, 53.431, 58.224, 57.056
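To make the two lists above easier to compare, here is a minimal sketch that summarises them. The numbers are copied from this comment; the 60 msecs cut-off for a "fast" run is a hypothetical threshold chosen only for illustration and is not part of the benchmark.

```javascript
// Swap-rows durations (msecs) for ui5-webcomponents, copied from the comment above.
const puppeteer = [117.377, 119.684, 116.923, 123.66, 54.985, 125.074,
                   116.245, 115.509, 117.372, 119.572, 119.117, 54.576];
const chromedriver = [51.164, 51.94, 52.187, 54.211, 58.01, 52.564,
                      54.324, 55.684, 53.655, 53.431, 58.224, 57.056];

// Median of a list of numbers (does not modify the input).
function median(xs) {
  const sorted = [...xs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// ASSUMPTION: 60 msecs is an illustrative threshold, not a benchmark setting.
const FAST_LIMIT_MS = 60;
const countFastRuns = (xs) => xs.filter((x) => x < FAST_LIMIT_MS).length;

console.log('puppeteer median:', median(puppeteer), 'fast runs:', countFastRuns(puppeteer));
console.log('chromedriver median:', median(chromedriver), 'fast runs:', countFastRuns(chromedriver));
```

With this threshold only two of the twelve puppeteer runs count as fast, while all twelve chromedriver runs do, which matches the impression that puppeteer usually hits the long RAF delay and only occasionally avoids it.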

krausest commented 2 years ago

Some other frameworks are also a bit slower with puppeteer, e.g. for remove rows. But this does not appear to be caused by overly long RAF delays. ui5-webcomponents looks correct with ~33 msecs: [Screenshot from 2022-04-10 21-28-08]

The RAF delay is 6 msecs and thus ok.

krausest commented 2 years ago

I tried to find a workaround for the delayed animation frames with CPU throttling.

Among the many frameworks that use RAF I identified the following as sometimes having a long delay: angular-nozone, bobril, choo, dojo, elm, endorphin, hyperapp, imba, maquette, mithril, reagent, skruv, ui5-webcomponents, vidom

I identify them with some frustratingly complex logic: one request animation frame within the click event, one fire animation frame at least 16 msecs later than the click and no layout event between the click and the fire animation frame. Here's such an example with a long delay: [Screenshot from 2022-04-13 23-11-35]

In this case the idle delay should not be counted.
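The detection rule described above could be sketched roughly as follows. This is not the benchmark's actual code: the event shape (`name`, `ts`, `dur` in msecs) is a simplified, hypothetical stand-in for real Chrome trace events, and measuring "later than the click" from the end of the click event is an assumption.

```javascript
// ASSUMPTION: simplified events of the form { name, ts, dur }, times in msecs.
// Real Chrome traces use microsecond timestamps and nested args; that
// parsing step is left out of this sketch.
const RAF_DELAY_LIMIT_MS = 16;

function hasLongRafDelay(events) {
  const click = events.find((e) => e.name === 'click');
  if (!click) return false;
  const clickEnd = click.ts + (click.dur || 0);

  // Rule 1: exactly one requestAnimationFrame call inside the click event.
  const requests = events.filter(
    (e) => e.name === 'RequestAnimationFrame' && e.ts >= click.ts && e.ts <= clickEnd
  );
  if (requests.length !== 1) return false;

  // Rule 2: the animation frame fires at least 16 msecs after the click.
  // ASSUMPTION: the delay is measured from the end of the click event.
  const fire = events
    .filter((e) => e.name === 'FireAnimationFrame' && e.ts >= clickEnd)
    .sort((a, b) => a.ts - b.ts)[0];
  if (!fire || fire.ts - clickEnd < RAF_DELAY_LIMIT_MS) return false;

  // Rule 3: no layout event between the click and the fired frame.
  const layoutBetween = events.some(
    (e) => e.name === 'Layout' && e.ts > clickEnd && e.ts < fire.ts
  );
  return !layoutBetween;
}

// Hypothetical trace: the frame fires 58 msecs after the click ended.
const idleTrace = [
  { name: 'click', ts: 0, dur: 2 },
  { name: 'RequestAnimationFrame', ts: 1 },
  { name: 'FireAnimationFrame', ts: 60 },
  { name: 'Layout', ts: 61 },
];
console.log(hasLongRafDelay(idleTrace));
```

The function returns `false` as soon as one of the three rules fails, so a framework that triggers a layout before the frame fires is not treated as idle.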

But there are so many variations, like crui. In this case the duration should be measured from click to paint. [Screenshot from 2022-04-13 23-10-14]

I've little idea why aurelia is doing things like that: [Screenshot from 2022-04-11 20-09-53]

krausest commented 2 years ago

So here's the suggested final result for chrome 100: https://krausest.github.io/js-framework-benchmark/2022/table_chrome_100_puppeteer_raf.html

Any comments?

fabiospampinato commented 2 years ago

It looks alright to me, a couple of things that I can spot:

vanillajs is ~19% slower in "remove row" than vanillajs-1, and one has to scroll relatively far down the table to find similar results, that seems very unlikely to be representative? And it's outside the confidence interval.
xania has the fastest "create many rows", but at the same time is 16% slower than vanilla at "create rows", that seems unlikely also, in the table for Chrome 99 it is only 3% off.

Other than that I can't see any super strange thing like we saw with the initial run for chrome 100.

If possible it'd be useful to output more statistically significant results, they seem to vary ~significantly with different runs 🤔

hman61 commented 2 years ago

I'm also not 100% confident that the measurement in the headless instance accurately represents a "live" browser instance. So for these benchmarks to be valuable we need to have full confidence that the time until the browser is ready to be interacted with after each test case is no different in a headless vs. a non-headless instance.

I have also noticed an unusually large deviation between what appear to be identical or almost identical implementations in various frameworks.


krausest commented 2 years ago

Thanks. I made one more change. Until now I subtracted the whole delay until the animation frame fires. This actually gives an advantage to longer delays. I changed that to subtract the delay minus 16 msecs.
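As a small illustration of the two correction rules mentioned above, here is a sketch with made-up numbers. The function names are hypothetical, and clamping the correction at zero for delays below 16 msecs is an assumption of this sketch, not something stated in the thread.

```javascript
// 16 msecs of delay are kept in the measurement; only the excess is removed.
const FRAME_BUDGET_MS = 16;

// Old rule: remove the whole wait until the animation frame fires.
function subtractWholeDelay(totalMs, rafDelayMs) {
  return totalMs - rafDelayMs;
}

// New rule: remove only the part of the wait that exceeds 16 msecs.
// ASSUMPTION: delays shorter than 16 msecs lead to no correction at all.
function subtractExcessDelay(totalMs, rafDelayMs) {
  const excess = Math.max(0, rafDelayMs - FRAME_BUDGET_MS);
  return totalMs - excess;
}

// Hypothetical run: 120 msecs in total, of which 66 msecs were spent waiting.
console.log(subtractWholeDelay(120, 66));  // 120 - 66 = 54
console.log(subtractExcessDelay(120, 66)); // 120 - (66 - 16) = 70
```

Under the old rule the hypothetical run is reported as if it had not waited for a frame at all, which is the advantage for longer delays that the change removes.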

@fabiospampinato The best thing about switching to puppeteer is that I can answer such questions much better than for webdriver. I can't explain why a framework is slower, but I can show some chrome traces and "prove" that the reported value actually happened on my machine, which I'll do now :)

vanillajs reports for remove 1k the following values [22.979,24.419,24.572,24.673,24.721,24.744,24.777,25.048,25.45,25.936]. Here's the trace for the 24.777 run: [Screenshot from 2022-04-14 21-23-40]

...and for vanillajs-1 I measured [20.561,20.635,20.73,20.802,20.919,20.992,21.655,21.715,21.718,22.475]. And here's the trace for the 20.992 run: [Screenshot from 2022-04-14 21-25-05]
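For anyone who wants to check the gap between the two lists above, a minimal sketch follows. The values are copied from this comment; the results table may aggregate runs differently, so the percentage printed here need not match the table exactly.

```javascript
// Remove-row durations (msecs), copied from the comment above.
const vanillajs = [22.979, 24.419, 24.572, 24.673, 24.721,
                   24.744, 24.777, 25.048, 25.45, 25.936];
const vanillajs1 = [20.561, 20.635, 20.73, 20.802, 20.919,
                    20.992, 21.655, 21.715, 21.718, 22.475];

// Arithmetic mean of a list of numbers.
function mean(xs) {
  return xs.reduce((sum, x) => sum + x, 0) / xs.length;
}

// Relative slowdown of vanillajs compared to vanillajs-1, in percent.
const slowdownPercent = (mean(vanillajs) / mean(vanillajs1) - 1) * 100;

// Do the two samples overlap at all?
const samplesOverlap = Math.min(...vanillajs) <= Math.max(...vanillajs1);

console.log('mean vanillajs:', mean(vanillajs).toFixed(3));
console.log('mean vanillajs-1:', mean(vanillajs1).toFixed(3));
console.log('slowdown in percent:', slowdownPercent.toFixed(1));
console.log('samples overlap:', samplesOverlap);
```

In this run even the fastest vanillajs sample is slower than the slowest vanillajs-1 sample, so the difference is not an artefact of a single outlier.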

xania is statistically significantly slower than vanillajs for create 1k, but not for create 10k.

duration for vanillajs and create 1k rows: [78.277,78.781,81.697,83.022,83.674,83.885,88.241,91.318,92.782,95.063] The trace for 78.277: [Screenshot from 2022-04-14 21-28-10] And the trace for 92.782: [screenshot]

duration for xania and create 1k rows: [82.587,82.613,86.362,95.322,95.995,97.547,99.23,106.812,109.966,112.8] Here's one of the slower runs with 109.966: [Screenshot from 2022-04-14 21-32-33]
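One way to put a number on "statistically significant" for the two create 1k samples above is Welch's t-statistic. This is only a sketch: the thread does not say which test the benchmark's result table uses, so Welch's test is swapped in here purely as an illustration.

```javascript
// Create-1k durations (msecs), copied from the comment above.
const vanillajsCreate = [78.277, 78.781, 81.697, 83.022, 83.674,
                         83.885, 88.241, 91.318, 92.782, 95.063];
const xaniaCreate = [82.587, 82.613, 86.362, 95.322, 95.995,
                     97.547, 99.23, 106.812, 109.966, 112.8];

function mean(xs) {
  return xs.reduce((sum, x) => sum + x, 0) / xs.length;
}

// Unbiased sample variance (divides by n - 1).
function sampleVariance(xs) {
  const m = mean(xs);
  return xs.reduce((acc, x) => acc + (x - m) ** 2, 0) / (xs.length - 1);
}

// Welch's t-statistic for two samples with possibly unequal variances.
// ASSUMPTION: this is not necessarily the test used by the benchmark itself.
function welchT(a, b) {
  const standardError = Math.sqrt(
    sampleVariance(a) / a.length + sampleVariance(b) / b.length
  );
  return (mean(a) - mean(b)) / standardError;
}

console.log('mean vanillajs:', mean(vanillajsCreate).toFixed(3));
console.log('mean xania:', mean(xaniaCreate).toFixed(3));
console.log('Welch t (xania vs vanillajs):', welchT(xaniaCreate, vanillajsCreate).toFixed(2));
```

A larger absolute t value means the difference in means is large relative to the run-to-run noise; turning it into a p-value additionally needs the Welch-Satterthwaite degrees of freedom, which are left out here.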

@fabiospampinato I'll re-run the whole benchmark tonight and then we compare those values to https://krausest.github.io/js-framework-benchmark/2022/table_chrome_100_puppeteer_raf.html. Please do not compare chrome 100 results with any other chrome 100 results above since I played with the measurement rules. Still variance for puppeteer is higher than for webdriver (but that doesn't help when webdriver does strange things for remove rows...).

@hman61 I agree, this is why I'm running those benchmarks in a live browser and not headless.

krausest commented 2 years ago

2nd run is available: https://krausest.github.io/js-framework-benchmark/2022/table_chrome_100_puppeteer_raf2.html Overall I'm not happy with the results. The statistically significant differences between vanillajs and xania are gone now. Mikado and fullweb-helpers are much faster for create 1k rows.

The standard error for most operations is worse than the older webdriver results.

I'm a bit afraid that puppeteer results will stay less stable than the old results. Just to make sure I didn't mix anything else up (there were many changes) I'll make another run with no other changes and then we'll see.

fabiospampinato commented 2 years ago

Yeah the results seem to vary quite a bit, now "remove row" in the vanillajs implementations is about the same, while in the previous run it seemed as if vanillajs was significantly slower than vanillajs-1 for some reason.

I don't know how long it takes to update the entire table, and increasing that amount of time is probably a pain in the ass, but maybe the table should be updated 3~5 times and then those runs could be averaged out to produce the final table? I don't know how else we could get more statistically significant results.

Or maybe individual tests could be run 3~5x as much and only the fastest 2~4 results for each test could be averaged out.
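The second idea above (keep only the fastest few samples of each test and average those) could look roughly like this. The helper name is hypothetical, and the sample values are simply the vanillajs remove-row numbers quoted earlier in this thread.

```javascript
// Average of the `keep` fastest (smallest) samples.
// Hypothetical helper, not part of the benchmark code.
function averageOfFastest(samples, keep) {
  if (!Number.isInteger(keep) || keep < 1 || keep > samples.length) {
    throw new RangeError('keep must be an integer between 1 and samples.length');
  }
  const fastest = [...samples].sort((a, b) => a - b).slice(0, keep);
  return fastest.reduce((sum, x) => sum + x, 0) / keep;
}

// Remove-row durations (msecs) for vanillajs, as quoted earlier in the thread.
const removeRowSamples = [22.979, 24.419, 24.572, 24.673, 24.721,
                          24.744, 24.777, 25.048, 25.45, 25.936];

console.log('all samples:', averageOfFastest(removeRowSamples, removeRowSamples.length).toFixed(3));
console.log('fastest 4:', averageOfFastest(removeRowSamples, 4).toFixed(3));
```

Keeping only the fastest samples trims slow outliers, but it also biases every result downwards, so all frameworks would have to be reported the same way to stay comparable.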

krausest commented 2 years ago

The tests run about 8 1/2 hours. Here's the 3rd run, and I carefully paid attention that I made no changes to my system (like kernel updates, chrome updates, graphics driver updates, configuration changes and of course same source code): https://krausest.github.io/js-framework-benchmark/2022/table_chrome_100_puppeteer_raf3.html Results are imo pretty close to the second run, so maybe it's not as bad as I thought with puppeteer. Unless someone finds a larger difference I'd take that for the official results.

krausest commented 2 years ago

I published chrome 100 results: https://github.com/krausest/js-framework-benchmark/releases/tag/chrome100

krausest commented 2 years ago

For the sake of completeness some more notes:

I implemented four different benchmark drivers:
webdriver: The one used for the other runs so far. Issue: Clear rows was (?) pretty different for chrome 100.
puppeteer: The latest official results use the puppeteer driver.
playwright: A new driver very similar to puppeteer.
webdriver-CDP: A new webdriver implementation that uses a CDPSession to manually start tracing. This allows it to use the same timeline calculation as puppeteer and playwright; it writes the traces to disk and runs a trace for each benchmark (webdriver has to use a single trace for the whole benchmark loop). Issue: Trace files are much bigger than with playwright or puppeteer for unknown reasons (the same tracing categories are chosen).

You can switch between those drivers for CPU benchmarks in benchmarkConfiguration.ts. Just use the benchmark class from the right import, recompile and run.

Sadly the results differ between the test drivers for unknown reasons. Looking at the chrome flags in chrome://version or the tracing categories I can't find a setting that explains the differences. Still the three new implementations have the great advantage that they write trace files that can be loaded in the chrome profiler if you don't trust the numbers.

[Screenshot: left webdriver, right webdriver with CDP]
[Screenshot: left puppeteer, right playwright]

Biggest differences: Svelte and elm create rows

For the time being I'm using puppeteer by default now.

LifeIsStrange commented 2 years ago

Other test/automation frameworks would be selenium RC (without webdriver) and Cypress. It's great that you tested that many already and good to see that puppeteer and playwright perform better

krausest commented 2 years ago

I moved to playwright now that I have the option, since it causes very few errors (puppeteer sometimes doesn't record a paint event in the trace, maybe a longer wait would help. playwright doesn't show this issue). I noticed that the variance is much lower if I use the default categories for tracing - so I went with them.

krausest commented 2 years ago

I'm closing this issue. Migration to OSX and playwright / puppeteer is done.
