Looks like https://github.com/krausest/js-framework-benchmark/issues/166. Within a RAF call vanillajs clears in ~50 msecs.
Vanillajs can be improved significantly by wrapping create, replace and clear in RAF (left with RAF, right without). Not sure if I should commit this, as it'll certainly spread into benchmark implementations.
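For reference, here's a minimal sketch of what the wrapping looks like for clear, assuming the vanillajs Main class (the commented-out variant in the Main.js attachment further down this thread is the same idea):

clear() {
  this.store.clear();
  this.rows = [];
  this.data = [];
  // Deferring the DOM mutation to the next animation frame lets the click
  // handler return immediately; the measured duration shrinks even though
  // the user-visible work is the same.
  requestAnimationFrame(() => {
    this.tbody.textContent = "";
    this.unselect();
  });
}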
Personally I only see a bug in Chrome here. Simply wrapping clearing in a RAF is probably even a bug: if you clear and append fast enough you won't see the appended rows. I don't see any reason to push frameworks toward this workaround for something that should probably be fixed in Chrome.
Yeah, RAF has always been slightly faster and it has varied at different times. We made a mark for it because it was clearly gaming the benchmark. That being said, there are a couple of frameworks that have lower scores on clear and I'm not sure in all cases it is just RAF. I looked at doohtml and didn't see an obvious RAF call.
Also worth noting is the slight hit on replace rows. Most frameworks have taken a slight hit. sifr interestingly seems to be good on both of these, so I might try to see what the difference is.
Well, doohtml's implementation is particularly clear: https://github.com/krausest/js-framework-benchmark/blob/1e01af18b564bbd2df7386d40cea4d9dbe6db4c3/frameworks/keyed/doohtml/js/Main.class.js#L121 It's actually what vanillajs does: https://github.com/krausest/js-framework-benchmark/blob/1e01af18b564bbd2df7386d40cea4d9dbe6db4c3/frameworks/keyed/vanillajs/src/Main.js#L228 How come doohtml is 24 msecs faster? I'll try to profile this weekend...
I actually have doubts that the webdriver measurement is correct. First I removed the CPU slowdown. The webdriver results still show that doohtml is faster than vanillajs:
result vanillajs-keyed_09_clear1k_x8.json min 8.897 max 11.889 mean 10.4755 median 10.78 stddev 0.9645715514039265
result doohtml-keyed_09_clear1k_x8.json min 6.334 max 7.056 mean 6.6656 median 6.6165 stddev 0.3000385901105982
I prepared a branch for puppeteer. How do results look there? (The measurement isn't strictly comparable since it closes the browser after every benchmark loop.)
result vanillajs-keyed_09_clear1k_x8.json min 7.516 max 10.225 mean 8.3141 median 8.1485 stddev 0.84825709546104
result doohtml-keyed_09_clear1k_x8.json min 6.162 max 11.239 mean 7.912800000000002 median 6.6165 stddev 2.207555299420606
doohtml is still a bit faster, but way closer than with webdriver.
If I profile with chrome I can't see any difference between the two. I chose a 4x CPU slowdown (screenshots: vanillajs, doohtml).
And I think that's how it should be. Both clear via tbody.textContent. I changed vanillajs to clear textContent only and skip all other logic, but it didn't change the results.
This was nastier than I thought. I couldn't find the reason for the numbers reported by chromedriver and chrome's profiler showed a different picture.
Some time ago I prepared an alternative benchmark runner that uses puppeteer, but I didn't finalize the switch. But I'm planning to do it now. The biggest advantage is that it produces traces that can be loaded in the chrome profiler. This makes it much easier to check the results.
The first candidate can be seen here: puppeteer results
And here's an excerpt (left new puppeteer results, right old webdriver results):
One large difference can be seen for doohtml and the create benchmark. Let's take a look at a trace from puppeteer: This trace shows (at least together with the others) that a median of 86 msecs appears to be plausible.
But there was another important change to get there. Some frameworks got worse after switching to puppeteer. The traces looked like this: Here we see that after performing the create operation a timer fires which causes a redraw. The old logic was to compute the duration from the start of the click event to the last paint event, and this would result in a value of 180 msecs for doohtml! I guess a similar issue happened for the old results.
How can we measure the duration better? I propose the following: for all benchmarks except select row we compute the duration from the start of the click event to the end of the first paint after the last layout event (usually there's just one layout event). Cases with multiple layout events happen e.g. for aurelia (replace rows): there's a first layout event after the click event and a second after the actual dom update happened. Or for ember (create rows): there's a first layout event right after the click, before a timer event that appears to cause the real redraw.
For select row no layout event should happen. There are cases with multiple paint events, e.g. for fre: the first paint event happens about 10 msecs after the click but the second one (after 417 msecs) is the correct paint event for our duration. Or glimmer. Or marko - which is interesting, since it happens only sometimes for marko:
There are even frameworks that happen to have a layout event (and three paint events) for select rows like reflex-dom:
So for select rows I'm staying with the old logic: the duration is from the start of the click event to the end of the last paint event. Please note that this leaves open the issue we solved above for the other benchmarks, but I have no other idea how to measure the duration better if there's no layout event.
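To make the two rules concrete, here's a sketch over simplified trace events; the event shape and names are illustrative, not the benchmark driver's actual code:

// events: [{ type: 'click' | 'layout' | 'paint', ts, end }], timestamps in
// msecs; assumes non-select benchmarks have at least one layout event
function computeDuration(events, benchmark) {
  const click = events.find((e) => e.type === "click");
  const paints = events.filter((e) => e.type === "paint" && e.ts > click.ts);
  if (benchmark === "select-row") {
    // old rule: from the click to the end of the last paint
    return paints[paints.length - 1].end - click.ts;
  }
  // new rule: from the click to the end of the first paint after the last layout
  const layouts = events.filter((e) => e.type === "layout" && e.ts > click.ts);
  const lastLayout = layouts[layouts.length - 1];
  const firstPaintAfter = paints.find((e) => e.ts >= lastLayout.end);
  return firstPaintAfter.end - click.ts;
}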
@hman61 Sorry, I'm afraid to say that it currently seems like I measured doohtml incorrectly! Do you have an idea what's causing the second paint in traces like these? Is it the test driver or are there any timers fired in doohtml? I think I've never seen them in the chrome profile, so it may be the test driver.
I ported the logic back to the webdriver to get a better understanding of the differences between webdriver and puppeteer results.
A few questions arise when comparing puppeteer and webdriver.
1. Why is doohtml much faster for create rows with puppeteer?
For svelte the new logic returns the same result for create rows, since there's just a single paint event.
For doohtml there's a difference:
more than one paint event found
paints event {
type: 'paint',
ts: 181465296692,
dur: 629,
end: 181465297321,
evt: '{"method":"Tracing.dataCollected","params":{"args":{"data":{"clip":[-16777215,-16777215,16777216,-16777215,16777216,16777216,-16777215,16777216],"frame":"F318F3FB0BBBC36CD591D24FEF975201","layerId":0,"nodeId":212}},"cat":"devtools.timeline,rail","dur":629,"name":"Paint","ph":"X","pid":1938853,"tdur":622,"tid":1,"ts":181465296692,"tts":880680}}'
} 88.744
paints event {
type: 'paint',
ts: 181465333324,
dur: 2633,
end: 181465335957,
evt: '{"method":"Tracing.dataCollected","params":{"args":{"data":{"clip":[-16777215,-16777215,16777216,-16777215,16777216,16777216,-16777215,16777216],"frame":"F318F3FB0BBBC36CD591D24FEF975201","layerId":0,"nodeId":212}},"cat":"devtools.timeline,rail","dur":2633,"name":"Paint","ph":"X","pid":1938853,"tdur":2622,"tid":1,"ts":181465333324,"tts":907592}}'
} 127.38
WARNING: New and old measurement code return different values. New 88.744 old 127.38
The mean for doohtml with the new logic is 86.45, i.e. comparable to the puppeteer result. Seems like the old idea "measure from click till last paint" didn't work right for this framework, since there are two paint events.
Yeah, it's interesting: DooHTML does very well in webdriver in areas you wouldn't expect. On average Svelte is a tiny bit better on puppeteer relative to others, but most things do stay in line, so to speak, other than those 2 (and React's slow select row).
As for 5, I think other than DooHTML, which is a bit of an enigma, VanillaJS and SolidJS are the only 2 doing el.textContent = "" to clear. Everything stays in line on clear, but I think the difference is more emphasized in puppeteer.
That's somewhat different. The old and the new logic return the same result for svelte for replace rows, since there's just a single paint event and that event takes about 124 msecs.
The puppeteer trace also looks completely unsuspicious:
I have currently no idea where that large difference comes from.
When I perform replace rows manually and profile the 5th create rows I'm getting that profile in chrome: Those numbers look closer to the webdriver results.
So I came back to the last resort: adding a client side hack for measuring the duration ( https://github.com/krausest/js-framework-benchmark/commit/89b546fe1b772e0f9f0b94da81b83513b6759e4b ). The result is [105, 99.30, 94.30, 97.60, 95.40, 115.10, 111.7, 102.10, 94.80, 87.10, 94.60, 97.9] for puppeteer and [115.80, 145.10, 132, 141.10, 145.30, 133.60, 132.9, 131.9, 138.40, 134.20, 119.70, 139.30] for webdriver. This confirms that the two benchmark drivers observe different performance. Running manually in chrome gives me numbers close to webdriver.
[Updated 4/9/22] I'm getting the fast results for a very small list of categories like 'devtools.timeline,disabled-by-default-devtools.timeline'. Those categories are enough to compute the duration from the timeline and show event names. If I enable full tracing with many categories (and it seems like disabled-by-default-devtools.timeline.stack is the one that causes the slowdown) I'm getting much slower results, closer to webdriver-ts (but this is not an explanation, since the webdriver benchmark driver itself also uses just the small category list). What's interesting is what the client hack reveals:
BROWSER: JSHandle:run took 94.79999999701977
BROWSER: JSHandle:run took 96.70000000298023
BROWSER: JSHandle:run took 106.59999999403954
BROWSER: JSHandle:run took 84.90000000596046
BROWSER: JSHandle:run took 93.30000001192093
after initialized 02_replace1k 0 svelte
runBenchmark
BROWSER: JSHandle:run took 111.6000000089407
The replace row operation invokes create rows and then 5 warm-up create rows (which replace the content) before the 6th create row is called. The duration of this 6th operation is counted. All 5 warm-up invocations are faster than the final one, and I suppose this is due to chrome's profiler, which is active only for that last call. With the small list of enabled categories the last run is in the same range as the others:
BROWSER: JSHandle:run took 95
BROWSER: JSHandle:run took 99.5
BROWSER: JSHandle:run took 93
BROWSER: JSHandle:run took 95.30000001192093
BROWSER: JSHandle:run took 99.29999999701977
after initialized 02_replace1k 0 svelte
runBenchmark
BROWSER: JSHandle:run took 97.20000000298023
It seems like the impact of full profiling is much smaller for the other frameworks. Here's vanillajs, where there's hardly a difference for the final run:
BROWSER: JSHandle:run took 85.40000000596046
BROWSER: JSHandle:run took 87.79999999701977
BROWSER: JSHandle:run took 79.59999999403954
BROWSER: JSHandle:run took 96.79999999701977
BROWSER: JSHandle:run took 79.90000000596046
after initialized 02_replace1k 0 vanillajs
runBenchmark
BROWSER: JSHandle:run took 85.5
or react hooks:
BROWSER: JSHandle:RUN took 107.6000000089407
BROWSER: JSHandle:RUN took 133
BROWSER: JSHandle:RUN took 93.20000000298023
BROWSER: JSHandle:RUN took 96.20000000298023
BROWSER: JSHandle:RUN took 99.29999999701977
after initialized 02_replace1k 0 react-hooks
runBenchmark
BROWSER: JSHandle:RUN took 95.09999999403954
=> So if using puppeteer we should use the smallest list of enabled trace categories.
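For illustration, starting a puppeteer trace with the small category list mentioned above could look like this (the URL and file name are placeholders):

const puppeteer = require("puppeteer");

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto("http://localhost:8080/frameworks/keyed/vanillajs/");
  await page.tracing.start({
    path: "trace.json",
    // the minimal list that still yields durations and event names
    categories: ["devtools.timeline", "disabled-by-default-devtools.timeline"],
  });
  await page.click("#run");
  await page.tracing.stop();
  await browser.close();
})();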
[Updated 4/9/22]
When I first looked into it I ran into the usual trap: I used my default chrome with extensions (1password, ublock) enabled. I really should remember to always use incognito mode for all profiling!
Vanillajs then gives me results that look close to the puppeteer results: most replace rows tasks were close to 83 msecs.
=> Seems like puppeteer results are ok.
[Updated 4/9/22] 4. Is puppeteer slower for select row for react?
It seems like this was also caused by enabling many tracing categories. With the minimal trace category list it looks pretty similar.
webdriver react-hooks
mean 54.8273 median 55.2395
mean 53.99490000000001 median 53.8345
puppeteer react-hooks
mean 52.32250000000001 median 52.4035 stddev 2.6654604459434195
mean 53.0296 median 52.593500000000006 stddev 2.7626347810257763
5. Why are vanillajs and solid much faster for clear rows with puppeteer?
As for clear rows there is just 1 paint event for vanillajs and solid, so the new logic can't help. The client side hack also reports numbers in the 70 msecs range.
I applied the new logic (duration = time span between start of click and last paint for select row benchmark, time span between start of click and first paint after a layout event for all benchmarks but select row) to the webdriver client.
Results differ for create 1k, create 10k, remove and select for a few frameworks.
First I took a look at the select rows benchmark. Is the above logic (click until last paint) reasonable?
Most frameworks have only a single paint event and thus are fine.
fre, glimmer, glimmer-2, maquette, marko, miso, react-recoil and reflex-dom have more than one paint event.
Let's take a closer look (CPU slowdown is 1 for the screenshots): most frameworks report the first paint after about 1 msec and the second paint event much later, like glimmer (first zoomed in to show the first paint event) and glimmer-2. For those frameworks it would be wrong to take the first paint! So the logic looks fine.
Maquette is different. The first paint is about 12 msecs after the click and the second about 18 msecs:
And reflex-dom is different too. It has three paints (one screenshot is zoomed in to show the first paint). I consider the logic to be correct for this case as well; the whole operation ends with the third paint event.
=> It seems fine to compute the duration for select row as the time span from the click event to the last paint event.
The second benchmark that differs is create rows. Create 1k rows and create 10k rows only differ for doohtml, and we've seen above that it was important for doohtml to use the first paint after the layout event. No other frameworks were affected for those benchmarks.
Unfortunately I have not had any time to look into what potentially causes some of the variations you are seeing in DooHTML, although I do have a few high-level observations that I'd like to share.
I think measuring the last paint still might not be the best way to capture the true user experience when simulating a user of the benchmark app. Reporting on true user experience measurements would be much more valuable than relying on any tool's measurements IMO. For example, if you put some internal timers that collect the start and stop times of each event in Javascript memory, reporting can then simply be saved to local storage after the test runs are completed. This could be accomplished transparently by leveraging a tool like Cypress.io, where before each test event the keyed timers are injected into each button. This is something I currently do for my preliminary performance tests, and in those tests I have also looked at what happens when you add a blur() event to each button that was pressed. When you do that, you will notice a significant increase in time, because that successful blur() event is essentially a signal that the application is "settled" and a real user would be able to start interacting with the app again. I can send you some samples of these differences if you are interested, or you should be able to just do a rudimentary test on the run 10000 event with some simple dateTime() timers so that you can see the time difference with and without a blur().
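If I understand the idea, a rough sketch of such transparently injected timers could look like this (a sketch only: the button id is from the benchmark app, the storage key and listener placement are my assumptions, and blur() could be called at the end as the "settled" signal described above):

const btn = document.getElementById("runlots");

// capture-phase listener on document fires before the app's own handler...
document.addEventListener("click", (e) => {
  if (e.target === btn) btn.dataset.start = performance.now();
}, true);

// ...and a bubble-phase listener on window fires after it, so the two
// bracket the app's synchronous click work
window.addEventListener("click", (e) => {
  if (e.target !== btn) return;
  const elapsed = performance.now() - Number(btn.dataset.start);
  const times = JSON.parse(localStorage.getItem("clientTimes") || "[]");
  times.push(elapsed);
  localStorage.setItem("clientTimes", JSON.stringify(times));
});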
This is minor, but in order to get less deviation in the rendering of the rows, the rows should always have the same number of characters IMO. "big red car" theoretically should take less time to paint than "handsome yellow sandwich". My suggestion is to make all components of the data set have an equal number of characters, e.g. "bigxxxxx redxxx carxxxxx". The difference might still be so minute that such a change is unnecessary.
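For illustration, padding each generated word to a fixed width (widths are arbitrary):

const pad = (word, width) => word.padEnd(width, "x");
const label = pad("big", 8) + " " + pad("red", 6) + " " + pad("car", 8);
// => "bigxxxxx redxxx carxxxxx", matching the example above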
At some point I think there should be a benchmark that renders 3 instances of the template with 333 and 3333 rows instead of the current 1000 and 10000. I think we would get a much more valuable metric with this kind of scenario, because my hunch is that frameworks using Mutation Observers would be "multithreading" over 3 observers at the same time (note: the sum of rendered rows stays the same). In that scenario, I expect that the results would be significantly slower compared to the single instance that we are currently testing. This would also be a much more realistic scenario for a SPA that renders multiple components. Ideally there should also be 3 different templates: table based, simple div/span based and a flex grid, but I would of course start with 3 instances of the existing template.
If my measuring idea is feasible, then a lot of measuring complexity and unknowns could potentially be eliminated for the core time measurements and you can still of course continue to leverage any of the memory reports from the chosen measuring tools.
Just something to think about for the future.
Henrik Javen
The third is the remove row benchmark.
The following frameworks cause more than one paint: elm, endorphin, etch, hyperapp, imba, maquette, miso, reflex-dom, skruv, ui5-webcomponents, vidom, mithril, reagent.
Most of them look very similar, like endorphin and imba:
Mithril and reagent can look a bit different, like this (there's sometimes a gap between the first and the second paint):
=> For remove rows we should continue with the old logic, i.e. duration to the last paint.
@hman61 I'm not completely sure that I fully got the idea (I'd have to add a client onblur handler and measure to this event, if I got the idea right). If you have some example code I'd like to take a look at it. It would be interesting to see how it compares with the chrome profile. After looking at those many profiles I currently have the feeling that rendering duration varies between efficient and less efficient frameworks and thus should indeed be counted.
Enclosed is a version of vanillajs with some timers. In this version I'm logging in the tab title so that you can easily see the render time for each click.
So when you run the 10,000 it appears that once it hits the blur() event, all rows have been drawn, although I can't verify if the last row has been drawn or not. That can easily be verified by adding a querySelector looking for the last row to see if it exists before the blur() event. (I will add that tomorrow and see if there is a variation in the times.)
At the moment it is not easy for me to compare, because what is in your report has the 4x slowdown and I don't know how to do that on my local machine. But if you try this version of Main.js you'll get the idea and can probably do a measurement comparison without the slowdown.
Let me know if you got the attachment
-Henrik
'use strict';
function _random(max) { return Math.round(Math.random()*1000)%max; }
class Timer {
static time = new Object()
static start(name) {
if (!Timer.time[name]) {
Timer.time[name] = [ new Date().getTime()]
}
};
static stop(name) {
if (Timer.time[name]) {
Timer.time[name].push(new Date().getTime())
if (name==='clear') {
document.title = 'c:' + (Timer.time[name][1] - Timer.time[name][0]) + ' |' + document.title
Timer.time[name] = undefined
}
if (name==='build') {
document.title = 'b:' + (Timer.time[name][1] - Timer.time[name][0]) + ' ' + document.title
Timer.time[name] = undefined
}
if (name==='tot') {
document.title = '**|t:' + (Timer.time[name][1] - Timer.time[name][0]) + ' ' + document.title
Timer.time[name] = undefined
}
if (name==='add') {
document.title = 'a:' + (Timer.time[name][1] - Timer.time[name][0]) + ' ' + document.title
Timer.time[name] = undefined
}
}
};
}
const rowTemplate = document.createElement("tr");
rowTemplate.innerHTML = "<td class='col-md-1'></td><td class='col-md-4'><a class='lbl'></a></td><td class='col-md-1'><a class='remove'><span class='remove glyphicon glyphicon-remove' aria-hidden='true'></span></a></td><td class='col-md-6'></td>";
class Store {
constructor() {
this.data = [];
this.backup = null;
this.selected = null;
this.id = 1;
}
buildData(count = 1000) {
Timer.start('build')
var adjectives = ["pretty", "large", "big", "small", "tall", "short", "long", "handsome", "plain", "quaint", "clean", "elegant", "easy", "angry", "crazy", "helpful", "mushy", "odd", "unsightly", "adorable", "important", "inexpensive", "cheap", "expensive", "fancy"];
var colours = ["red", "yellow", "blue", "green", "pink", "brown", "purple", "brown", "white", "black", "orange"];
var nouns = ["table", "chair", "house", "bbq", "desk", "car", "pony", "cookie", "sandwich", "burger", "pizza", "mouse", "keyboard"];
var data = [];
for (var i = 0; i < count; i++)
data.push({id: this.id++, label: adjectives[_random(adjectives.length)] + " " + colours[_random(colours.length)] + " " + nouns[_random(nouns.length)] });
Timer.stop('build')
return data;
}
updateData(mod = 10) {
for (let i=0;i<this.data.length;i+=10) {
this.data[i].label += ' !!!';
// this.data[i] = Object.assign({}, this.data[i], {label: this.data[i].label +' !!!'});
}
}
delete(id) {
const idx = this.data.findIndex(d => d.id==id);
this.data = this.data.filter((e,i) => i!=idx);
return this;
}
run() {
this.data = this.buildData();
this.selected = null;
}
add() {
this.data = this.data.concat(this.buildData(1000));
this.selected = null;
}
update() {
this.updateData();
this.selected = null;
}
select(id) {
this.selected = id;
}
hideAll() {
this.backup = this.data;
this.data = [];
this.selected = null;
}
showAll() {
this.data = this.backup;
this.backup = null;
this.selected = null;
}
runLots() {
this.data = this.buildData(10000);
this.selected = null;
}
clear() {
this.data = [];
this.selected = null;
}
swapRows() {
if(this.data.length > 998) {
var a = this.data[1];
this.data[1] = this.data[998];
this.data[998] = a;
}
}
}
var getParentId = function(elem) {
while (elem) {
if (elem.tagName==="TR") { return elem.data_id; }
elem = elem.parentNode;
}
return undefined;
}
class Main {
constructor(props) {
this.store = new Store();
this.select = this.select.bind(this);
this.delete = this.delete.bind(this);
this.add = this.add.bind(this);
this.run = this.run.bind(this);
this.update = this.update.bind(this);
this.start = 0;
this.rows = [];
this.data = [];
this.selectedRow = undefined;
document.getElementById("main").addEventListener('click', e => {
//console.log("listener",e);
if (e.target.matches('#add')) {
e.preventDefault();
//console.log("add");
this.add();
}
else if (e.target.matches('#run')) {
e.preventDefault();
//console.log("run");
this.run(e);
}
else if (e.target.matches('#update')) {
e.preventDefault();
//console.log("update");
this.update();
}
else if (e.target.matches('#hideall')) {
e.preventDefault();
//console.log("hideAll");
this.hideAll();
}
else if (e.target.matches('#showall')) {
e.preventDefault();
//console.log("showAll");
this.showAll();
}
else if (e.target.matches('#runlots')) {
e.preventDefault();
//console.log("runLots");
this.runLots(e);
}
else if (e.target.matches('#clear')) {
e.preventDefault();
//console.log("clear");
this.clear(e);
}
else if (e.target.matches('#swaprows')) {
e.preventDefault();
//console.log("swapRows");
this.swapRows();
}
else if (e.target.matches('.remove')) {
e.preventDefault();
let id = getParentId(e.target);
let idx = this.findIdx(id);
//console.log("delete",idx);
this.delete(idx);
}
else if (e.target.matches('.lbl')) {
e.preventDefault();
let id = getParentId(e.target);
let idx = this.findIdx(id);
//console.log("select",idx);
this.select(idx);
}
});
this.tbody = document.getElementById("tbody");
}
findIdx(id) {
for (let i=0;i<this.data.length;i++){
if (this.data[i].id === id) return i;
}
return undefined;
}
run(e) {
Timer.start('tot')
this.removeAllRows();
this.store.clear();
this.rows = [];
this.data = [];
this.store.run();
this.appendRows();
this.unselect();
e.target.blur()
Timer.stop('tot')
}
add() {
this.store.add();
this.appendRows();
}
update() {
this.store.update();
for (let i=0;i<this.data.length;i+=10) {
this.rows[i].childNodes[1].childNodes[0].innerText = this.store.data[i].label;
}
}
unselect() {
if (this.selectedRow !== undefined) {
this.selectedRow.className = "";
this.selectedRow = undefined;
}
}
select(idx) {
this.unselect();
this.store.select(this.data[idx].id);
this.selectedRow = this.rows[idx];
this.selectedRow.className = "danger";
}
recreateSelection() {
let old_selection = this.store.selected;
let sel_idx = this.store.data.findIndex(d => d.id === old_selection);
if (sel_idx >= 0) {
this.store.select(this.data[sel_idx].id);
this.selectedRow = this.rows[sel_idx];
this.selectedRow.className = "danger";
}
}
delete(idx) {
// Remove that row from the DOM
this.store.delete(this.data[idx].id);
this.rows[idx].remove();
this.rows.splice(idx, 1);
this.data.splice(idx, 1);
this.unselect();
this.recreateSelection();
}
removeAllRows() {
Timer.start('clear')
// ~258 msecs
// for(let i=this.rows.length-1;i>=0;i--) {
// tbody.removeChild(this.rows[i]);
// }
// ~251 msecs
// for(let i=0;i<this.rows.length;i++) {
// tbody.removeChild(this.rows[i]);
// }
// ~216 msecs
// var cNode = tbody.cloneNode(false);
// tbody.parentNode.replaceChild(cNode ,tbody);
// ~212 msecs
this.tbody.textContent = "";
Timer.stop('clear')
// ~236 msecs
// var rangeObj = new Range();
// rangeObj.selectNodeContents(tbody);
// rangeObj.deleteContents();
// ~260 msecs
// var last;
// while (last = tbody.lastChild) tbody.removeChild(last);
}
runLots(e) {
Timer.start('tot')
this.removeAllRows();
this.store.clear();
this.rows = [];
this.data = [];
this.store.runLots();
this.appendRows();
this.unselect();
e.target.blur()
Timer.stop('tot')
}
clear() {
this.store.clear();
this.rows = [];
this.data = [];
// This is actually a bit faster, but close to cheating
// requestAnimationFrame(() => {
this.removeAllRows();
this.unselect();
// });
}
swapRows() {
if (this.data.length>10) {
Timer.start('tot')
this.store.swapRows();
this.data[1] = this.store.data[1];
this.data[998] = this.store.data[998];
this.tbody.insertBefore(this.rows[998], this.rows[2])
this.tbody.insertBefore(this.rows[1], this.rows[999])
let tmp = this.rows[998];
this.rows[998] = this.rows[1];
this.rows[1] = tmp;
Timer.stop('tot')
}
// let old_selection = this.store.selected;
// this.store.swapRows();
// this.updateRows();
// this.unselect();
// if (old_selection>=0) {
// let idx = this.store.data.findIndex(d => d.id === old_selection);
// if (idx > 0) {
// this.store.select(this.data[idx].id);
// this.selectedRow = this.rows[idx];
// this.selectedRow.className = "danger";
// }
// }
}
appendRows() {
// Using a document fragment is slower...
// var docfrag = document.createDocumentFragment();
// for(let i=this.rows.length;i<this.store.data.length; i++) {
// let tr = this.createRow(this.store.data[i]);
// this.rows[i] = tr;
// this.data[i] = this.store.data[i];
// docfrag.appendChild(tr);
// }
// this.tbody.appendChild(docfrag);
// ... than adding directly
Timer.start('add')
var rows = this.rows, s_data = this.store.data, data = this.data, tbody = this.tbody;
for(let i=rows.length;i<s_data.length; i++) {
let tr = this.createRow(s_data[i]);
rows[i] = tr;
data[i] = s_data[i];
tbody.appendChild(tr);
}
Timer.stop('add')
}
createRow(data) {
const tr = rowTemplate.cloneNode(true),
td1 = tr.firstChild,
a2 = td1.nextSibling.firstChild;
tr.data_id = data.id;
td1.textContent = data.id;
a2.textContent = data.label;
return tr;
}
}
new Main();
Here's my conclusion.
- The old measuring logic discriminated against doohtml for create 1k and 10k rows. For some reason there's a second, late and probably unrelated paint event that caused too-slow results for doohtml. No other frameworks were affected.
- Switching to puppeteer for the whole benchmark looks promising. The biggest advantage is that, due to the trace files written, it's very simple to justify our measurements and discuss potential bugs of this benchmark.
I'll run the benchmark with puppeteer (minimal trace categories) and we'll take a look at the results. @hman61 Thanks - I'll take a look at the client side measurement hopefully soon.
Submitted a PR, trying to fix the Puppeteer discrepancy.
https://github.com/krausest/js-framework-benchmark/pull/1026
Another night, another run (left new puppeteer, right chromedriver). All results are now from puppeteer traces. create 1/10k rows for doohtml looks fine and clear rows makes more sense.
One (hopefully) last finding. Some frameworks like ui5-webcomponents and vidom show bad results e.g. for swap rows with puppeteer. ui5-webcomponents yields between 55 msecs and 120 msecs for swap rows. It turns out that there's often a very long delay waiting for the RAF to fire (traces at 4x, 2x and 1x slowdown):
It seems like the delay depends on the CPU slowdown.
Webdriver does not show this behaviour. Results for ui5-webcomponents are all in the 55 msecs range for webdriver.
Just to give an impression: Results for puppeteer: 117.377, 119.684, 116.923, 123.66, 54.985, 125.074, 116.245, 115.509, 117.372, 119.572, 119.117, 54.576
Results for chromedriver: 51.164, 51.94, 52.187, 54.211, 58.01, 52.564, 54.324, 55.684, 53.655, 53.431, 58.224, 57.056
Some other frameworks are also a bit slower with puppeteer, e.g. for remove rows, but this appears not to be due to overly long RAF delays. ui5-webcomponents looks correct with ~33 msecs: the RAF delay is 6 msecs and thus ok.
I tried to find a workaround for the delayed animation frames with CPU throttling.
Among the many frameworks that use RAF I identified the following as sometimes having a long delay: angular-nozone, bobril, choo, dojo, elm, endorphin, hyperapp, imba, maquette, mithril, reagent, skruv, ui5-webcomponents, vidom.
I identify them with some frustratingly complex logic: one requestAnimationFrame within the click event, one fireAnimationFrame at least 16 msecs later than the click, and no layout event between the click and the fireAnimationFrame. Here's such an example with a long delay. In this case the idle delay should not be counted.
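Here's a sketch of that heuristic over simplified trace events (event shape and names are illustrative, not the actual driver code; timestamps in msecs):

// events: [{ type, ts, end }] from the trace;
// assumes exactly one click event per measured run
function hasUncountableRafDelay(events) {
  const click = events.find((e) => e.type === "click");
  // a requestAnimationFrame issued within the click handler...
  const requested = events.some((e) =>
    e.type === "requestAnimationFrame" && e.ts >= click.ts && e.ts <= click.end);
  // ...whose fireAnimationFrame comes at least 16 msecs after the click...
  const fire = events.find((e) =>
    e.type === "fireAnimationFrame" && e.ts >= click.ts + 16);
  if (!requested || !fire) return false;
  // ...with no layout event in between (otherwise real work was happening
  // and the delay must be counted)
  const layoutBetween = events.some((e) =>
    e.type === "layout" && e.ts > click.ts && e.ts < fire.ts);
  return !layoutBetween;
}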
But there are so many variations, like crui. In this case the duration should be measured from click to paint.
I've little idea why aurelia is doing things like that:
So here's the suggested final result for chrome 100: https://krausest.github.io/js-framework-benchmark/2022/table_chrome_100_puppeteer_raf.html
Any comments?
It looks alright to me, a couple of things that I can spot:
- vanillajs is ~19% slower in "remove row" than vanillajs-1, and one has to scroll relatively far down the table to find similar results, which seems very unlikely to be representative? And it's outside the confidence interval.
- xania has the fastest "create many rows", but at the same time is 16% slower than vanilla at "create rows", which seems unlikely also; in the table for Chrome 99 it is only 3% off.
Other than that I can't see any super strange thing like we saw with the initial run for chrome 100.
If possible it'd be useful to output more statistically significant results, they seem to vary significantly with different runs 🤔
I'm also not 100% confident that the measurement in the headless instance accurately represents a "live" browser instance. So for these benchmarks to be valuable we need to have full confidence that the time until the browser is ready to be interacted with after each test case is no different headless vs. non-headless.
I also have noticed an unusually large deviation of what appears to be identical or almost identical implementation between various frameworks.
Thanks. I did one more change. Previously I subtracted the whole delay until the animation frame fires, but that actually gives an advantage to longer delays. I changed it to subtract (delay - 16) msecs.
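In other words (variable names are illustrative):

// subtract only the part of the RAF delay that exceeds one ~16 msec frame
const adjustedDuration = rawDuration - Math.max(0, rafDelay - 16);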
@fabiospampinato The best thing about switching to puppeteer is that I can answer such questions much better than for webdriver. I can't explain why a framework is slower, but I can show some chrome traces as "proof" that the reported value actually happened on my machine, which I'll do now :)
vanillajs reports for remove 1k the following values [22.979,24.419,24.572,24.673,24.721,24.744,24.777,25.048,25.45,25.936]. Here's the trace for the 24.777 run:
...and for vanillajs-1 I measured [20.561,20.635,20.73,20.802,20.919,20.992,21.655,21.715,21.718,22.475] And here's the trace for the 20.992 run:
xania is statistically significantly slower than vanillajs for create 1k, but not for create 10k.
duration for vanillajs and create 1k rows: [78.277,78.781,81.697,83.022,83.674,83.885,88.241,91.318,92.782,95.063] The trace for 78.277: And the trace for 92.782:
duration for xania and create 1k rows: [82.587,82.613,86.362,95.322,95.995,97.547,99.23,106.812,109.966,112.8] Here's one of the slower runs with 109.966:
@fabiospampinato I'll re-run the whole benchmark tonight and then we can compare those values to https://krausest.github.io/js-framework-benchmark/2022/table_chrome_100_puppeteer_raf.html. Please do not compare chrome 100 results with any other chrome 100 results above, since I played with the measurement rules. Still, variance for puppeteer is higher than for webdriver (but that doesn't help when webdriver does strange things for remove rows...).
@hman61 I agree, this is why I'm running those benchmarks in a live browser and not headless.
2nd run is available: https://krausest.github.io/js-framework-benchmark/2022/table_chrome_100_puppeteer_raf2.html Overall I'm not happy with the results. The statistically significant differences between vanillajs and xania are gone now. Mikado and fullweb-helpers are much faster for create 1k rows.
The standard error for most operations is worse than the older webdriver results.
I'm a bit afraid that puppeteer results will stay less stable than the old results. Just to make sure I didn't mix up anything else (there were many changes) I'll make another run with no other changes and then we'll see.
Yeah the results seem to vary quite a bit, now "remove row" in the vanillajs implementations is about the same, while in the previous run it seemed as if vanillajs was significantly slower than vanillajs-1 for some reason.
I don't know how long it takes to update the entire table, and increasing that amount of time is probably a pain in the ass, but maybe the table should be updated 3~5 times and then those runs could be averaged out to produce the final table? I don't know how else we could get more statistically significant results.
Or maybe individual tests could be run 3~5x as much and only the fastest 2~4 results for each test could be averaged out.
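For illustration, averaging only the fastest k samples could look like this (names made up):

function meanOfFastest(samples, k) {
  // sort ascending and keep the k fastest runs
  const fastest = [...samples].sort((a, b) => a - b).slice(0, k);
  return fastest.reduce((sum, x) => sum + x, 0) / fastest.length;
}
// e.g. run each test 3~5x as often and average the fastest few:
// meanOfFastest(durations, 4)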
The tests run about 8 1/2 hours. Here's the 3rd run, and I paid careful attention that I made no changes on my system (kernel updates, chrome updates, graphics driver updates, configuration changes) and of course used the same source code: https://krausest.github.io/js-framework-benchmark/2022/table_chrome_100_puppeteer_raf3.html Results are imo pretty close to the second run, so maybe it's not as bad as I thought with puppeteer. Unless someone finds a larger difference I'd take that for the official results.
I published chrome 100 results: https://github.com/krausest/js-framework-benchmark/releases/tag/chrome100
For the sake of completeness some more notes:
You can switch between those drivers for CPU benchmarks in benchmarkConfiguration.ts. Just use the benchmark class from the right import, recompile and run.
For the time being I'm using puppeteer by default now.
Other test/automation frameworks would be selenium RC (without webdriver) and Cypress. It's great that you tested that many already and good to see that puppeteer and playwright perform better.
I moved to playwright now that I have the option, since it causes very few errors (puppeteer sometimes doesn't record a paint event in the trace; maybe a longer wait would help. playwright doesn't show this issue). I noticed that the variance is much lower if I use the default categories for tracing, so I went with them.
I'm closing this issue. Migration to OSX and playwright / puppeteer is done.
Results for Chrome 100 look pretty different. The fastest strategy for clear rows has changed and results for create rows are significantly slower. I think I have to take a closer look before publishing those results as official. If anyone finds the trick for clear rows, please post it here ;-)
https://krausest.github.io/js-framework-benchmark/2022/table_chrome_100_preliminary.html