Can spec stylesheets be faster?

domenic commented 6 years ago

As everyone knows, the single-page HTML spec at https://html.spec.whatwg.org/ is fairly slow to load. Safari does pretty well IIRC, but others not so much.

However, the new review draft published at https://html.spec.whatwg.org/review-drafts/2018-07/ is quite fast. The only real difference between them is in the stylesheets.

I was talking to someone at TC39, and they mentioned that in the past the spec stylesheet for HTML used bad practices that caused O(n^2) behavior, and @bzbarsky and Hixie went back and forth on this for a bit. It was unclear whether Hixie applied all of the recommendations, or only some, or none. It's certainly the case that these stylesheets are quite old, and grew organically, so it's easy to believe they might have some bad practices in them.

So I wanted to open this issue to ask folks familiar with CSS engine performance if there are things we can edit in the stylesheets to make things faster:

spec.css: common baseline, also used by review drafts
standard.css: where most styles live
HTML standard inline style

Thoughts appreciated!

bzbarsky commented 6 years ago

> The only real difference between them is in the stylesheets.

Not quite. The non-draft version has more nodes on it (478k vs 350k). The actual text for the draft is also about 70KB less (is this all stylesheet bits?).

I did poke around quickly and I'm not sure what those extra 130k nodes are about. The draft seems to have things like support tables and bug annotations and whatnot... It might be worth figuring out what the difference there is.

For the main question, I'll try to do some profiling and see whether CSS/layout is still popping up.

The node counts are heavily influenced by

domenic commented 6 years ago

Ah, fair point. A lot of that these days is syntax highlighting, although that was only introduced in the last couple of weeks.

Edit: https://html.spec.whatwg.org/commit-snapshots/21b8363138cb0ec3a0ce9a43850f52fd07ea3fb5/ is a version before syntax highlighting, for reference. Element counts are similar (155 108 review draft vs. 155 162 commit snapshot); didn't code up a node counter to check that.
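For reference, a node counter along these lines could be sketched as below. This is only an illustration and not something that was run in this thread: the name countNodes is hypothetical, and the sketch assumes it is handed a DOM node (for example document) exposing the standard nodeType and childNodes properties.

```javascript
// Hypothetical helper: counts all nodes and all element nodes under `root`.
// Assumes each node has the standard DOM `nodeType` and `childNodes` properties.
function countNodes(root) {
  const ELEMENT_NODE = 1; // same value as Node.ELEMENT_NODE
  let nodes = 0;
  let elements = 0;
  const stack = [root]; // explicit stack avoids deep recursion on a huge document
  while (stack.length > 0) {
    const node = stack.pop();
    nodes += 1;
    if (node.nodeType === ELEMENT_NODE) elements += 1;
    const children = node.childNodes || [];
    for (let i = 0; i < children.length; i++) stack.push(children[i]);
  }
  return { nodes, elements };
}
```

In a browser console, countNodes(document) would report both totals at once. The node total includes the root itself plus text and comment nodes, but not attributes.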

bzbarsky commented 6 years ago

Yes, that commit snapshot pretty much matches the review draft in node count, thank you.

I just tested, and the commit snapshot and the review draft both load in about 3-4 seconds for me in Firefox when coming more or less from cache. Same for Safari. Chrome is at about 13 seconds for the commit snapshot and 7s for the review draft.

Uncached loads of the review draft are about 8s for Chrome, about 5s for Firefox and Safari.

Uncached loads of the commit snapshot are about 13s for Chrome, about 7s for Firefox and Safari.

bzbarsky commented 6 years ago

Oh, and all those timings are pretty noisy, with an ~1-1.5s error bar.

bzbarsky commented 6 years ago

So looking at the sheets linked above,

The [hidden] rule in the inline style is only needed for "down-rev" UAs, right? It's a bit slow but probably not huge in the grand scheme of things.
There are various uses of + and descendant combinators in the inline style, but all seem to be scoped under a class selector, so presumably UAs can use bloom filter optimizations or equivalent to optimize those out.
In standard.css there are a bunch of + that are not scoped (like h1 + h2, p + * etc). Those are liable to cause a bunch of restyling. Specifically when dynamic insertions happen it's possible that UAs will restyle everything after the newly-inserted element. This only matters if the spec does scripted insertions (which it used to, but maybe not anymore?).
The c-[whatever] selectors are a bit annoying. There are 50k+ elements named "c-" on the page, and they will all have to have slow things happen with them for every one of these selectors. I wish this were using classes, not attribute names.

Nothing else is jumping out at me as particularly bad right now.

slightlyoff commented 6 years ago

Attaching a Chrome trace from a local build on a fast-ish laptop. trace_mac_laptop_chrome_tot_html.spec.whatwg.org.json.gz

dbaron commented 6 years ago

So after a quick profile in Gecko one thing that seems avoidable is the use of :first-line styles (the restyling involved accounts for 31% of reflow time, at least for the single largest reflow). (Another 34% of that reflow is text run construction; I can't think of much to do to avoid that. Having less text doesn't seem like an option.)

dbaron commented 6 years ago

Removing the single rule with :first-line from standard.css improves the time reported by putting this in the console:

document.body.style.display="none";document.documentElement.offsetTop;d = new Date();document.body.style.display="";document.documentElement.offsetTop;new Date() - d

by about 45% in Firefox on my machine, and in Chrome by about 10%.

Raw data (milliseconds):

Chrome, before removal: 5604 5319 5297
Chrome, after removal: 4782 4964 4888
Firefox, before removal: 9908 9519 9473
Firefox, after removal: 5110 5317 5395
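As a quick check on the percentages, the raw timings above can be averaged and compared. The sketch below is only illustrative: the helper names mean and improvementPercent are made up for this example, and the numbers are copied from the raw data.

```javascript
// Timings copied from the raw data above (milliseconds, three runs each).
const timings = {
  chrome: { before: [5604, 5319, 5297], after: [4782, 4964, 4888] },
  firefox: { before: [9908, 9519, 9473], after: [5110, 5317, 5395] },
};

// Arithmetic mean of a list of numbers.
const mean = (xs) => xs.reduce((sum, x) => sum + x, 0) / xs.length;

// Percentage reduction in mean time after removing the :first-line rule.
function improvementPercent({ before, after }) {
  const b = mean(before);
  return ((b - mean(after)) / b) * 100;
}

for (const [browser, runs] of Object.entries(timings)) {
  console.log(browser, improvementPercent(runs).toFixed(1) + "%");
}
```

With three runs per configuration the means come out at roughly a 10% reduction for Chrome and 45% for Firefox, matching the figures quoted above.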

annevk commented 6 years ago

Ah, that rule is obsolete anyway due to syntax highlighting.

Malvoz commented 5 years ago

Also preload resources?

domenic commented 5 years ago

What advantage do you anticipate rel=preload having here over rel=stylesheet?

Malvoz commented 5 years ago

I was under the impression that fetches would initiate earlier (even styles that aren't delayed due to a blocking script), especially using Link: rel=preload. But there are other resources than styles this could apply to. Also, resource hints such as preconnect could be considered. I'm sorry, perhaps I should've opened another issue?

domenic commented 5 years ago

I don't think changing rel=stylesheet to rel=preload + rel=stylesheet (or rel=preconnect rel=preload rel=stylesheet) makes things any faster. If we moved to using Link: headers instead of link elements, that could be faster, but in that case we should just use Link:rel=stylesheet, avoiding preload.

Malvoz commented 5 years ago

preconnect would be for resource-fetching from https://resources.whatwg.org in this case. But I take your point, these will certainly not lead to any substantial improvements.

tabatkins commented 3 months ago

> The c-[whatever] selectors are a bit annoying. There are 50k+ elements named "c-" on the page, and they will all have to have slow things happen with them for every one of these selectors. I wish this were using classes, not attribute names.

Anyone know to what extent this is still an issue? The reason I'm shipping the syntax highlighting markup I am today is solely to minimize the over-the-wire syntax weight; <c- k> is smaller than <c- class="k">. But that's a pretty minor concern, and it would be perfectly fine to generate these as classes instead if they're causing styling perf issues.
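To put a rough number on the markup cost, here is a back-of-the-envelope sketch. It assumes the "50k+" element count quoted above and a one-character token such as k; the function name classMarkupOverhead is hypothetical, and the result is the uncompressed difference only.

```javascript
// Hypothetical estimate of the uncompressed markup cost of switching
// <c- k> to <c- class="k">. Only opening tags differ between the two forms.
function classMarkupOverhead(elementCount, token = "k") {
  const attributeForm = `<c- ${token}>`;
  const classForm = `<c- class="${token}">`;
  // Both forms are ASCII, so string length equals byte length here.
  const perElement = classForm.length - attributeForm.length;
  return { perElement, total: perElement * elementCount };
}

console.log(classMarkupOverhead(50000));
```

That works out to 8 extra bytes per element, or about 400 KB over 50,000 elements before compression. Since the added text is the same short string repeated, the over-the-wire difference after compression is likely to be smaller, though that is not measured here.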

whatwg / whatwg.org

Can spec stylesheets be faster? #220