nolanlawson / optimize-js

Optimize a JS file for faster parsing (UNMAINTAINED)
https://nolanlawson.github.io/optimize-js
Apache License 2.0

Only handle top-level expressions #6

Open Rich-Harris opened 8 years ago

Rich-Harris commented 8 years ago

This is more for discussion than anything – I couldn't spend long on it so no tests yet, sorry!

@stefanpenner has been doing some research on this topic and found that the lazy parse heuristic only applies at the top level. I wondered if we could exploit that here and avoid having to walk the entire AST. As a bonus, we'd save a few bytes on parens that would (in theory) have no effect.
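For context, this is the kind of paren hint we're talking about (roughly what optimize-js already emits for an IIFE):

    // before: the engine sees `function`, assumes the body can be parsed
    // lazily, and then has to go back and parse it properly when the IIFE
    // runs immediately afterwards
    !function () {
      console.log('hello');
    }();

    // after: the leading paren hints that the body is needed right away,
    // so it is parsed eagerly up front
    !(function () {
      console.log('hello');
    })();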

To do so I swapped out walk-ast for estree-walker (full disclosure, it's one of mine) which makes it possible to skip subgraphs. We get a little speed bump just from doing that, because estree-walker doesn't mutate the AST in any way (i.e. adding the parentNode property), but the real boost is from doing less work. (It would also fix #3.)
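Roughly, the estree-walker side looks like this. It's only a sketch (the function and callback names are made up for illustration, not the exact code in this diff), but it shows the this.skip() pruning:

    var walk = require('estree-walker').walk;

    // `ast` is an ESTree AST (e.g. from acorn); `onFunctionNode` is whatever
    // decides whether a given function should get the paren hint.
    function walkTopLevelOnly(ast, onFunctionNode) {
      walk(ast, {
        enter: function (node, parent) {
          if (/Function/.test(node.type)) {
            onFunctionNode(node, parent);
            // nothing inside a function body can be a top-level expression,
            // so if the heuristic really is top-level only we can prune the
            // walk here instead of visiting the whole subtree
            this.skip();
          }
        }
      });
    }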

Example results:

➜  optimize-js git:(master) time lib/bin.js benchmarks/ember.js > /dev/null
lib/bin.js benchmarks/ember.js > /dev/null  0.65s user 0.08s system 106% cpu 0.689 total
➜  optimize-js git:(master) git checkout speedup
Switched to branch 'speedup'
Your branch is up-to-date with 'origin/speedup'.
➜  optimize-js git:(speedup) time lib/bin.js benchmarks/ember.js > /dev/null
lib/bin.js benchmarks/ember.js > /dev/null  0.47s user 0.05s system 117% cpu 0.437 total

So optimisation is ~25% quicker in Ember's case with this PR.

I ran the benchmarks on Chrome, and the improvement went from 55.75% to 53.62%. So it looks like there's a slight drop in the parse-time improvement, but I don't know if it's significant. More investigation needed, probably!

nolanlawson commented 8 years ago

Awesome, thanks! My hunch is that this PR is great; no preference for walk-ast or estree-walker on my part (side note: does it fix #3?). The only things are:

  1. We need to bench more, obviously. The benchmark itself has a margin of error (up to 5% or so), so it needs to be run a couple of times to be sure (or maybe I need to up the median in the bench).
  2. We need to bench in more than one browser; assumptions based on V8 do not necessarily apply to Chakra, SpiderMonkey, or JSC.
nolanlawson commented 8 years ago

So optimisation is ~25% quicker in Ember's case with this PR.

Just re-read this, you mean optimize-js itself is 25% faster? TBH I care less about the amount of time it saves for the developer than the time it saves for the user... If we lose ~2% perf then should we really only manipulate at the top level? Unless there's another example of a library that causes perverse perf problems with the deep wrapping (in which case it should be added to the bench btw :)).

stefanpenner commented 8 years ago

@stefanpenner has been doing some research on this topic and found that the lazy parse heuristic

The vernacular is nuanced, let me clear it up (or at least to my current understanding): lazyParse, in V8 land, is more of an on-demand parse. So when V8 encounters, during execution, code it has not yet parsed (or maybe needs to parse again), I believe that is a lazyParse.

What I spoke to @Rich-Harris about last night was specifically V8's pre-parse (although I may have mis-spoken), which I believe to be a top-level-only concern. When first parsing a file, V8 appears to try to more quickly "skim" the parts of the JS file it believes are not required for initial evaluation.
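A tiny illustration of what I mean (again, only to my current understanding, so the terminology may still be off):

    function helper() {
      // nothing hints that this is needed right away, so on the first pass
      // V8 just "skims" this body (pre-parse) and only fully parses /
      // compiles it later, on demand, the first time helper() is called
    }

    var result = (function () {
      // the leading paren hints that this body is needed immediately,
      // so it is parsed eagerly instead of being skimmed
      return helper();
    })();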

krisselden commented 8 years ago

@stefanpenner is this newer than what I did? Last time I looked there were two kinds of lazy parsing. Yes, one of them is only at the top level and pretty much just uses the leading paren; but there is also laziness at deeper levels, which has more heuristics (though it still uses the leading paren as an initial hint) and is more about whether to compile the function.

stefanpenner commented 8 years ago

@krisselden nope nothing new.

stefanpenner commented 8 years ago

So lazyParse and preParse interact with each other. If a code segment does not match the heuristics that imply it is required immediately, it will be quickly skimmed via the preParse phase. Then later, when it is actually encountered via execution, it will be lazy-parsed + lazy-compiled.

I am most likely still getting the various names incorrect. But I believe what I described to be more or less accurate.

But we should likely pull in someone like @bmeurer who actually knows what he is talking about.

Rich-Harris commented 8 years ago

Just re-read this, you mean optimize-js itself is 25% faster?

Yep. Totally agree with your priorities – this PR was predicated on the heuristic being top-level only. If that's not the case then this.skip() shouldn't be used. (It'll still be a little bit quicker, just not as much.)

Re #3 – yeah, just checked, it fixes it. estree-walker doesn't know anything about node types, so it's resistant to changes in the language (e.g. async/await/whatever's in ES2019/20/21).
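(If it helps: the reason it's resistant is that a type-agnostic walker just recurses into anything node-shaped. Very roughly, and simplified to the point of leaving out skip/leave handling, it behaves like this sketch rather than consulting a list of known node types:)

    function walkNode(node, enter) {
      enter(node);
      for (var key in node) {
        var value = node[key];
        if (Array.isArray(value)) {
          value.forEach(function (child) {
            // anything with a string `type` is treated as a child node,
            // so brand-new syntax just falls out for free
            if (child && typeof child.type === 'string') {
              walkNode(child, enter);
            }
          });
        } else if (value && typeof value.type === 'string') {
          walkNode(value, enter);
        }
      }
    }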

bmeurer commented 8 years ago

I'd suggest pinging @verwaest about this; he's clearly the expert in this particular V8 area currently.

krisselden commented 8 years ago

@Rich-Harris @nolanlawson The left paren is still a heuristic used for lazy compile. I'm not sure it is as slow as the lazy parse, but it could make a difference. https://cs.chromium.org/chromium/src/v8/src/parsing/parser-base.h?sq=package:chromium&rcl=1474312969&l=3138

I'm pretty sure that the biggest win is avoiding the fast function-body skip at the top level for something that is needed eagerly (though that skip is the biggest win for a define() that isn't used).

krisselden commented 8 years ago

More detail in this comment: https://cs.chromium.org/chromium/src/v8/src/parsing/parser.cc?q=eagercompile&sq=package:chromium&dr=C&l=2880

krisselden commented 8 years ago

To clarify more: the left paren is a hint to parse eagerly, but it doesn't mean it will be compiled eagerly.

The top level uses the preparser https://cs.chromium.org/chromium/src/v8/src/parsing/preparser.cc?q=LPAREN&sq=package:chromium&dr=C&l=336

nolanlawson commented 8 years ago

Hey folks, I'm a bit swamped this week due to TPAC, but just some quick thoughts:

verwaest-zz commented 8 years ago

We're working on fixing this performance difference and additionally making the top-level heuristic mostly unnecessary, and the nested heuristic completely unnecessary.

What's going on here? V8 has a preparser and a parser. The former is roughly 2x the speed of the latter. Unfortunately the preparser doesn't at this time support scope resolution, so we can only really use the preparser when parsing top-level functions. When we fully parse anything other than the top-level script, we need to fully parse its nested functions. That's 2x slower than preparsing would be. Ouch. Additionally, we don't keep info around about functions we have ever parsed or preparsed. Hence whenever we parse a nested function, we have to fully parse deeper nested functions again. O(n) ouch!
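In concrete terms (illustrative only):

    // On the first pass over the script, `outer` is only preparsed (skimmed).
    // When outer() is later parsed for real, `inner` and `deepest` must be
    // fully parsed as well, because the preparser can't resolve scopes.
    // And since no info is kept about functions we've already seen, parsing
    // `inner` again later means fully parsing `deepest` yet again.
    function outer() {
      function inner() {
        function deepest() {}
      }
    }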

We're working on fixing both issues, which will take a bit of time however.

Once we're done, we can skip over nested functions essentially for free, so an eager compile heuristic at that level makes little sense. No heuristic means we can't be wrong, yay. At the top-level we might still want it, but it's less relevant, especially since all nested functions will be skipped anyway.

I hope this sheds some light on the issues and what we're planning on doing about it.