Out of curiosity, how did you conclude that these described parts were the bottlenecks? Just runtime complexity analysis? Or was a profiling tool used?
@mrjoelkemp
Basically been a combo of:
Also, you can set the DEBUG environment variable to babel and get a bunch of debugging output about how long parsing, transforming, and generating take, which is where I've noticed some of the hot spots.
$ DEBUG=babel babel script.js
It might be worthwhile to check in the benchmark program and run it after major features to make sure we're not regressing, and to have something to point to when doing performance work. I don't have experience with automated benchmarks. Maybe someone from @lodash or other perf-focused projects can help.
Any advice/references @jdalton? Performance regression tests would actually be amazing.
For me the biggest problem is Babel's startup time. Perhaps this isn't a problem for people who use JavaScript build tools like Gulp, but for the rest of us, it is. babel --version takes about 530ms for me; for comparison, it used to be under 200ms for 6to5.
You can shave off about 100-150ms (I don't remember exactly, sorry) by using browserify --bare --ignore-missing to bundle all of the dependencies into a single file. I don't know where the rest is coming from.
Are you using an npm release or a locally linked copy? npm releases are going to be far quicker since the internal templates will be precompiled.
@sebmck npm. Here's babel's startup time over versions (run several times):
time /usr/local/bin/iojs node_modules/.bin/6to5 --version
1.15.0
/usr/local/bin/iojs node_modules/.bin/6to5 --version 0.22s user 0.01s system 104% cpu 0.217 total
time /usr/local/bin/iojs node_modules/.bin/6to5 --version
2.13.7
/usr/local/bin/iojs node_modules/.bin/6to5 --version 0.28s user 0.02s system 104% cpu 0.284 total
time /usr/local/bin/iojs node_modules/.bin/6to5 --version
3.6.5
/usr/local/bin/iojs node_modules/.bin/6to5 --version 0.34s user 0.02s system 106% cpu 0.333 total
time /usr/local/bin/iojs node_modules/.bin/babel --version
4.7.16
/usr/local/bin/iojs node_modules/.bin/babel --version 0.36s user 0.04s system 106% cpu 0.373 total
time /usr/local/bin/iojs node_modules/.bin/babel --version
5.2.17
/usr/local/bin/iojs node_modules/.bin/babel --version 0.43s user 0.04s system 105% cpu 0.440 total
The 530ms figure is for a virtual machine; not sure why it's slower, but whatever. Anyway, we've doubled our startup time since the early days. Here's the time bundled:
time /usr/local/bin/iojs ./babel-bundle.js --version
5.2.17
/usr/local/bin/iojs ./babel-bundle.js --version 0.30s user 0.04s system 102% cpu 0.326 total
One idea I've had is making a persistent Babel "server" and writing a CLI application with the same interface that contacts the server, similar to Nailgun for Java. It would also let us take advantage of the JIT, since any optimization is currently thrown away on each invocation.
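Roughly what I have in mind (just a sketch; the socket path, the one-file-per-connection protocol, and error handling are all hand-waved):

```js
// Sketch only: a long-lived process keeps babel-core warm in memory and a thin
// CLI talks to it over a local socket. The socket path and "whole file in,
// whole file out" protocol are made up for illustration.
var net = require("net");
var fs = require("fs");
var babel = require("babel-core");

if (process.argv[2] === "--serve") {
  // server: started once, reused for every subsequent compile
  net.createServer({ allowHalfOpen: true }, function (socket) {
    var source = "";
    socket.on("data", function (chunk) { source += chunk; });
    socket.on("end", function () {
      socket.end(babel.transform(source).code);
    });
  }).listen("/tmp/babel.sock");
} else {
  // client: what the fast CLI would do per invocation
  var client = net.connect("/tmp/babel.sock", function () {
    fs.createReadStream(process.argv[2]).pipe(client);
  });
  client.pipe(process.stdout);
}
```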
It probably has a lot to do with the additional lines of code/dependencies, and the fact that a lot of stuff is done at runtime (visitor normalisation etc.).
One idea I've had is making a persistent Babel "server" and writing a CLI application with the same interface that contacts the server.
I believe @thejameskyle had some thoughts on this. It's something TypeScript does and it gives IDEs access to internal TypeScript information to improve integration. Not sure how relevant it'd be for Babel but it's possibly something worth looking into.
We already do that in the React Packager (currently hidden in React Native, but we have plans to release it as a standalone thing). And I think webpack does it for you when you use the dev server? If not, I think webpack might be a good place to add that.
@amasad Yeah, plenty of JS build tools do this already. It would be nice if our CLI application did too, for those of us who don't use a JS tool as the top-level driver in the build. Maybe this isn't so common in the JS world and I'm the only person who has this problem?
So after I merged #1472, compiling the Ember source went from 50s to 44s and Traceur went from 30s to 24s.
After commit f657598c72f5296895d2282b6bb4bd36713a7d42, Ember now compiles in 35s and Traceur in 19s. It was a relatively small change with a huge improvement; hopefully there are places where more of the same optimisations can be done. Any help in finding them is much appreciated!
Performance regression tests would actually be amazing.
A simple way to do it is to have babel registered as an npm dep. Then just run benchmarks on master and the previous version and throw errors when the speed is x% slower. This can be done pretty easily with benchmark.js.
If anyone wants to wire up some tests under test/benchmark using benchmark.js, I'd be happy to modify them to test against previous/multiple versions.
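For example, something along these lines (the module specifiers, fixture file, and 10% threshold here are placeholders, not an agreed setup):

```js
// Sketch of a regression check with benchmark.js. "babel-prev" stands for the
// last release installed under an npm alias; the fixture is any large-ish file.
var fs = require("fs");
var Benchmark = require("benchmark");
var current = require("../../");        // the local build under test (placeholder path)
var previous = require("babel-prev");   // the previous release (placeholder alias)

var source = fs.readFileSync(__dirname + "/fixtures/large-file.js", "utf8");

new Benchmark.Suite()
  .add("current", function () { current.transform(source); })
  .add("previous", function () { previous.transform(source); })
  .on("cycle", function (event) { console.log(String(event.target)); })
  .on("complete", function () {
    var cur = this[0], prev = this[1]; // suites are array-like
    if (cur.hz < prev.hz * 0.9) {
      throw new Error("Perf regression: " + cur.hz.toFixed(2) + " vs " +
                      prev.hz.toFixed(2) + " ops/sec");
    }
  })
  .run();
```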
@megawac The issue is that simple benchmarks are pretty useless. It depends completely on feature use and AST structure so getting realistic and practical benchmarks is extremely difficult.
The issue is that simple benchmarks are pretty useless.
Sure, but it simplifies the process of detecting regressions in certain operations such as parsing, transformation, regeneration, etc. Further, it makes retesting the perf effects of a changeset easier than profiling manually. Of course, changing how any operation works may change performance; this just makes it easier to determine by how much.
Large codebases are the only really productive type of benchmarking that you can do (at least that I've found, happy to be proven wrong). Even though the perf work I've done has increased performance by ~40%, I've noticed no change in the time it takes to run the Babel tests, for example.
The benchmark could simply compare the compile time of babel, regenerator, bluebird and lodash (to pick 3 largish random examples that are already dependencies) with the current master as @megawac said. It doesn't matter that the other 3 libraries are not using es6 features because babel still has to do almost the same amount of work regardless, and babel uses enough babel features itself to make the measurements meaningful.
@phpnode
It doesn't matter that the other 3 libraries are not using es6 features because babel still has to do almost the same amount of work regardless
Nope. It has to do significantly more work when certain ES6 features are used. In fact sometimes additional passes of the entire AST are done when you use a specific feature.
@sebmck ok sure, but this is just a starting point. As time goes on larger babel powered codebases will become available, and babel itself will presumably start using more and more of its own features now that it is self hosted.
src/babel/transformation/transformers/spec/proto-to-assign.js: Parse start +8ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Parse stop +3ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Start set AST +0ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Start scope building +0ms
src/babel/transformation/transformers/spec/proto-to-assign.js: End scope building +6ms
src/babel/transformation/transformers/spec/proto-to-assign.js: End set AST +2ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Start module formatter init +0ms
src/babel/transformation/transformers/spec/proto-to-assign.js: End module formatter init +2ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Start transformer builtin-setup +0ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Finish transformer builtin-setup +1ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Start transformer builtin-basic +0ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Finish transformer builtin-basic +2ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Start transformer builtin-advanced +0ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Finish transformer builtin-advanced +1ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Start transformer regenerator +0ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Finish transformer regenerator +31ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Start transformer builtin-modules +0ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Finish transformer builtin-modules +6ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Start transformer builtin-trailing +0ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Finish transformer builtin-trailing +2ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Generation start +0ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Generation end +3ms
I really don't want to have to gut regenerator :cry: @benjamn any ideas/suggestions?
@phpnode if you want to contribute an automated benchmark that'd be sweet. @sebmck has been using Traceur's codebase and it seems to give a good enough signal.
I don't have enough context here. Is this just AST transform + traversal time? If you're doing a single pass now, can you feed just generator functions into Regenerator?
Scope tracking is used pretty sparingly in regenerator, so I suspect it might be feasible to switch to babel's traversal and scope logic, if that seems best.
@sebmck If we don't have a chance to catch up before JSConf, let's definitely talk then. I have some ideas about avoiding multiple full-AST traversals that might help across the board.
@sebmck Is it still the case that the bottleneck is the compatibility layer with ast-types? Just talked to @benjamn offline and it seems like the best thing to do is to adapt regenerator to work with babel's AST.
@benjamn
I don't have enough context here. Is this just AST transform + traversal time? If you're doing a single pass now, can you feed just generator functions into Regenerator?
It's basically just the time for regenerator.transform. Unsure what the overhead is since there aren't any generators in any of these files. Will try to dig in more.
If we don't have a chance to catch up before JSConf, let's definitely talk then. I have some ideas about avoiding multiple full-AST traversals that might help across the board.
Absolutely.
@amasad
Is it the still the case that the bottleneck is the compatibility layer with AST-types?
No idea. There's no real compatibility layer besides just patch.js which could probably just be sent as a patch upstream.
And what about a parallel compilation option? :)
Why not utilize all available cores? Maybe it sounds silly, but it works :)
I've just made an experiment using the Node cluster module and forked n workers to do the work in parallel.
I achieved a performance increase from roughly 18s down to roughly 7s on my MacBook Air when running in parallel on a roughly 1MB codebase.
I agree that this is kind of a hack, but...
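The experiment was roughly shaped like this (simplified sketch; the file distribution, output naming, and use of babel-core's transformFileSync are illustrative assumptions, not how the Babel CLI works):

```js
// Master forks one worker per core and hands each an even slice of the file
// list; each worker compiles its slice independently.
var cluster = require("cluster");
var os = require("os");
var fs = require("fs");

var files = process.argv.slice(2); // e.g. a glob expanded by the shell

if (cluster.isMaster) {
  var numWorkers = Math.min(os.cpus().length, files.length);
  for (var i = 0; i < numWorkers; i++) {
    // give each worker every numWorkers-th file
    var slice = files.filter(function (_, idx) { return idx % numWorkers === i; });
    cluster.fork().send(slice);
  }
} else {
  var babel = require("babel-core");
  process.on("message", function (assigned) {
    assigned.forEach(function (file) {
      var out = babel.transformFileSync(file).code;
      fs.writeFileSync(file.replace(/\.js$/, ".compiled.js"), out);
    });
    process.exit(0);
  });
}
```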
I also do that, but it's still slow for large codebases. The fact that JavaScript runtimes don't have shared memory makes it worse.
Following f3f60368da768444f1d351844970137eadb27691, Traceur core now compiles in 14s (from 17s) and Ember core in 28s (from 30s). Just as a frame of reference, in 5.2, Ember took 50s and Traceur took 30s. :nail_care: :sparkles:
:open_mouth: awesome!
@sebmck How much time is spent parsing relative to the total time? The reason I ask is that I started working on a project to update an existing AST based on edit operations like insert "a" at line 5, column 10 and delete character at line 20, column 6. This would help with the use case of people editing large files using a watcher. Unfortunately, it would require some sort of integration with editors.
@kevinb7 Parsing time is insignificant and relatively easy to optimise compared to everything else.
Following ba19bd36a4f3fc75dca5461dd92ab83ea0d4f863, the scope tracking has been optimised into a single traversal (instead of multiple). Ember core now compiles in 24s (from 28s).
Following #1753 and #1752 (thanks @loganfsmyth!) and some other previous performance patches since the last update, Ember core now compiles in 18s and Traceur in 10s. :sparkles:
I tried to play with parsing performance for big codebases, and while I could improve it by ~25%, the actual difference in seconds is pretty low, so I'm not sure whether it will affect the overall result. https://twitter.com/RReverser/status/617334262086410240
@RReverser it might add up. I'm not sure how much time is spent on parsing on our codebase now, but we can probably find out. Are there any trade offs to checking in your optimization?
cc @DmitrySoshnikov
I'm currently somewhat blocked by #1920 for this commit :(
In any case, I just optimized the case of large codebases (in the sense of file count) like MaterialUI (which has 980 files). And, as you can see from the tweet screenshot, the difference is not that huge.
I'll let you know as soon as I get the issue fixed and this thing committed.
@michaelBenin If I had to guess, that's more likely to be https://github.com/babel/babel/issues/626. Transpilation speeds are actually pretty good these days, and gulpfiles are generally pretty small.
Comment originally made by @thejameskyle on 2015-11-20T19:34:13.000Z
@sebmck how do you want to handle this? Do you want to clean this issue up a bit with the latest, close it in favor of separate tickets, or close it as something Babel will just always be working on?
Closing this, as a lot has happened since the last comment. If there are still areas that should be looked at we should create separate issues for them.
Babel could be much faster! Here are some possible areas to look into:
Check list:
es6.tailCall
es6.blockScoping
es6.objectSuper
es6.classes
es6.constants
_shadowFunctions
Code generator
While this isn't a massive issue, and usually doesn't impact most projects, the code generator's performance is pretty poor on large outputs. This is partly due to the copious amounts of lookbehinds and attempts to make the code as "pretty" as possible. There is a lot of room for improvement, and this is an area where micro-optimisations pay off at a huge scale. The relevant folder is here.
Grouping more transformers
#1472 brought the ability to merge multiple transformers into the same AST traversal. This impacts the way internal transformers are written, but after a bit of getting used to it I think I actually prefer this way. Regardless, transformers are now split into 6 groups. The internal transformers can be viewed here.
Reducing these groups may be complicated due to the various concerns that each one may have, e.g. es3.memberExpressionLiterals needs to be run on the entire tree; it needs to visit every single MemberExpression even if they've been modified or dynamically inserted.
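To illustrate the merged-traversal idea (this isn't the internal transformer API, just the concept with plain visitor objects keyed by node type):

```js
// Two transforms expressed as plain visitor objects, merged so that a single
// walk over the tree dispatches to both instead of walking once per transform.
function mergeVisitors(visitors) {
  var merged = {};
  visitors.forEach(function (visitor) {
    Object.keys(visitor).forEach(function (type) {
      (merged[type] = merged[type] || []).push(visitor[type]);
    });
  });
  return merged;
}

function traverse(node, merged) {
  if (!node || typeof node.type !== "string") return;
  (merged[node.type] || []).forEach(function (handler) { handler(node); });
  Object.keys(node).forEach(function (key) {
    var child = node[key];
    if (Array.isArray(child)) {
      child.forEach(function (c) { traverse(c, merged); });
    } else if (child && typeof child.type === "string") {
      traverse(child, merged);
    }
  });
}

// e.g. two "transformers" sharing one pass instead of one full traversal each:
// traverse(ast, mergeVisitors([
//   { MemberExpression: function (node) { /* es3.memberExpressionLiterals */ } },
//   { Identifier:       function (node) { /* some other transformer */ } },
// ]));
```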
Optimising existing transformers
Some transformers spawn "subtraversals". This is problematic as it significantly negates the intention of minimising traversals. For example, previously the es6.constants transformer would visit every single node that has its "own" scope, then spawn another "subtraversal" that checks all the child nodes for reassignments, which means a lot of unnecessary visiting. Instead, with a75af0a5d20bdba5b93e3bba10529f0bd982810a, the transformer traversal is used and a scope binding lookup is done. This technique (not that ingenious, since the previous way was crap) could be used on the es6.blockScoping and _shadowFunctions (this does the arrow function this and arguments aliasing) transformers.
Optimising scope tracking
Similar to optimising existing transformers, the current scope tracking does multiple passes and has room for optimisation. This could all be done in a single pass: when hitting a binding, look up the tree for the nearest enclosing scope to attach it to.
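Roughly the shape of a single-pass approach (illustration only, using a scope stack and a deliberately simplified notion of which nodes open a scope; not how the actual scope tracker is structured):

```js
// One walk, a stack of open scopes, and each binding registered against the
// innermost scope as it is encountered.
function buildScopes(ast) {
  var root = { parent: null, bindings: {} };
  var stack = [root];

  function visit(node) {
    if (!node || typeof node.type !== "string") return;

    var opensScope = node.type === "FunctionDeclaration" ||
                     node.type === "FunctionExpression" ||
                     node.type === "BlockStatement";
    if (opensScope) stack.push({ parent: stack[stack.length - 1], bindings: {} });

    // register bindings with the current scope in the same pass
    if (node.type === "VariableDeclarator" && node.id && node.id.type === "Identifier") {
      stack[stack.length - 1].bindings[node.id.name] = node;
    }

    Object.keys(node).forEach(function (key) {
      var child = node[key];
      if (Array.isArray(child)) child.forEach(visit);
      else if (child && typeof child.type === "string") visit(child);
    });

    if (opensScope) stack.pop();
  }

  visit(ast);
  return root;
}
```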
Attach comments in parser
Currently estraverse is used to attach comments. This isn't great; it means that an entire traversal is required just to attach comments. This could be moved into the parser, similar to espree, which is used in ESLint.
Include comments in token stream
Currently tokens and comments are concatenated together and sorted. This is so newlines can be retained between nodes. This is relatively inefficient since you're sorting a large array of possibly millions of elements.
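Since both lists are already ordered by source position, a linear merge would avoid the sort entirely, for example (assuming each entry carries a start offset):

```js
// Linear merge of two position-ordered arrays: O(n + m) instead of the
// O((n + m) log(n + m)) concat-then-sort approach.
function mergeByStart(tokens, comments) {
  var out = [];
  var i = 0, j = 0;
  while (i < tokens.length && j < comments.length) {
    out.push(tokens[i].start <= comments[j].start ? tokens[i++] : comments[j++]);
  }
  while (i < tokens.length) out.push(tokens[i++]);
  while (j < comments.length) out.push(comments[j++]);
  return out;
}
```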
Address regenerator
Regenerator uses ast-types, which means it has its own set of scope tracking and traversal logic. It's slower than Babel's and is relatively heavy, especially on large node trees. There has been a lot of controversy about merging it into Babel and iterating on it from there. Nonetheless, it's a solution that needs to be considered if all other avenues are unavailable.
I welcome contributions of any kind, so any help is extremely appreciated!
cc @amasad @DmitrySoshnikov @gaearon @stefanpenner @babel/contributors