Out of curiosity, how did you conclude that these described parts were the bottlenecks? Just runtime complexity analysis? Or was a profiling tool used?
@mrjoelkemp
Basically been a combo of:
Also, you can set the DEBUG environment variable to babel and get a bunch of debugging output about how long parsing, transforming, and generating take, which is where I've noticed some of the hot spots.
$ DEBUG=babel babel script.js
It might be worthwhile to check in the benchmark program and run it after major features to make sure we're not regressing, and to have something to point to when doing performance work. I don't have experience with automated benchmarks. Maybe someone from @lodash or other perf-focused projects can help.
Any advice/references @jdalton? Performance regression tests would actually be amazing.
For me the biggest problem is Babel's startup time. Perhaps this isn't a problem for people who use JavaScript build tools like Gulp, but for the rest of us, it is. babel --version takes about 530ms for me; for comparison, it used to be under 200ms for 6to5.
You can shave off about 100-150ms (I don't remember exactly, sorry) by using browserify --bare --ignore-missing to bundle all of the dependencies into a single file. I don't know where the rest is coming from.
Are you using an npm release or a locally linked copy? npm releases are going to be far quicker since the internal templates will be precompiled.
@sebmck npm. Here's babel's startup time over versions (run several times):
time /usr/local/bin/iojs node_modules/.bin/6to5 --version
1.15.0
/usr/local/bin/iojs node_modules/.bin/6to5 --version 0.22s user 0.01s system 104% cpu 0.217 total
time /usr/local/bin/iojs node_modules/.bin/6to5 --version
2.13.7
/usr/local/bin/iojs node_modules/.bin/6to5 --version 0.28s user 0.02s system 104% cpu 0.284 total
time /usr/local/bin/iojs node_modules/.bin/6to5 --version
3.6.5
/usr/local/bin/iojs node_modules/.bin/6to5 --version 0.34s user 0.02s system 106% cpu 0.333 total
time /usr/local/bin/iojs node_modules/.bin/babel --version
4.7.16
/usr/local/bin/iojs node_modules/.bin/babel --version 0.36s user 0.04s system 106% cpu 0.373 total
time /usr/local/bin/iojs node_modules/.bin/babel --version
5.2.17
/usr/local/bin/iojs node_modules/.bin/babel --version 0.43s user 0.04s system 105% cpu 0.440 total
The 530ms figure is for a virtual machine; not sure why it's slower, but whatever. Anyway, we've doubled our startup time since the early days. Here's the time bundled:
time /usr/local/bin/iojs ./babel-bundle.js --version
5.2.17
/usr/local/bin/iojs ./babel-bundle.js --version 0.30s user 0.04s system 102% cpu 0.326 total
One idea I've had is making a persistent Babel "server" and writing a CLI application with the same interface that contacts the server, similar to Nailgun for Java. It would also let us take advantage of the JIT, since any optimization is currently thrown away on each invocation.
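Roughly what I have in mind (just a sketch; the socket path, the one-file-per-connection protocol, and error handling are all hand-waved):

```js
// Sketch only: a long-lived process keeps babel-core warm in memory and a thin
// CLI talks to it over a local socket. The socket path and "whole file in,
// whole file out" protocol are made up for illustration.
var net = require("net");
var fs = require("fs");
var babel = require("babel-core");

if (process.argv[2] === "--serve") {
  // server: started once, reused for every subsequent compile
  net.createServer({ allowHalfOpen: true }, function (socket) {
    var source = "";
    socket.on("data", function (chunk) { source += chunk; });
    socket.on("end", function () {
      socket.end(babel.transform(source).code);
    });
  }).listen("/tmp/babel.sock");
} else {
  // client: what the fast CLI would do per invocation
  var client = net.connect("/tmp/babel.sock", function () {
    fs.createReadStream(process.argv[2]).pipe(client);
  });
  client.pipe(process.stdout);
}
```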
It probably has a lot to do with the additional lines of code/dependencies, and the fact that a lot of stuff is done at runtime (visitor normalisation etc.).
One idea I've had is making a persistent Babel "server" and writing a CLI application with the same interface that contacts the server.
I believe @thejameskyle had some thoughts on this. It's something TypeScript does and it gives IDEs access to internal TypeScript information to improve integration. Not sure how relevant it'd be for Babel but it's possibly something worth looking into.
We already do that in the React Packager (currently hidden in React Native, but we have plans to release it as a standalone thing). And I think webpack does it for you when you use the dev server? If not, I think webpack might be a good place to add that.
@amasad Yeah, plenty of JS build tools do this already. It would be nice if our CLI application did too, for those of us who don't use a JS tool as the top-level driver in the build. Maybe this isn't so common in the JS world and I'm the only person who has this problem?
So after I merged #1472, compiling the Ember source went from 50s to 44s and Traceur went from 30s to 24s.
After commit f657598c72f5296895d2282b6bb4bd36713a7d42, Ember now compiles in 35s and Traceur in 19s. It was a relatively small change with a huge improvement; hopefully there are places where more of the same optimisations can be done. Any help in finding them is much appreciated!
Performance regression tests would actually be amazing.
A simple way to do it is to have babel registered as an npm dep. Then just run benchmarks on master and the previous version and throw errors when the speed is x% slower. This can be done pretty easily with benchmark.js.
If anyone wants to wire up some tests under test/benchmark using benchmark.js, I'd be happy to modify them to test against previous/multiple versions.
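For example, something along these lines (the module specifiers, fixture file, and 10% threshold here are placeholders, not an agreed setup):

```js
// Sketch of a regression check with benchmark.js. "babel-prev" stands for the
// last release installed under an npm alias; the fixture is any large-ish file.
var fs = require("fs");
var Benchmark = require("benchmark");
var current = require("../../");        // the local build under test (placeholder path)
var previous = require("babel-prev");   // the previous release (placeholder alias)

var source = fs.readFileSync(__dirname + "/fixtures/large-file.js", "utf8");

new Benchmark.Suite()
  .add("current", function () { current.transform(source); })
  .add("previous", function () { previous.transform(source); })
  .on("cycle", function (event) { console.log(String(event.target)); })
  .on("complete", function () {
    var cur = this[0], prev = this[1]; // suites are array-like
    if (cur.hz < prev.hz * 0.9) {
      throw new Error("Perf regression: " + cur.hz.toFixed(2) + " vs " +
                      prev.hz.toFixed(2) + " ops/sec");
    }
  })
  .run();
```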
@megawac The issue is that simple benchmarks are pretty useless. It depends completely on feature use and AST structure so getting realistic and practical benchmarks is extremely difficult.
The issue is that simple benchmarks are pretty useless.
Sure, but it simplifies the process of detecting regressions in certain operations such as parsing, transformation, regeneration, etc. Further, it makes retesting the perf effects of a changeset easier than profiling manually. Of course, changing how any operation works may change performance; this just makes it easier to determine by how much.
Large codebases are the only really productive type of benchmarking that you can do (at least that I've found, happy to be proven wrong). Even though the perf work I've done has increased performance by ~40%, I've noticed no change in the time it takes to run the Babel tests, for example.
The benchmark could simply compare the compile time of babel, regenerator, bluebird and lodash (to pick 3 largish random examples that are already dependencies) with the current master as @megawac said. It doesn't matter that the other 3 libraries are not using es6 features because babel still has to do almost the same amount of work regardless, and babel uses enough babel features itself to make the measurements meaningful.
@phpnode
It doesn't matter that the other 3 libraries are not using es6 features because babel still has to do almost the same amount of work regardless
Nope. It has to do significantly more work when certain ES6 features are used. In fact sometimes additional passes of the entire AST are done when you use a specific feature.
@sebmck ok sure, but this is just a starting point. As time goes on larger babel powered codebases will become available, and babel itself will presumably start using more and more of its own features now that it is self hosted.
src/babel/transformation/transformers/spec/proto-to-assign.js: Parse start +8ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Parse stop +3ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Start set AST +0ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Start scope building +0ms
src/babel/transformation/transformers/spec/proto-to-assign.js: End scope building +6ms
src/babel/transformation/transformers/spec/proto-to-assign.js: End set AST +2ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Start module formatter init +0ms
src/babel/transformation/transformers/spec/proto-to-assign.js: End module formatter init +2ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Start transformer builtin-setup +0ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Finish transformer builtin-setup +1ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Start transformer builtin-basic +0ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Finish transformer builtin-basic +2ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Start transformer builtin-advanced +0ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Finish transformer builtin-advanced +1ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Start transformer regenerator +0ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Finish transformer regenerator +31ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Start transformer builtin-modules +0ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Finish transformer builtin-modules +6ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Start transformer builtin-trailing +0ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Finish transformer builtin-trailing +2ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Generation start +0ms
src/babel/transformation/transformers/spec/proto-to-assign.js: Generation end +3ms
I really don't want to have to gut regenerator :cry: @benjamn any ideas/suggestions?
@phpnode if you want to contribute an automated benchmark that'd be sweet. @sebmck has been using Traceur's codebase and it seems to give a good enough signal.
I don't have enough context here. Is this just AST transform + traversal time? If you're doing a single pass now, can you feed just generator functions into Regenerator?
Scope tracking is used pretty sparingly in regenerator, so I suspect it might be feasible to switch to babel's traversal and scope logic, if that seems best.
@sebmck If we don't have a chance to catch up before JSConf, let's definitely talk then. I have some ideas about avoiding multiple full-AST traversals that might help across the board.
@sebmck Is it still the case that the bottleneck is the compatibility layer with ast-types? Just talked to @benjamn offline and it seems like the best thing to do is to adapt regenerator to work with babel's AST.
@benjamn
I don't have enough context here. Is this just AST transform + traversal time? If you're doing a single pass now, can you feed just generator functions into Regenerator?
It's basically just the time for regenerator.transform. Unsure what the overhead is since there aren't any generators in any of these files. Will try to dig in more.
If we don't have a chance to catch up before JSConf, let's definitely talk then. I have some ideas about avoiding multiple full-AST traversals that might help across the board.
Absolutely.
@amasad
Is it the still the case that the bottleneck is the compatibility layer with AST-types?
No idea. There's no real compatibility layer besides just patch.js which could probably just be sent as a patch upstream.
And what about a parallel compilation option? :)
Why not utilize all available cores? Maybe it sounds silly, but it works :)
I've just made an experiment using the Node cluster module and forked n workers to do the work in parallel.
I achieved a performance increase from roughly 18s down to roughly 7s on my MacBook Air when running in parallel on a roughly 1MB codebase.
I agree that this is kind of a hack, but...
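The experiment was roughly shaped like this (simplified sketch; the file distribution, output naming, and use of babel-core's transformFileSync are illustrative assumptions, not how the Babel CLI works):

```js
// Master forks one worker per core and hands each an even slice of the file
// list; each worker compiles its slice independently.
var cluster = require("cluster");
var os = require("os");
var fs = require("fs");

var files = process.argv.slice(2); // e.g. a glob expanded by the shell

if (cluster.isMaster) {
  var numWorkers = Math.min(os.cpus().length, files.length);
  for (var i = 0; i < numWorkers; i++) {
    // give each worker every numWorkers-th file
    var slice = files.filter(function (_, idx) { return idx % numWorkers === i; });
    cluster.fork().send(slice);
  }
} else {
  var babel = require("babel-core");
  process.on("message", function (assigned) {
    assigned.forEach(function (file) {
      var out = babel.transformFileSync(file).code;
      fs.writeFileSync(file.replace(/\.js$/, ".compiled.js"), out);
    });
    process.exit(0);
  });
}
```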
I also do that, but it's still slow for large codebases. The fact that JavaScript runtimes don't have shared memory makes it worse.
Following f3f60368da768444f1d351844970137eadb27691, Traceur core now compiles in 14s (from 17s) and Ember core in 28s (from 30s). Just as a frame of reference, in 5.2, Ember took 50s and Traceur took 30s. :nail_care: :sparkles:
:open_mouth: awesome!
@sebmck How much time is spent parsing relative to the total time? The reason I ask is that I started working on a project to update an existing AST based on edit operations like insert "a" at line 5, column 10 and delete character at line 20, column 6. This would help with the use case of people editing large files using a watcher. Unfortunately, it would require some sort of integration with editors.
@kevinb7 Parsing time is insignificant and relatively easy to optimise compared to everything else.
Following ba19bd36a4f3fc75dca5461dd92ab83ea0d4f863, the scope tracking has been optimised into a single traversal (instead of multiple). Ember core now compiles in 24s (from 28s).
Following #1753 and #1752 (thanks @loganfsmyth!) and some other previous performance patches since the last update, Ember core now compiles in 18s and Traceur in 10s. :sparkles:
I tried to play with parsing performance for big codebases, and while I could improve it by ~25%, the actual difference in seconds is pretty low, so I'm not sure whether it will affect the overall result. https://twitter.com/RReverser/status/617334262086410240
@RReverser it might add up. I'm not sure how much time is spent on parsing on our codebase now, but we can probably find out. Are there any trade offs to checking in your optimization?
cc @DmitrySoshnikov
I'm currently somewhat blocked by #1920 for this commit :(
In any case, I just optimized the case of large codebases (in the sense of file count) like MaterialUI (which has 980 files). And, as you can see from the tweet screenshot, the difference is not that huge.
I'll let you know as soon as I get the issue fixed and this thing committed.
@michaelBenin If I had to guess, that's more likely to be https://github.com/babel/babel/issues/626. Transpilation speeds are actually pretty good these days, and gulpfiles are generally pretty small.
Comment originally made by @thejameskyle on 2015-11-20T19:34:13.000Z
@sebmck how do you want to handle this? Do you want to clean this issue up a bit with the latest, close it in favor of separate tickets, or close it as something Babel will just always be working on?
Closing this, as a lot has happened since the last comment. If there are still areas that should be looked at we should create separate issues for them.
Babel could be much faster! Here are some possible areas to look into:
Check list:
es6.tailCall
es6.blockScoping
es6.objectSuper
es6.classes
es6.constants
_shadowFunctions
Code generator
While this isn't a massive issue, and usually doesn't impact most projects, the code generator's performance is pretty poor on large outputs. This is partly due to the copious amounts of lookbehinds and attempts to make the code as "pretty" as possible. There is a lot of room for improvement, and this is an area where micro-optimisations pay off at a huge scale. The relevant folder is here.
Grouping more transformers
#1472 brought the ability to merge multiple transformers into the same AST traversal. This impacts the way internal transformers are written, but after a bit of getting used to it I think I actually prefer this way. Regardless, transformers are now split into 6 groups. The internal transformers can be viewed here.
Reducing these groups may be complicated due to the various concerns that each one may have, e.g. es3.memberExpressionLiterals needs to be run on the entire tree; it needs to visit every single MemberExpression even if they've been modified or dynamically inserted.
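To illustrate the merged-traversal idea (this isn't the internal transformer API, just the concept with plain visitor objects keyed by node type):

```js
// Two transforms expressed as plain visitor objects, merged so that a single
// walk over the tree dispatches to both instead of walking once per transform.
function mergeVisitors(visitors) {
  var merged = {};
  visitors.forEach(function (visitor) {
    Object.keys(visitor).forEach(function (type) {
      (merged[type] = merged[type] || []).push(visitor[type]);
    });
  });
  return merged;
}

function traverse(node, merged) {
  if (!node || typeof node.type !== "string") return;
  (merged[node.type] || []).forEach(function (handler) { handler(node); });
  Object.keys(node).forEach(function (key) {
    var child = node[key];
    if (Array.isArray(child)) {
      child.forEach(function (c) { traverse(c, merged); });
    } else if (child && typeof child.type === "string") {
      traverse(child, merged);
    }
  });
}

// e.g. two "transformers" sharing one pass instead of one full traversal each:
// traverse(ast, mergeVisitors([
//   { MemberExpression: function (node) { /* es3.memberExpressionLiterals */ } },
//   { Identifier:       function (node) { /* some other transformer */ } },
// ]));
```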
Optimising existing transformers
Some transformers spawn "subtraversals". This is problematic as it significantly negates the intention of minimising traversals. For example, previously the es6.constants transformer would visit every single node that has its "own" scope, then spawn another "subtraversal" that checks all the child nodes for reassignments, which means a lot of unnecessary visiting. Instead, with a75af0a5d20bdba5b93e3bba10529f0bd982810a, the transformer traversal is used and a scope binding lookup is done. This technique (not that ingenious, since the previous way was crap) could be used on the es6.blockScoping and _shadowFunctions (this does the arrow function this and arguments aliasing) transformers.
Optimising scope tracking
Similar to optimising existing transformers, the current scope tracking does multiple passes and has room for optimisation. This could all be done in a single pass: when hitting a binding, look up the tree for the nearest enclosing scope to attach it to.
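Roughly the shape of a single-pass approach (illustration only, using a scope stack and a deliberately simplified notion of which nodes open a scope; not how the actual scope tracker is structured):

```js
// One walk, a stack of open scopes, and each binding registered against the
// innermost scope as it is encountered.
function buildScopes(ast) {
  var root = { parent: null, bindings: {} };
  var stack = [root];

  function visit(node) {
    if (!node || typeof node.type !== "string") return;

    var opensScope = node.type === "FunctionDeclaration" ||
                     node.type === "FunctionExpression" ||
                     node.type === "BlockStatement";
    if (opensScope) stack.push({ parent: stack[stack.length - 1], bindings: {} });

    // register bindings with the current scope in the same pass
    if (node.type === "VariableDeclarator" && node.id && node.id.type === "Identifier") {
      stack[stack.length - 1].bindings[node.id.name] = node;
    }

    Object.keys(node).forEach(function (key) {
      var child = node[key];
      if (Array.isArray(child)) child.forEach(visit);
      else if (child && typeof child.type === "string") visit(child);
    });

    if (opensScope) stack.pop();
  }

  visit(ast);
  return root;
}
```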
Attach comments in parser
Currently estraverse is used to attach comments. This isn't great; it means that an entire traversal is required just to attach comments. This could be moved into the parser, similar to espree, which is used in ESLint.
Include comments in token stream
Currently tokens and comments are concatenated together and sorted. This is so newlines can be retained between nodes. This is relatively inefficient since you're sorting a large array of possibly millions of elements.
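Since both lists are already ordered by source position, a linear merge would avoid the sort entirely, for example (assuming each entry carries a start offset):

```js
// Linear merge of two position-ordered arrays: O(n + m) instead of the
// O((n + m) log(n + m)) concat-then-sort approach.
function mergeByStart(tokens, comments) {
  var out = [];
  var i = 0, j = 0;
  while (i < tokens.length && j < comments.length) {
    out.push(tokens[i].start <= comments[j].start ? tokens[i++] : comments[j++]);
  }
  while (i < tokens.length) out.push(tokens[i++]);
  while (j < comments.length) out.push(comments[j++]);
  return out;
}
```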
Address regenerator
Regenerator uses ast-types, which means it has its own set of scope tracking and traversal logic. It's slower than Babel's and is relatively heavy, especially on large node trees. There has been a lot of controversy about merging it into Babel and iterating on it from there. Nonetheless, it's a solution that needs to be considered if all other avenues are unavailable.
I welcome contributions of any kind, so any help is extremely appreciated!
cc @amasad @DmitrySoshnikov @gaearon @stefanpenner @babel/contributors