Open Rich-Harris opened 7 years ago
Just curious - which version of node was used to produce those timings?
My results are very different with the latest butternut (27639ce506848565c3bf42d061aed42519c2f425):
node 6.9.0:
$ /usr/bin/time node690 bin/squash test/fixture/input/preact.js | wc -c
0.22 real 0.23 user 0.02 sys
8151
$ /usr/bin/time node690 node_modules/uglify-es/bin/uglifyjs -cm -- test/fixture/input/preact.js | wc -c
0.46 real 0.54 user 0.03 sys
8075
$ /usr/bin/time node690 node_modules/uglify-es/bin/uglifyjs -m -- test/fixture/input/preact.js | wc -c
0.26 real 0.28 user 0.02 sys
8457
node 7.7.3:
$ /usr/bin/time node773 bin/squash test/fixture/input/preact.js | wc -c
0.19 real 0.21 user 0.02 sys
8151
$ /usr/bin/time node773 node_modules/uglify-es/bin/uglifyjs -cm -- test/fixture/input/preact.js | wc -c
0.46 real 0.51 user 0.02 sys
8075
$ /usr/bin/time node773 node_modules/uglify-es/bin/uglifyjs -m -- test/fixture/input/preact.js | wc -c
0.25 real 0.26 user 0.02 sys
8457
7.8.0. This is using Benchmark.js which is probably somewhat misleading as everything gets warmed up, whereas minifiers usually run cold. I intend to replace the current benchmarks with more realistic ones that run each minifier once in a fresh process.
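A rough sketch of what "run each minifier once in a fresh process" could look like in Node. `timeCold` is a made-up helper name, not something from butternut or Benchmark.js, and it measures wall-clock time including process startup, which is the point of a cold run.

```javascript
// Sketch only: time a single cold run of a command in a fresh child process.
const { spawnSync } = require('child_process');

// timeCold is a hypothetical helper, not part of butternut or Benchmark.js.
function timeCold(command, args, input) {
  const start = process.hrtime.bigint();
  // A new process per call, so nothing is warmed up by earlier runs.
  const result = spawnSync(command, args, { input, maxBuffer: 64 * 1024 * 1024 });
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;

  if (result.error) throw result.error;
  if (result.status !== 0) {
    throw new Error(`${command} exited with status ${result.status}`);
  }
  // stdout is a Buffer, so length is the output size in bytes (like `wc -c`).
  return { elapsedMs, bytes: result.stdout.length };
}
```

Something like `timeCold(process.execPath, ['bin/squash', 'test/fixture/input/preact.js'])` would then correspond to the `/usr/bin/time node ... | wc -c` commands above, assuming it is run from the repository root.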
My timings from the command line are pretty cold. :-)
Here are the preact numbers from the previous bench in https://github.com/Rich-Harris/butternut/pull/44
preact.js (20.5 kB) without sourcemap:
✓ babili : 8.41 kB / 3.47 kB in 377ms
✓ butternut : 8.15 kB / 3.4 kB in 29ms
✓ closure : 7.89 kB / 3.35 kB in 2.4s
✓ uglify : 8.07 kB / 3.35 kB in 119ms
✓ uglify-mangle-only : 8.46 kB / 3.38 kB in 31ms
✓ uglify-es : 8.07 kB / 3.35 kB in 144ms
Edit: same machine used for timings seen in https://github.com/Rich-Harris/butternut/issues/110#issuecomment-301896527
@Rich-Harris You'll get a kick out of this:
Running the Closure-produced three.min.js through Uglify saves an additional 1744 bytes uncompressed, 2860 gzipped:
https://github.com/mrdoob/three.js/issues/11003
Props to @mishoo on the Uglify mangling algorithm.
That's wild! Is that mostly due to the mangling, or is it hard to know?
Hmm. I thought it was primarily due to mangling, but that's not the case.
Rather than building it this time, I grabbed the file from a CDN:
original size, not gzipped:
$ wc -c three.min.js
510005 three.min.js
original gzipped:
$ cat three.min.js | gzip | wc -c
129119
uglified with compress=false, mangle=false, gzipped:
$ cat three.min.js | bin/uglifyjs | gzip | wc -c
127298
Very strange! 1821 bytes were saved without compress and without mangle - just eliminating whitespace and perhaps making numbers more compact? That doesn't make much sense - unless there's a ton of copyright comments that were stripped.
Let's call 127298 the baseline.
uglified with compress=false, mangle=true, gzipped:
$ cat three.min.js | bin/uglifyjs -m | gzip | wc -c
126549
mangle savings: 749 bytes from the baseline
uglified with compress=true, mangle=false, gzipped:
$ cat three.min.js | bin/uglifyjs -c | gzip | wc -c
127243
compress savings: 55 bytes from the baseline
uglified with compress=true, mangle=true, gzipped:
$ cat three.min.js | bin/uglifyjs -cm | gzip | wc -c
126369
compress+mangle savings: 929 bytes from the baseline
So mangle did have some effect, but not to the extent I thought.
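For anyone who wants to redo the arithmetic, here is a small sketch that recomputes the savings from the gzipped sizes quoted above. The object keys and the `savingsFromBaseline` name are just labels chosen for illustration.

```javascript
// Gzipped sizes in bytes, copied from the measurements in this comment.
const sizes = {
  original: 129119, // three.min.js from the CDN
  baseline: 127298, // uglifyjs, no compress, no mangle
  mangle: 126549,   // uglifyjs -m
  compress: 127243, // uglifyjs -c
  both: 126369,     // uglifyjs -cm
};

// Bytes saved relative to the baseline for every other configuration.
// A negative number means that configuration is larger than the baseline.
function savingsFromBaseline(sizes) {
  const out = {};
  for (const [name, bytes] of Object.entries(sizes)) {
    if (name !== 'baseline') out[name] = sizes.baseline - bytes;
  }
  return out;
}

const savings = savingsFromBaseline(sizes);
console.log(savings);
// { original: -1821, mangle: 749, compress: 55, both: 929 }
```

One thing the numbers show: mangle and compress separately add up to 804 bytes, while the combined run saves 929, so the two passes together save 125 bytes more than the sum of the parts.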
I was wondering how uglify-es was able to compete with Butternut by only mangling variable names. For example, with Preact: Butternut's unzipped output is closer to that of Uglify with default settings than with just mangling, yet in this case Uglify's mangle-only zipped output is actually smaller than Butternut's (normally it's the other way around, but it's often close).
Turns out that Uglify is doing something fiendishly clever: it's computing the frequency of characters that end up in the output, and assigning the most common ones first. So instead of a, b, c etc. you might have o, a, n and so on.
We could possibly do something similar. Since Butternut isn't 'generating' code, perhaps the way to do it would be to replace variable names with some cryptic Unicode (e.g. instead of a, b, c we do ⊂0⊃, ⊂1⊃, ⊂2⊃) then at the end of the process tally up all the characters that aren't enclosed in ⊂...⊃ and replace the variable names with a regex.
Obviously we'd need to ensure we were using characters that weren't in the code in the first place.
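To make the placeholder idea concrete, here is a minimal sketch. It is not how Uglify or Butternut actually implement anything; `assignByFrequency`, the candidate alphabet, and the choice to give the most-used placeholder the most frequent character are all assumptions made for illustration. It only handles single-character names and does not check for collisions with names that were left unmangled.

```javascript
// Sketch only: swap ⊂N⊃ placeholders for short names, preferring characters
// that are already frequent in the rest of the output.
const PLACEHOLDER = /⊂(\d+)⊃/g;
// Assumed candidate alphabet: single characters that are valid identifiers.
const CANDIDATES = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ$_';

function assignByFrequency(code) {
  // Count how often each placeholder id is used.
  const uses = new Map();
  for (const match of code.matchAll(PLACEHOLDER)) {
    uses.set(match[1], (uses.get(match[1]) || 0) + 1);
  }

  // Tally every character that is not part of a placeholder.
  const freq = new Map();
  for (const ch of code.replace(PLACEHOLDER, '')) {
    freq.set(ch, (freq.get(ch) || 0) + 1);
  }

  // Most frequent candidate characters first; ties keep alphabet order.
  const names = [...CANDIDATES].sort(
    (a, b) =>
      (freq.get(b) || 0) - (freq.get(a) || 0) ||
      CANDIDATES.indexOf(a) - CANDIDATES.indexOf(b)
  );
  // Most used placeholders get the most frequent characters.
  const ids = [...uses.keys()].sort((a, b) => uses.get(b) - uses.get(a));
  if (ids.length > names.length) {
    throw new Error('sketch only handles single-character names');
  }

  const mapping = new Map(ids.map((id, i) => [id, names[i]]));
  return code.replace(PLACEHOLDER, (_, id) => mapping.get(id));
}

console.log(assignByFrequency('function ⊂0⊃(⊂1⊃){return ⊂1⊃+⊂1⊃}'));
// function r(n){return n+n}
```

In the example, n is the most common letter in the surrounding text (it appears in both function and return), so the most-used placeholder becomes n. A real version would also have to respect scoping, since names can be reused across scopes, and would need the check mentioned above that ⊂ and ⊃ don't already occur in the input.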