emscripten-core / emscripten

Emscripten: An LLVM-to-WebAssembly Compiler

Use abbreviations for common JS symbols to save on code size #20701

Open sbc100 opened 11 months ago

sbc100 commented 11 months ago

I noticed that there are cases where we use common JavaScript symbols (e.g. Object.assign) multiple times in a single program, and none of our current minification passes (including closure compiler) are able to remove or reduce the duplication of these strings.

Admittedly, gzip can probably do a good job here, but we can't fully depend on that.

I tried creating a simple change where I add common abbreviations to shell.js, e.g.:

var assign = Object.assign;
..etc

However this has some downsides. Notably:

Proposed solution: Add a new acorn optimizer pass that can find all the usages of these symbols and inject the abbreviations in the right place.
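To make the trade-off concrete, here is a minimal sketch of the size arithmetic such a pass would need to do. It is not the proposed acorn pass: it uses a regex purely to stay dependency-free, so it would also rewrite matches inside strings and comments, which a real AST-based pass would not. The function name `abbreviate` and the alias name are hypothetical.

```javascript
// Toy stand-in for the proposed pass. A real implementation would walk the
// acorn AST; the regex here is only for illustration and is NOT scope-aware.
// `abbreviate` and the alias name are hypothetical, not Emscripten APIs.
function abbreviate(source, symbol, alias) {
  // Escape regex metacharacters in the symbol (e.g. the dot in Object.assign).
  const escaped = symbol.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const pattern = new RegExp('\\b' + escaped + '\\b', 'g');
  const uses = (source.match(pattern) || []).length;

  // Fixed cost: one declaration. Per-use saving: symbol length minus alias length.
  const declaration = 'var ' + alias + '=' + symbol + ';';
  const saved = uses * (symbol.length - alias.length) - declaration.length;

  // Only inject the alias when it is a net win in uncompressed bytes.
  if (saved <= 0) return { code: source, uses: uses, saved: 0 };
  return {
    code: declaration + source.replace(pattern, () => alias),
    uses: uses,
    saved: saved,
  };
}
```

With a two-character alias for Object.assign, the declaration costs 21 bytes and each use saves 11, so under these assumptions the alias only pays for itself from the second use onwards, and only in uncompressed size.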

sbc100 commented 11 months ago

@RReverser @kripken @juj, thoughts on this?

RReverser commented 11 months ago

Admittedly, gzip can probably do a good job here, but we can't fully depend on that.

TBH I don't see why not. Comparing uncompressed sizes of JS/Wasm is not a very useful metric for network cost in the modern world where ~all assets are served compressed.

The only cost I can think of that cares about uncompressed sizes is JS parsing, but I doubt that saving a couple of property accesses would help much there (and only a few APIs can be replaced like this without breaking the this context).
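A small sketch of the this-context point, for readers who have not hit it before. Object.assign is a static function and ignores its receiver, so a bare alias works. A method such as Map.prototype.get does not. The variable names are just for illustration.

```javascript
// Object.assign is a static function that ignores `this`, so a bare alias works.
var assign = Object.assign;
var merged = assign({}, { a: 1 }, { b: 2 });

// Map.prototype.get reads internal state from `this`, so a bare alias loses
// its receiver and throws a TypeError when called.
var cache = new Map([['key', 'value']]);
var get = cache.get;
var bareAliasFailed = false;
try {
  get('key');
} catch (e) {
  bareAliasFailed = e instanceof TypeError;
}

// Binding restores the receiver, at the cost of extra code for each alias.
var boundGet = cache.get.bind(cache);
var looked = boundGet('key');
```

Any abbreviation pass would therefore need to limit itself to receiver-independent functions, or emit a bound alias and accept the extra bytes.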

RReverser commented 11 months ago

It also looks like quite a few of these Object.assign can be removed soon in favour of ES6 classes, now that ES6 usage is being unblocked.

sbc100 commented 11 months ago

It also looks like quite a few of these Object.assign can be removed soon in favour of ES6 classes, now that ES6 usage is being unblocked.

I was just using Object.assign as an example... I'm sure there are others.

sbc100 commented 11 months ago

Admittedly, gzip can probably do a good job here, but we can't fully depend on that.

TBH I don't see why not. Comparing uncompressed sizes of JS/Wasm is not a very useful metric for network cost in the modern world where ~all assets are served compressed.

Fair enough, would you be more supportive of this feature if it were shown to save on compressed as well as uncompressed size?
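One way to check this would be to compare raw, gzip and Brotli sizes for the same code with and without the alias. The sketch below assumes Node.js and uses its built-in zlib module. The input is a synthetic, highly repetitive snippet chosen only to show the method, so its numbers say nothing about real Emscripten output. The helper name `sizes` is hypothetical.

```javascript
const zlib = require('zlib');

// Report raw, gzip and Brotli byte counts for a piece of JS source.
// `sizes` is a hypothetical helper, not part of Emscripten's test suite.
function sizes(source) {
  const raw = Buffer.from(source, 'utf8');
  return {
    raw: raw.length,
    gzip: zlib.gzipSync(raw).length,
    brotli: zlib.brotliCompressSync(raw).length,
  };
}

// Synthetic inputs: the same repeated call, with and without an alias.
const body = 'Object.assign(target, source);\n'.repeat(50);
const plain = body;
const aliased =
  'var assign = Object.assign;\n' + body.replace(/Object\.assign/g, 'assign');

console.log('plain  ', sizes(plain));
console.log('aliased', sizes(aliased));
```

A meaningful comparison would run the same measurement over actual build output before and after the change.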

RReverser commented 11 months ago

Fair enough, would you be more supportive of this feature if it were shown to save on compressed as well as uncompressed size?

Yeah, if there's a substantial compressed saving then it would make sense. I'm just a bit sceptical of the cost:benefit ratio, particularly because most built-in methods tend to care about this context and because, as you said, gzip/brotli tend to catch these repetitive cases pretty well. Especially Brotli, thanks to its larger window size.

That said, I remember coming across https://github.com/emscripten-core/emscripten/issues/20022. If that's what you had in mind - somehow optimising even access to our internal namespaces - then it's more likely to be beneficial.

kripken commented 11 months ago

In general I think closure does this for user stuff, but maybe not for Object.assign since it can't tell if Object might be modified. Maybe closure has an option to assume no monkeypatching? Or maybe other minifiers in the ecosystem do?
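To illustrate the monkeypatching concern (this is a sketch of the general hazard, not a statement about how closure actually reasons): once an alias is captured, later replacements of the builtin are no longer observed through it, so aliasing is only behaviour-preserving if nothing patches Object.assign afterwards. The instrumentation below is hypothetical.

```javascript
// An alias captured early keeps pointing at the original function.
var assign = Object.assign;
var originalAssign = Object.assign;

// Later code replaces the builtin (hypothetical instrumentation).
var patchedCalls = 0;
Object.assign = function () {
  patchedCalls++;
  return originalAssign.apply(Object, arguments);
};

Object.assign({}, { a: 1 }); // goes through the patch
assign({}, { b: 2 });        // bypasses the patch

// Restore the builtin so the sketch has no lasting side effects.
Object.assign = originalAssign;
```

Here only the direct call is counted by the patch, which is the kind of observable difference a minifier would have to rule out before aliasing on its own.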

sbc100 commented 11 months ago

I think closure does nothing for builtin APIs like this.

Regarding other minifiers, I guess that is something we should look into. Perhaps run all our code size tests with several different minifiers.