mrdoob / three.js

JavaScript 3D Library.
https://threejs.org/
MIT License
102.51k stars 35.37k forks source link

Support advanced (property mangling) minify tools #29210

Open AshleyScirra opened 2 months ago

AshleyScirra commented 2 months ago

Description

Some JavaScript minification tools support reducing the size of code even further by renaming properties. This is called property mangling in UglifyJS, or ADVANCED optimization mode in Closure Compiler. In summary it works like this: suppose you have the following code.

const obj = {
    apples: 1,
    oranges: 2
};
console.log(obj.apples, obj.oranges);

Property mangling will rename the object properties consistently, so apples is renamed to a, oranges is renamed to b, and so on, producing output that looks like this:

const obj = {
    a: 1,
    b: 2
};
console.log(obj.a, obj.b);

This can however break code that is not written to expect this. The main problem is mixing string and dot property syntax. For example if you access a property in obj using a runtime string like obj[someStr], then the names are different and the code is broken. To better support this case, both Closure Compiler and UglifyJS support preserving the names of properties that use string syntax. For example code written this way would not have its properties renamed:

const obj = {
    "apples": 1,
    "oranges": 2
};
console.log(obj["apples"], obj["oranges"]);

That allows for using property mangling as well as string lookups, providing property accesses consistently use either string syntax or dot syntax, but not a mix, as that means only dot syntax is renamed and then no longer matches the string syntax.

Using data from Construct - a large JavaScript-based game engine - property mangling can further reduce the code size over "simple" (non-property-mangling) minification by ~35-45% uncompressed and ~15-25% compressed. (The range depends on the tool used - Closure Compiler tends to do better than UglifyJS.)

Solution

Currently three.js mixes dot and string property syntax and so is broken after property mangling minification. After debugging this, one area I identified was the use of ShaderLib: in ShaderLib.js the ShaderLib defines properties with unquoted syntax (e.g. basic); however in WebGLPrograms.js it does a runtime string lookup with ShaderLib[ shaderID ]. It looks like this pattern is used in a few places.

The fix is to ensure all uses of property accesses are consistently string syntax or dot (unquoted) syntax. For example in this case, the fix would be to ensure ShaderLib properties use string syntax, e.g. "basic". Other such cases will need to be reviewed across the codebase. I would also suggest to use Map for dictionary-style storage like this, as it's a natural fit, avoids some pitfalls with object properties (like prototype), and you can only use string syntax, so it actually prevents using dot syntax and so avoids any syntax mix-ups.

Both Closure Compiler and UglifyJS provide a "debug" mode for property mangling to help identify problems. This renames properties but keeps the original name - for example UglifyJS renames obj.apples to obj._$apples$_, so the property is renamed, but you can still see what it was meant to be when debugging problems.

Once set up it's pretty straightforward to keep things working - it just needs consistency across the codebase with property syntax. It may be a relatively easy change to make, but it should probably be done by someone familiar with the codebase. I don't think it's an onerous requirement to maintain in future, and it helps further reduce the download size and parse/load time as three.js is a pretty large library.

Alternatives

I think the only alternative is to keep using "simple" (non-property-mangling) minification. This doesn't require any code changes and tends to just work with whatever code you write, so is a bit easier to maintain. However it may turn out to be straightforward to support and maintain support for property mangling, which would make it a relatively easy win.

Additional context

With Construct we've supported property mangling in our engine for years, and I wouldn't say it's a significant burden.

mrdoob commented 2 months ago

Wait. Does Construct use Three.js?

mrdoob commented 2 months ago

As per the suggestion... Better tree-shaking is something we're trying to address in the new renderer (WebGPURenderer).

donmccurdy commented 2 months ago

@AshleyScirra would it be possible to enforce the requirements you're describing using ESLint rules, or something similar?

AshleyScirra commented 2 months ago

Wait. Does Construct use Three.js?

At the moment it's not used in the built-in engine, but we're experimenting with integrating it for 3D purposes. Construct supports advanced minification, so it would be good if three.js did too, otherwise the use of three.js blocks using advanced minification.

Better tree-shaking is something we're trying to address in the new renderer

Tree shaking helps reduce the library size, but I'd point out property mangling will help reduce it even further, no matter what the result of tree shaking is.

would it be possible to enforce the requirements you're describing using ESLint rules, or something similar?

I'm not aware of any tools that can enforce this during development - I would guess the dynamic nature of JavaScript makes it difficult for static analysis to reliably identify objects and determine whether property accesses are done consistently. Both dot and string property syntax are allowed - such a tool should only flag inconsistent use of them, which might not be feasible. Perhaps it would be possible to run tests against a property-mangled version of the library (although the test code would also need to be property mangled). Some extra codebase rules like "use Map instead of object properties for dictionary-style storage" might help too since as I mentioned you can't mix up syntax with Map.

mrdoob commented 2 months ago

I see....

We do not plan on working on this. But if someone in the community wants to give it a go we're definitely open to PRs.

AshleyScirra commented 2 months ago

PR #29260 is an initial attempt to support this. It should be a no-op change for all existing uses.

I think that brings property mangling support to the core library, but the next big hurdle looks like other modules like GLTFLoader that read JSON. All the JSON properties read externally must also use string syntax, and that looks like a bigger change to make. I might leave things as-is for the time being but the PR is a significant step towards supporting this, if you decide it can be merged.