babel / minify

:scissors: An ES6+ aware minifier based on the Babel toolchain (beta)
https://babeljs.io/repl
MIT License
4.39k stars 225 forks source link

Lexical variables in the outermost lexical scope don’t get mangled #903

Open mathiasbynens opened 6 years ago

mathiasbynens commented 6 years ago

Lexical variables in the outermost lexical scope don’t get mangled, i.e. they appear to be incorrectly treated as global variables.

To Reproduce

Minimal code to reproduce the bug:

// foo.mjs
const foo = 42;
const bar = 64;
$ minify foo.mjs

Actual Output

const foo=42,bar=64;

Expected Output

const a=42,b=64;

I’m using the babel-minify CLI, v0.4.3, with no additional configuration.

As a workaround, I have to wrap my source code in a block: { … }. That way, babel-plugin-minify-mangle-names seems to kick in correctly.

boopathi commented 6 years ago
minify --mangle.topLevel foo.mjs

should mangle the top-level vars. But we can have this enabled by default for mjs files.

mathiasbynens commented 6 years ago

I think that solves a different problem. There are two kinds of “top-level vars”:

  1. global variables, such as var foo = 42; in the global scope
  2. top-level lexical bindings, such as const bar = 42; in the outermost lexical scope

It’s safer to mangle the latter than it is to mangle the former.

--mangle.topLevel mangles both.

I think the underlying problem here is that there’s no way to distinguish between the two. Maybe we should add something like --mangle.topLevelLexical to denote that you only want to mangle those bindings?

vigneshshanmugam commented 6 years ago

@mathiasbynens In terms of module context, Can't we assume both are same?

mathiasbynens commented 6 years ago

@vigneshshanmugam For modules specifically, yes (since var foo = 42; does not create a global foo within a module). But for non-modules, the difference matters: topLevelLexical mangling would be acceptable whereas topLevel is potentially dangerous.

mathiasbynens commented 6 years ago

To clarify what I mean:

  1. Enabling --mangle.topLevel for .mjs files (as @boopathi suggested) solves the problem for modules. It would be great to do this!
  2. Adding something like --mangle.topLevelLexical would enable this functionality for non-modules too.
kzc commented 6 years ago

for non-modules, the difference matters: topLevelLexical mangling would be acceptable whereas topLevel is potentially dangerous

@mathiasbynens I must confess that this is the first I've heard of top-level lexical bindings, so apologies in advance for my lack of understanding...

Could you please demonstrate a non-module scenario where a user would want to treat top level lexicals differently than globals? Since topLevel is an opt-in used only when the user deems it to be safe for both types of "top-level vars", why is it a problem?

mathiasbynens commented 6 years ago

@kzc

<script>
  // `foo` is not a global, but rather a top-level lexical binding.
  const foo = 42;
  // The same would happen if it was a `let` or a `class`.
</script>
<script>
  console.log(foo);
  // → 42
  console.log(window.foo);
  // → undefined
</script>

Attaching a binding to the global scope is an explicit signal that other scripts might want to depend on it, and that you don’t want to mangle it. Attaching a binding to the top-level lexical scope does not have that same signal. Therefore, it is safer to mangle such bindings in a lot more cases than it is to mangle globals.

kzc commented 6 years ago

console.log(window.foo);

@mathiasbynens Thanks for the illustrative example.

I don't think this is an issue in practice because most folks explicitly anchor their globals via window.foo = foo; (or via global in node). The property in window.foo is not mangled by default - even with topLevel (or toplevel with uglify/terser). People using these minifiers have been conditioned to not mix the global/top-level styles window.foo and foo within their code. Also, babel-minify, terser and uglify have ways to exclude specific top level symbols from being mangled if need be.

bakkot commented 6 years ago

@mathiasbynens, for what it's worth, I would be very surprised if those names were changed without --mangle.topLevel. I expect the default behavior to be that compiled programs have the same observable semantics as the input, modulo Function.prototype.toString and function names (and also stack traces and other non-spec things). And the names in the top-level lexical scope are very much part of the observable semantics of a program, just like names in the global scope.