evanw / esbuild

An extremely fast bundler for the web
https://esbuild.github.io/
MIT License
38.19k stars 1.15k forks source link

Additional top-level minification #3570

Open benmccann opened 10 months ago

benmccann commented 10 months ago

Here boolean is inlined, but o and d are not. I believe all three variables should be able to be inlined

https://esbuild.github.io/try/#dAAwLjE5LjExAC0tbWluaWZ5AGNvbnN0IG8gPSAnbyc7CmNvbnN0IGQgPSAnZCc7CmNvbnN0IGJvb2xlYW4gPSBmYWxzZTsKdmFyIGZyYWcgPSBgPHAgYXV0b2NhcGl0YWxpemU9IiR7YHcke299ciR7ZH1zYH0iIGNvbnRlbnRlZGl0YWJsZT0iJHtib29sZWFufSI+IDwvcD5gOw

benmccann commented 10 months ago

Ah, although I just realized it does work within a function: https://esbuild.github.io/try/#dAAwLjE5LjExAC0tbWluaWZ5AChmdW5jdGlvbigpIHsKY29uc3QgbyA9ICdvJzsKY29uc3QgZCA9ICdkJzsKY29uc3QgYm9vbGVhbiA9IGZhbHNlOwp2YXIgZnJhZyA9IGA8cCBhdXRvY2FwaXRhbGl6ZT0iJHtgdyR7b31yJHtkfXNgfSIgY29udGVudGVkaXRhYmxlPSIke2Jvb2xlYW59Ij4gPC9wPmA7CmNvbnNvbGUubG9nKGZyYWcpOwp9KSgp

So it seems like what I really want is something like terser's toplevel flag that could drop unused variables at the top level and mangle top-level names

I was a bit confused by the reference to Svelte in https://github.com/evanw/esbuild/issues/3568#issuecomment-1872258022 as a reason for not doing it by default. I'm a Svelte maintainer, and as far as I'm aware we don't do this and the other Svelte maintainers I've asked also aren't quite sure what it might be referring to. So if you've been avoiding it on our behalf, I'm not sure you need to. In fact I'm requesting this to better minify the output of the Svelte compiler where top-level nested template strings being spit out like like this are not uncommon

evanw commented 10 months ago

My Svelte comment in https://github.com/evanw/esbuild/issues/3568#issuecomment-1872258022 was talking about this behavior: https://github.com/evanw/esbuild/issues/604. The code esbuild sees is not the entire module. In particular, the surrounding HTML expects to be able to reference otherwise unused variables in the script tag. From what I understand, this is because Svelte uses esbuild to process the code inside the script tag, which is a module fragment, and then concatenates it with additional generated code to create the full module. Please correct me if I’m misunderstanding something.

benmccann commented 10 months ago

Ah, thank you for the pointer! It appears that issue isn't related to Svelte itself, but a particular plugin svelte-preprocess-esbuild. The Svelte compiler itself only knows how to compile JS/CSS code and so we have "preprocessors", which convert things like PostCSS and TypeScript to JS and CSS. The most used officially supported solution (known as vitePreprocess) does use esbuild to convert TypeScript to JS, but does it without minification. And it looks like svelte-preprocess-esbuild, which is not an official solution and has far less usage, also sets minify: false. Minification would be handled later in a second compile step - typically using esbuild again - which operates on the full source code where top level tree shaking, unused code elimination, mangling, etc. would be completely safe.

So tl;dr, please feel free to be more aggressive if minify is set. If an optimization would occur without minify being set, I'd be happy to investigate further whether it would be safe for Svelte users - it looks like preserveValueImports is now used to solve the issue referred to in https://github.com/evanw/esbuild/issues/604.

privatenumber commented 4 months ago

I think format=esm should strictly mean ESM and require the full module scope to be provided. It's unfortunate that accommodating preprocessed module types (Svelte/Vue.js) comes at the cost of removing the minification for vanilla ESM usages.

For Svetle, maybe a different format can be created (e.g. format=esm-mixed). Or they can pass in the compiled code which has the JS version of the template referencing variables.

benmccann commented 4 months ago

As a Svelte maintainer, I would say there is zero need to do anything special for Svelte here

benmccann commented 2 months ago

I was going to send a PR to revert the special behavior for Svelte, but I don't think it would actually help in this case.

The special Svelte behavior is supposed to only take case when !p.options.minifyIdentifiers:

https://github.com/evanw/esbuild/blob/332727499e62315cff4ecaff9fa8b86336555e46/internal/js_parser/js_parser.go#L16552

But if I pass that flag, things still aren't being minimized when at the top level:

https://esbuild.github.io/try/#dAAwLjE5LjExAC0tbWluaWZ5LXdoaXRlc3BhY2UgLS1taW5pZnktaWRlbnRpZmllcnMgLS1taW5pZnktc3ludGF4IC0tZm9ybWF0PWVzbQBjb25zdCB4ID0gImZvbyI7CmNvbnN0IHkgPSBgaGVsbG8gJHt4fWA7CgpmdW5jdGlvbiB6KCkgewogIGNvbnN0IGEgPSAiZm9vIjsKICBjb25zdCBiID0gYGhlbGxvICR7YX1gCn0

hyrious commented 2 months ago

@benmccann The code you referenced is used to "scan imports and exports", which doesn't do anything about drop plain variables. There're 2 different variables need to be preserved inside svelte code fragments (import names and unused variables) and in esbuild it is controlled by different aspects (both in transform mode):

  1. Unused imports, controlled by TypeScript "preserveValueImports" or the more modern setting "verbatimModuleSyntax".

    import { foo } from "foo"
    // ^ Preserve this line.
  2. Unused variables that can be inlined and renamed.

    The inlining happens with "--minify-syntax" which replaces top-level "const" expressions with the literal value. After the replacement, some constants are never used and can be dropped if tree-shaking is enabled. In transform mode tree-shaking is disabled by default.

    The renaming happens when esbuild knows it is safe to do so -- code is the entire graph. When you pass "--format=..." it is considered as complete code. So svelte preprocessors just don't pass it.

So back to your initial question. If you are processing the entire code (not a piece of code fragment like in svelte), and want to minify and drop some unused variables' names in transform mode, just enable "--format={your code format} --minify --tree-shaking".

However, string literals aren't inlined in string templates for now (as #3568 doesn't do this).

benmccann commented 2 months ago

However, string literals aren't inlined in string templates

The really confusing thing is that sometimes they are! If you look at this example, it will inline the variable if the template is in a function, but not if the template is at the top-level. I don't really have an explanation for this. It seemed to be suggested that opting out was for the benefit of Svelte, but - as you've pointed out - the Svelte-specific code is totally unrelated to such transformations. So I'm not quite sure why this behavior exists, but would love if it's possible to align the top-level behavior with the behavior inside a function

hyrious commented 2 months ago

@benmccann Yes you're right. I guess I know what happened here: https://github.com/evanw/esbuild/blob/main/internal/js_parser/js_parser.go#L8831

As pointed above, one of the constant inlining strategy is to inline top const numbers (null, boolean, etc. but no string), example.

So why do you see const strings inlined inside functions? This is because it invokes another process: Inline single-use variable declarations, example. It doesn't apply to top-level constants because the parser at this time don't know if there be further export { the_const } which preserves the name.

benmccann commented 2 months ago

Ah, I see! How feasible does this seem to implement? Would a second pass need to be made and is that something that would be considered?

hyrious commented 2 months ago

Although the comments say

We can't just check for the "export" keyword because something might do "export {id};" later on.

I guess this still can be checked since you have all statements in some scope. So for every p.currentScope == p.moduleScope we can search if some variable is exported.