babel / minify

:scissors: An ES6+ aware minifier based on the Babel toolchain (beta)
https://babeljs.io/repl
MIT License

Allow for exceptions in string concat inlining #382

Open kangax opened 7 years ago

kangax commented 7 years ago

Originally brought up by @spicyj.

Since '</' + 'script' is a pretty common way of avoiding "</script" strings inside script tags, we need to figure out how to allow for cases like this.
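To make the problem concrete, here is a rough sketch of what goes wrong once the concatenation is folded. `closesInlineScript` is a hypothetical helper written for this comment, not part of babili, and the regex is a simplification of what the HTML parser actually looks for.

```javascript
// Hypothetical helper (illustration only): reports whether source text
// contains "</script", which would end an enclosing inline <script>
// element early. Simplified: the real HTML parser also looks at the
// character that follows the tag name.
function closesInlineScript(source) {
  return /<\/script/i.test(source);
}

// Hand-written input and hand-folded output, standing in for a minifier run.
const beforeMinify = "var tag = '</' + 'script>';";
const afterMinify = 'var tag="</script>";';

console.log(closesInlineScript(beforeMinify)); // false
console.log(closesInlineScript(afterMinify));  // true
```

The split form is safe to embed in a page; the folded form is not.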

Alternatively, we can replace the "</" sequence with its Unicode escape representation (\u003c\u002f) but I'm not sure if it's a minifier's job to do this.

One immediate solution I see is to add an option of blacklisting concatenable strings:

babel.transform(code, { presets: [['babili', { concatStringsBlacklist: ['</', 'script'] }]] });

It's unclear which logic to use for a list of blacklisted strings — when ALL match or when ANY match? If we use ANY then we could be missing out on concatenating "</div", "</some:component", etc.
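A minimal sketch of the two interpretations, to show where they differ. Everything here is hypothetical: `shouldSkipConcat`, its `mode` parameter, and the assumption that an entry "matches" when some operand contains it are made up for illustration and are not existing babili behavior.

```javascript
// Hypothetical: decide whether a string concatenation should be left alone.
// operands:  the string literals being concatenated, e.g. ['</', 'script']
// blacklist: the proposed list of protected strings
// mode:      'ANY' skips when at least one entry matches,
//            'ALL' skips only when every entry matches
// Assumption: an entry matches if some operand contains it.
function shouldSkipConcat(operands, blacklist, mode) {
  if (blacklist.length === 0) return false;
  const matches = (entry) => operands.some((s) => s.includes(entry));
  return mode === 'ALL' ? blacklist.every(matches) : blacklist.some(matches);
}

const blacklist = ['</', 'script'];

console.log(shouldSkipConcat(['</', 'script'], blacklist, 'ANY')); // true
console.log(shouldSkipConcat(['</', 'script'], blacklist, 'ALL')); // true
console.log(shouldSkipConcat(['</', 'div'], blacklist, 'ANY'));    // true
console.log(shouldSkipConcat(['</', 'div'], blacklist, 'ALL'));    // false
```

With ANY, the harmless '</' + 'div' case is also left unconcatenated, which is the missed optimization mentioned above.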

If we were to go with escaping:

babel.transform(code, { presets: [['babili', { unicodeStrings: ['</'] }]] });

But then it feels like something Babel (Babel's printer, actually) should be taking care of.

I'm curious to hear what @mathiasbynens thinks.

/cc @hzoo

kangax commented 7 years ago

Couple more thoughts:

mathiasbynens commented 7 years ago

Alternatively, we can replace the "</" sequence with its Unicode escape representation (\u003c\u002f) but I'm not sure if it's a minifier's job to do this.

If you decide to go that route, note that \u003c\u002fscript is probably overkill; <\/script will do. Also note that this is only necessary for </script and </style, so e.g. </foo can be left as-is.

On the other hand, <!-- must also be escaped (e.g. \x3C!--).

(This is what jsesc’s isScriptContext option does.)

See https://mathiasbynens.be/notes/etago & https://html.spec.whatwg.org/multipage/scripting.html#restrictions-for-contents-of-script-elements for more background.
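For illustration, the escaping rules above can be sketched as a small standalone function. This is a hypothetical helper, not jsesc's implementation, and it assumes its input is the printed contents of a string literal, where `\/` and `\x3C` are read back as `/` and `<`.

```javascript
// Hypothetical helper (not jsesc, not babel-generator): escapes the
// sequences that are unsafe inside an inline <script> or <style> element.
// Assumption: `literalBody` is the printed contents of a JS string literal,
// so the added backslash escapes do not change the string's value.
function escapeForScriptContext(literalBody) {
  return literalBody
    // "</script" and "</style" (any letter case) get an escaped slash.
    .replace(/<\/(script|style)/gi, '<\\/$1')
    // "<!--" gets a hex escape for the "<".
    .replace(/<!--/g, '\\x3C!--');
}

console.log(escapeForScriptContext('</script>')); // <\/script>
console.log(escapeForScriptContext('</foo>'));    // </foo>
console.log(escapeForScriptContext('<!-- x -->')); // \x3C!-- x -->
```

Other closing tags such as </foo pass through unchanged, matching the note above.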


I see your point re: “not sure if it’s a minifier’s job to do this”. Pragmatically speaking though, it makes sense to (at least) provide an option to enable such escaping (or to avoid concatenation, but that seems harder and less ideal). I remember having the same discussion on the UglifyJS repository before their inline_script option was added. (Unfortunately, they deleted GitHub Issues from the repo so I can’t link to it.)

mathiasbynens commented 7 years ago

babel-generator already uses jsesc, so it could easily hook into jsesc’s isScriptContext option.

kzc commented 7 years ago

Google Closure escapes script tags and HTML comments by default.

I thought Uglify also escapes such things by default, but apparently it needs a flag:

$ echo 'console.log("</" + "script>");' | uglifyjs -b inline_script=true,beautify=false -cm
console.log("<\/script>");

boopathi commented 7 years ago

This will be addressed by https://github.com/babel/babel/pull/5581