Open kangax opened 7 years ago
Couple more thoughts:
Blacklisting strings could be brittle in cases of large codebases with lots of developers potentially using various workarounds. E.g., it would fail with: '<scri' + 'pt></scri' + 'pt>'
We could also entertain a directive:
//= babili-no-concat-start
div.innerHTML = '<scri' + 'pt></scri' + 'pt>';
//= babili-no-concat-end
Alternatively, we can replace "</" sequence with unicode representation (\u003c\u002f) but I'm not sure if it's minifier's job to do this.
If you decide to go that route, note that just \u003c\u002fscript
is probably overkill — just <\/script
will do. Also note that this is only necessary for </script
and </style
, so e.g. </foo
can be left as-is.
On the other hand, <!--
must also be escaped (e.g. \x3C!--
).
(This is what jsesc’s isScriptContext
option does.)
See https://mathiasbynens.be/notes/etago & https://html.spec.whatwg.org/multipage/scripting.html#restrictions-for-contents-of-script-elements for more background.
I see your point re: “not sure if it’s a minifier’s job to do this”. Pragmatically speaking though, it makes sense to (at least) provide an option to enable such escaping (or to avoid concatenation, but that seems harder and less ideal). I remember having the same discussion on the UglifyJS repository before their inline_script
option was added. (Unfortunately, they deleted GitHub Issues from the repo so I can’t link to it.)
babel-generator already uses jsesc, so it could easily hook in to jsesc’s isScriptContext
option.
Google Closure escapes script tags and HTML comments by default.
I thought Uglify also escapes such things by default, but apparently it needs a flag:
$ echo 'console.log("</" + "script>");' | uglifyjs -b inline_script=true,beautify=false -cm
console.log("<\/script>");
This will be addressed by https://github.com/babel/babel/pull/5581
Originally brought up by @spicyj.
Since
'</' + 'script'
is a pretty common way of avoiding "</script" strings inside script tags, we need to figure out how to allow for cases like this.Alternatively, we can replace "</" sequence with unicode representation (
\u003c\u002f
) but I'm not sure if it's minifier's job to do this.One immediate solution I see is to add an option of blacklisting concatenable strings:
It's unclear which logic to use for a list of blacklisted strings — when ALL match or when ANY match? If we use ANY then we could be missing out on concatenating "</div", "</some:component", etc.
If we were to go with escaping:
But then it feels like something Babel (Babel's printer, actually) should be taking care of.
I'm curious to hear what @mathiasbynens thinks.
/cc @hzoo