observablehq / htl

A tagged template literal that allows safe interpolation of values into HTML, following the HTML5 spec
https://observablehq.com/@observablehq/htl
ISC License

Incorrect escaping inside STYLE and SCRIPT elements. #6

Closed by mbostock 3 years ago

mbostock commented 4 years ago

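STYLE and SCRIPT elements have special behavior regarding escaping: their contents are tokenized in “RAWTEXT” mode (strictly, “script data” for SCRIPT) rather than “DATA” mode, so character references are not decoded. For example, this is not an ampersand character reference: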

html`<script>hello&amp;</script>`

I think this means we’ll need to track the element name when we enter the DATA state, and if it’s STYLE or SCRIPT, enter the RAWTEXT state instead. And then likewise we’ll have to handle the “appropriate end tag” to determine when we exit the RAWTEXT state.
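A minimal sketch of that idea, not htl's actual code: the names `RAW_TEXT_ELEMENTS`, `contentModeFor`, and `findAppropriateEndTag` are hypothetical, and the tag name is assumed to consist of ASCII letters only. It also ignores the extra escaped states that the spec defines for SCRIPT.

```javascript
// Hypothetical helpers for illustration; htl's real state machine differs.
const RAW_TEXT_ELEMENTS = new Set(["style", "script"]);

// Choose the content mode to enter once an open tag has been parsed.
function contentModeFor(tagName) {
  return RAW_TEXT_ELEMENTS.has(tagName.toLowerCase()) ? "rawtext" : "data";
}

// Return the index of the "appropriate end tag" for tagName in source,
// searching from `from`, or -1 if there is none. The name is matched
// case-insensitively and must be followed by whitespace, "/" or ">".
// Assumption: tagName contains only ASCII letters (e.g. "style", "script").
function findAppropriateEndTag(source, tagName, from = 0) {
  const pattern = new RegExp("</" + tagName + "[\\t\\n\\f\\r />]", "gi");
  pattern.lastIndex = from;
  const match = pattern.exec(source);
  return match === null ? -1 : match.index;
}

// Usage: the body of the SCRIPT element is taken verbatim.
const source = "<script>hello&amp;</script>";
const bodyStart = "<script>".length;
const bodyEnd = findAppropriateEndTag(source, "script", bodyStart);
const rawBody = source.slice(bodyStart, bodyEnd); // "hello&amp;"
```

Requiring a delimiter after the name keeps text such as `</scripts>` from being mistaken for the end tag.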

mbostock commented 4 years ago

As another example, the ampersand should not be encoded as &#38; here:

html`<style>

p {
  background-image: url(${"foo.png?bar=1&baz=2"});
}

</style>`

clarkevans commented 3 years ago

We implemented this in a Julia fork of this project, and it turned out to be relatively easy. The state machine already tracks most of what you need to keep the current element name, and you only have to match on style/script tags (case-insensitively). Note that besides not permitting <style> or <script> literal values, <script> also doesn't like comment openers (<!--). There's a design choice between trying to save the user from injecting these values and letting them do it (including <script> within a JavaScript string is a gotcha that you can find on Stack Overflow). At this time, our fork tries to detect this error, but there's a speed cost, so we may rip it out.
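The detection described above might look roughly like the following. This is a sketch under assumptions, not the fork's or htl's code: `FORBIDDEN_IN_RAW_TEXT` and `checkRawTextValue` are hypothetical names. The rejected sequences are also an assumption: the closing tag for STYLE, and `<!--`, `<script`, and `</script` for SCRIPT.

```javascript
// Hypothetical check for values interpolated into raw text elements.
// Assumption: these are the sequences worth rejecting; matching is
// case-insensitive because tag names are.
const FORBIDDEN_IN_RAW_TEXT = new Map([
  ["style", /<\/style/i],
  ["script", /<!--|<\/?script/i],
]);

// Return the value as a string, or throw if it contains a sequence that
// could end or confuse the surrounding element. No escaping is applied,
// so an ampersand in the value is left untouched.
function checkRawTextValue(tagName, value) {
  const text = String(value);
  const forbidden = FORBIDDEN_IN_RAW_TEXT.get(tagName.toLowerCase());
  if (forbidden !== undefined && forbidden.test(text)) {
    throw new Error(`unsafe value interpolated into <${tagName}>`);
  }
  return text;
}

// Usage: the URL from the STYLE example passes through unchanged.
const url = checkRawTextValue("style", "foo.png?bar=1&baz=2");
```

The scan runs over every interpolated value, which is where the speed cost comes from. Skipping it leaves the responsibility with the caller.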

mbostock commented 3 years ago

@clarkevans Thanks for the pointer! I’d like to fix this together with #18; see #21 (comment).