cure53 / DOMPurify

DOMPurify - a DOM-only, super-fast, uber-tolerant XSS sanitizer for HTML, MathML and SVG. DOMPurify works with a secure default, but offers a lot of configurability and hooks. Demo:
https://cure53.de/purify

Does DOMPurify handle escaping the closing `</script>` tag problem? #840

Closed rafaeleyng closed 1 year ago

rafaeleyng commented 1 year ago

Does DOMPurify handle escaping the closing </script> tag problem?

Background & Context

If an embedded script contains any occurrence of </script> (even within a JavaScript string, for instance), the HTML parser treats it as the end of the script element. To avoid this, </script> needs to be escaped, for example as <\/script>.
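
A minimal sketch of why the escaped form is safe to use: in both a JavaScript string literal and in JSON, `\/` is just another way of writing `/`, so the escaped text decodes to the same value while no longer containing the literal `</script` sequence. The variable names below are illustrative only.

```javascript
// In a JavaScript string literal, "\/" is simply an escaped "/".
const fromJsLiteral = "<\/script>";

// JSON also accepts "\/" as an escape for "/".
const fromJson = JSON.parse('"<\\/script>"');

// Both decode to the plain closing-tag text.
const sameValue = fromJsLiteral === "</script>" && fromJson === "</script>";

// The escaped source text itself never contains "</script",
// so it cannot terminate the surrounding script element.
const escapedSource = '"<\\/script>"';
const containsTerminator = escapedSource.toLowerCase().includes("</script");
```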

A few references:

  1. https://uploadcare.com/blog/vulnerability-in-html-design/
  2. https://mathiasbynens.be/notes/etago
  3. https://www.herongyang.com/JavaScript/Browser-Escape-Script-Tag-in-String-Literal.html
  4. https://html.spec.whatwg.org/multipage/scripting.html#restrictions-for-contents-of-script-elements
  5. https://stackoverflow.com/questions/28643272/how-to-include-an-escapedscript-script-tag-in-a-javascript-variable

In my specific use case, I'm using dynamic data to build the content of a script tag (the content is JSON-LD rather than JavaScript).

It is not clear to me if and how I can use DOMPurify to achieve this result without escaping or removing anything else from the string.

So I would like a way to provide the string:

hello <script>there</script>

and instead of getting (which is what I currently get, with the default configuration):

hello 

I would like to get:

hello <script>there<\/script>

There are issues like casing, the possibility of adding spaces after </script, etc., that discourage me from trying to solve this with a custom regex, but I haven't yet found a library that clearly supports this specific use case.

In my specific case, this is closer to:

{
  "@type": "Article",
  "name": "Some dynamic data that could contain: </script><script>alert('xss')</script>"
}

that I would like to escape to:

{
  "@type": "Article",
  "name": "Some dynamic data that could contain: <\/script><script>alert('xss')<\/script>"
}

All of that would live inside a surrounding tag like the following (the tag itself is not dynamic, only the content within it):

<script type="application/ld+json">
</script>

Is this possible with DOMPurify, and if so, what configuration would be required? If not, do you know of a library that solves this?

Thanks!

cure53 commented 1 year ago

What you are looking for is escaping; what we offer is sanitization. You can likely fix this with a hook, but please handle with care :)
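
To illustrate the escaping side of that distinction for the JSON-LD case, here is a minimal sketch that does not use DOMPurify at all. It is not the hook approach mentioned above and not a DOMPurify API. It assumes the payload is produced with `JSON.stringify`, and it swaps the `<\/script>` form from the question for the `\u003c` escape, which decodes to the same character. The helper name `serializeForScriptTag` is hypothetical.

```javascript
// Hypothetical helper: serialize data so it can be embedded inside
// <script type="application/ld+json"> ... </script>.
function serializeForScriptTag(data) {
  // In JSON.stringify output, "<" can only appear inside string values or keys,
  // so replacing it with the JSON escape \u003c keeps the JSON valid.
  // This removes "</script" in any casing, as well as "<script" and "<!--".
  return JSON.stringify(data).replace(/</g, "\\u003c");
}

const jsonLd = {
  "@type": "Article",
  name: "Some dynamic data that could contain: </script><script>alert('xss')</script>",
};

const payload = serializeForScriptTag(jsonLd);

// Only the static wrapper contributes a real closing tag.
const html = `<script type="application/ld+json">${payload}</script>`;
```

Because every `<` is escaped rather than only the `</script>` sequence, the casing and whitespace variations that make a targeted regex fragile do not need to be matched. A JSON parser reading `payload` recovers the original strings unchanged.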