leizongmin / js-xss

Sanitize untrusted HTML (to prevent XSS) with a configuration specified by a Whitelist
http://jsxss.com

Fix slashes as separators. #269

Open hensleysecurity opened 1 year ago

hensleysecurity commented 1 year ago

This should resolve #268.

leizongmin commented 1 year ago

Mainstream HTML parsers will parse it into the following result:

[
  {
    "tag": "img",
    "attrs": {
      "width": "100/height=200/src=\"#\"/"
    }
  }
]


I think treating the slash as a delimiter might present a potential problem.

hensleysecurity commented 1 year ago

@leizongmin I am not able to reproduce your result. It looks like you may have wrapped the slashes inside quotes in your source HTML. Please try the following example instead:

<html>
  <body>
    <img/width="100"/height="200"/src="#"/>
  </body>
</html>

I see the behavior I would expect when I try this in Chrome: (screenshot: console)

And Firefox as well: (screenshot: console2)

Also please see the HTML spec:

When the prescan a byte stream to determine its encoding algorithm says to get an attribute, it means doing this:

If the byte at position is one of 0x09 (HT), 0x0A (LF), 0x0C (FF), 0x0D (CR), 0x20 (SP), or 0x2F (/) then advance position to the next byte and redo this step.

In other words, when getting HTML attributes, a properly implemented web browser will treat a / the same as a whitespace character, so I think it's very important that this project's code does the same thing.
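To make the rule concrete, here is a minimal sketch in Python of an attribute scanner that skips a / between attributes exactly as it skips whitespace. It is not js-xss code and not a full HTML tokenizer; the function name parse_attributes and its input convention (the text between the tag name and the closing >) are assumptions made for illustration. It also assumes, as browsers do, that an unquoted value runs until the next whitespace character, which would explain the single "width" attribute in the JSON posted earlier in this thread.

```python
# Characters skipped between attributes: HTML whitespace plus the slash.
SEPARATORS = "\t\n\f\r /"
WHITESPACE = "\t\n\f\r "


def parse_attributes(tag_body: str) -> dict:
    """Hypothetical helper: parse the text between a tag name and '>'.

    Simplified sketch; it ignores character references and NULL handling.
    """
    attrs = {}
    i, n = 0, len(tag_body)
    while i < n:
        # Skip whitespace and slashes before the attribute name.
        while i < n and tag_body[i] in SEPARATORS:
            i += 1
        if i >= n:
            break

        # Read the attribute name (always at least one character).
        start = i
        i += 1
        while i < n and tag_body[i] not in SEPARATORS and tag_body[i] != "=":
            i += 1
        name = tag_body[start:i].lower()

        # Optional whitespace, then an optional "=" and value.
        while i < n and tag_body[i] in WHITESPACE:
            i += 1
        value = ""
        if i < n and tag_body[i] == "=":
            i += 1
            while i < n and tag_body[i] in WHITESPACE:
                i += 1
            if i < n and tag_body[i] in "\"'":
                # Quoted value: runs to the matching quote.
                quote = tag_body[i]
                i += 1
                start = i
                while i < n and tag_body[i] != quote:
                    i += 1
                value = tag_body[start:i]
                i += 1  # step past the closing quote
            else:
                # Unquoted value: runs to the next whitespace, slashes included.
                start = i
                while i < n and tag_body[i] not in WHITESPACE:
                    i += 1
                value = tag_body[start:i]

        # The first occurrence of a duplicated attribute name wins.
        attrs.setdefault(name, value)
    return attrs


quoted = parse_attributes('/width="100"/height="200"/src="#"/')
unquoted = parse_attributes('/width=100/height=200/src="#"/')
```

With quoted values the sketch yields three separate attributes (width, height, src). With unquoted values the slash is swallowed into the value of width, so only one attribute comes out.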