microcosm-cc / bluemonday

bluemonday: a fast golang HTML sanitizer (inspired by the OWASP Java HTML Sanitizer) to scrub user generated content of XSS
https://github.com/microcosm-cc/bluemonday
BSD 3-Clause "New" or "Revised" License

Test case not sanitising #157

Open aaronpcz opened 1 year ago

aaronpcz commented 1 year ago

I've found a test case that does not sanitize correctly. I've done a preliminary investigation to see if I could contribute a fix, but it doesn't seem like a simple case.

The golang html package is providing the html.Attribute as Key="src", Val=`onmouseover="alert('xxs')"`.

{
  in:       `<IMG SRC= onmouseover="alert('xxs')">`,
  expected: ``,
},

Here is the output:

        input   : <IMG SRC= onmouseover="alert('xxs')">
        output  : <img src="onmouseover=%22alert%28%27xxs%27%29%22">
        expected: 

Happy to try to contribute a fix, but I'm a bit short of ideas. I contemplated re-parsing attribute values to identify any nested attributes produced by this type of input, but I'm not sure how I'd go about re-parsing just attributes; it doesn't seem to be something the html package supports?
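As a rough illustration of the "look for nested attributes in the value" idea, here is a minimal sketch that uses only the standard library. It does not re-parse anything with the html package; it just checks whether a value starts like `name=...`. The helper name looksLikeNestedAttribute and the regular expression are assumptions made up for this example and are not part of bluemonday.

```go
package main

import (
	"fmt"
	"regexp"
)

// nestedAttr is a hypothetical heuristic: the value begins with something
// shaped like an attribute name, followed by "=" and a non-space character.
var nestedAttr = regexp.MustCompile(`^\s*[A-Za-z_:][-A-Za-z0-9_:.]*\s*=\s*\S`)

// looksLikeNestedAttribute reports whether val resembles `name=value`,
// which is what the tokenizer hands back for `<IMG SRC= onmouseover="...">`.
func looksLikeNestedAttribute(val string) bool {
	return nestedAttr.MatchString(val)
}

func main() {
	for _, v := range []string{
		`onmouseover="alert('xxs')"`, // value from the failing test case
		`/images/logo.png`,
		`logo.png?size=2`,
	} {
		fmt.Printf("%q -> %v\n", v, looksLikeNestedAttribute(v))
	}
}
```

A check like this would also flag a legitimate relative URL such as `page=1`, so it could only ever be a starting point for discussion, not a drop-in fix.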

aaronpcz commented 1 year ago

I've done a bit more digging and just found the func (p *Policy) validURL(rawurl string) (string, bool) method. In there, the value is being treated as a relative URL.
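To show why the value passes as a relative URL, here is a small sketch using only net/url. It is not bluemonday's validURL; asRelativeURL is a hypothetical helper that parses the value, rejects anything with a scheme, and returns the re-serialised string. The result matches the escaped src value in the output above.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// asRelativeURL is a hypothetical stand-in for the relative-URL branch:
// it accepts any parseable value that has no scheme and returns it in
// net/url's escaped form.
func asRelativeURL(raw string) (string, bool) {
	u, err := url.Parse(strings.TrimSpace(raw))
	if err != nil || u.IsAbs() {
		return "", false
	}
	s := u.String()
	return s, s != ""
}

func main() {
	out, ok := asRelativeURL(`onmouseover="alert('xxs')"`)
	fmt.Println(out, ok)
	// prints: onmouseover=%22alert%28%27xxs%27%29%22 true
}
```

Because the string has no scheme and no colon in its first path segment, net/url parses it as a plain path. It then percent-encodes the quotes and parentheses but leaves the `=` alone.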