microcosm-cc / bluemonday

bluemonday: a fast golang HTML sanitizer (inspired by the OWASP Java HTML Sanitizer) to scrub user generated content of XSS
https://github.com/microcosm-cc/bluemonday
BSD 3-Clause "New" or "Revised" License

URL with ASCII characters encoded as hex #64

Open batidiane opened 6 years ago

batidiane commented 6 years ago

Hi,

thank you for this awesome package. It has helped me a lot on my project. There is just one issue that I am trying to solve. When sanitizing the following string: http://my-server.com/index.php?name=<script>window.onload = function() {var link=document.getElementsByTagName("a");link[0].href="http://not-real-xssattackexamples.com/";}</script>

it results in: http://my-server.com/index.php?name=

which is fine for me.

But, when the evil part of the URL is encoded in hex like: http://your-server/index.php?name=%3c%73%63%72%69%70%74%3e%77%69%6e%64%6f%77%2e%6f%6e%6c%6f%61%64%20%3d%20%66%75%6e%63%74%69%6f%6e%28%29%20%7b%76%61%72%20%6c%69%6e%6b%3d%64%6f%63%75%6d%65%6e%74%2e%67%65%74%45%6c%65%6d%65%6e%74%73%42%79%54%61%67%4e%61%6d%65%28%22%61%22%29%3b%6c%69%6e%6b%5b%30%5d%2e%68%72%65%66%3d%22%68%74%74%70%3a%2f%2f%61%74%74%61%63%6b%65%72%2d%73%69%74%65%2e%63%6f%6d%2f%22%3b%7d%3c%2f%73%63%72%69%70%74%3e

the sanitizer didn't work.

Am I missing something in how I use the sanitizer? Is there a way to detect such a situation and sanitize the string?

Thank you.
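
On the detection question, one possible approach is to percent-decode the query values before inspecting them. The following is only a minimal sketch using the Go standard library (`net/url`, `strings`); `hasEncodedMarkup` is a hypothetical helper invented for illustration and is not part of bluemonday. It assumes that any `<` or `>` in a decoded query value is worth flagging, which may be too strict for some applications.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// hasEncodedMarkup is a hypothetical helper (not a bluemonday API).
// It reports whether any query value of rawURL contains '<' or '>'
// after percent-decoding has been applied.
func hasEncodedMarkup(rawURL string) (bool, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false, err
	}
	// ParseQuery percent-decodes both keys and values.
	values, err := url.ParseQuery(u.RawQuery)
	if err != nil {
		return false, err
	}
	for _, vs := range values {
		for _, v := range vs {
			if strings.ContainsAny(v, "<>") {
				return true, nil
			}
		}
	}
	return false, nil
}

func main() {
	// Shortened form of the encoded URL from this report.
	encoded := "http://your-server/index.php?name=%3c%73%63%72%69%70%74%3e"
	found, err := hasEncodedMarkup(encoded)
	fmt.Println(found, err)
}
```

A caller could run the decoded value through the sanitizer when this returns true, or simply reject the request; which is appropriate depends on the application.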

grafana-dee commented 6 years ago

Are you able to demonstrate that the link shown executes as a script?

In some cases the input is preserved because it cannot be told apart from a regular URL or other legitimate content; in those cases, however, the input should always be escaped. I have just tried the above URL, and it is rendered harmlessly with the User Generated Content policy.
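
To illustrate the escaping point: the percent-encoded string contains no literal `<` or `>`, so it carries no HTML markup until something decodes it. If an application does decode the value and write it into a page, escaping at output time keeps it inert. The sketch below uses only the Go standard library (`net/url`, `html`); `renderName` is a hypothetical function, and the assumption that the application decodes the `name` parameter itself is not something stated in this thread.

```go
package main

import (
	"fmt"
	"html"
	"net/url"
)

// renderName is a hypothetical helper (not a bluemonday API).
// It percent-decodes a query value and HTML-escapes the result so that
// it can be placed in an HTML text context without being parsed as markup.
func renderName(encodedValue string) (string, error) {
	decoded, err := url.QueryUnescape(encodedValue)
	if err != nil {
		return "", err
	}
	return html.EscapeString(decoded), nil
}

func main() {
	out, err := renderName("%3cscript%3ealert(1)%3c%2fscript%3e")
	fmt.Println(out, err)
}
```

Here the decoded `<script>` tag comes out as `&lt;script&gt;…`, which a browser displays as text rather than executing.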

batidiane commented 6 years ago

Hi, thanks for the answer. I have not tested the rendering yet; I was only testing the sanitization to validate some assumptions. I will move forward and test the rendering, and will let you know.