microcosm-cc / bluemonday

bluemonday: a fast golang HTML sanitizer (inspired by the OWASP Java HTML Sanitizer) to scrub user generated content of XSS
https://github.com/microcosm-cc/bluemonday
BSD 3-Clause "New" or "Revised" License

Translates string characters to html code #154

Closed: GuillemXanxo closed this issue 1 year ago

GuillemXanxo commented 1 year ago

Hi! I am using this package to sanitize a form input. I am creating a policy as:

policy := bluemonday.StrictPolicy()

and then I send stringA, which is an input from the form, to be sanitized as:

stringA = policy.Sanitize(stringA)

It happens as follows:

stringA := "I'm A. Don't want to be B"
stringA = policy.Sanitize(stringA)
fmt.Println(stringA) //--> I&#39;m A. Don&#39;t want to be B

Why is each ' transformed into its HTML entity code (&#39;)?

Thank you!

buro9 commented 1 year ago

This is an HTML sanitizer and we rely on the HTML processing provided by golang.org/x/net. It is that library that translates characters such as ' into HTML entities.
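To illustrate the kind of translation described here, the following is a minimal sketch that uses only the standard library `html` package as a stand-in. It does not import bluemonday or golang.org/x/net, and `escapeText` is a hypothetical helper name introduced for this example; the assumption is that the apostrophe is escaped the same way in the sanitizer's output.

```go
package main

import (
	"fmt"
	"html"
)

// escapeText is a hypothetical helper (not part of bluemonday).
// It uses the standard library escaper, which rewrites <, >, &, ' and "
// as HTML entities.
func escapeText(s string) string {
	return html.EscapeString(s)
}

func main() {
	in := "I'm A. Don't want to be B"
	// Each apostrophe is replaced by the numeric entity &#39;.
	fmt.Println(escapeText(in)) // I&#39;m A. Don&#39;t want to be B
}
```

A browser renders `&#39;` as a plain apostrophe, so the escaped form only looks wrong when the string is printed or stored as raw text.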

It is possible that we could decode such entities, but you should know that doing so re-introduces risk. It should only be done with consideration and an understanding of your use of bluemonday (which policy and features you're using) and of where the input originates. For this reason, although it is possible, we do not include that capability within bluemonday: too many people would use it without applying the correct consideration.
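The risk mentioned above can be sketched with the standard library alone. This example assumes a sanitizer has passed through text that was already entity-escaped (a harmless string when rendered); `decodeEntities` is a hypothetical helper name, and bluemonday itself is not imported here.

```go
package main

import (
	"fmt"
	"html"
	"strings"
)

// decodeEntities is a hypothetical helper (not part of bluemonday).
// It turns HTML entities back into the characters they stand for.
func decodeEntities(s string) string {
	return html.UnescapeString(s)
}

func main() {
	// Assumed sanitizer output: inert text that a browser would display
	// literally rather than execute.
	safe := "&lt;script&gt;alert(1)&lt;/script&gt;"

	decoded := decodeEntities(safe)
	fmt.Println(decoded) // <script>alert(1)</script>

	// After decoding, the string contains real markup again.
	fmt.Println(strings.Contains(decoded, "<script>")) // true
}
```

If the decoded string is later written into an HTML page without being escaped again, it is live markup, which is why decoding is only reasonable when the result is used in a non-HTML context or re-escaped at output time.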

If you do wish to do that (and truly it's only applicable to the StrictPolicy()) then look up https://pkg.go.dev/html#example-UnescapeString

GuillemXanxo commented 1 year ago

Hi @buro9. As you proposed, I now unescape the text after it has been sanitized:

stringA := "<p>I'm A. Don't want to be B</p>"
stringA = policy.Sanitize(stringA)
stringB := html.UnescapeString(stringA)
fmt.Println(stringA) //--> I&#39;m A. Don&#39;t want to be B
fmt.Println(stringB) // --> I'm A. Don't want to be B

However, I created a new project to check sanitization from scratch, and it was working fine yesterday. Then I realized that golang.org/x/net received a patch on October 14th, and I guess that was the real cause.

Thank you for such a quick answer. Great Job!