microcosm-cc / bluemonday

bluemonday: a fast golang HTML sanitizer (inspired by the OWASP Java HTML Sanitizer) to scrub user generated content of XSS
https://github.com/microcosm-cc/bluemonday
BSD 3-Clause "New" or "Revised" License

Additive policies #125

Closed KN4CK3R closed 3 years ago

KN4CK3R commented 3 years ago

This PR enables additive policies with multiple policies per element and attribute.

Instead of p.AllowAttrs("class").Matching(regexp.MustCompile("red|green")).OnElements("span") you can now write:

p.AllowAttrs("class").Matching(regexp.MustCompile("red")).OnElements("span")
p.AllowAttrs("class").Matching(regexp.MustCompile("green")).OnElements("span")

This looks worse at first, but one use case is multiple renderers that each add their own rules to a shared policy. Think of an interface method AddRules(p *Policy), where Renderer1 wants to allow the class red and Renderer2 wants to allow the class green. Both call p.AllowAttrs("class").Matching(regexp.MustCompile("xxx")).OnElements("span"), but without this PR only the last renderer wins, because the later call replaces the earlier rule. With this PR the sanitizer allows an attribute if at least one of the rules allows it.
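The renderer scenario can be sketched with a toy model that uses only the standard library. This is not bluemonday's implementation: the Policy type, the Allow and Allowed methods, the classRenderer type, and BuildPolicy are hypothetical names invented for illustration. The sketch only shows the "allowed if any rule matches" behaviour described above.

```go
package main

import (
	"fmt"
	"regexp"
)

// Policy is a toy stand-in for a sanitizer policy. It is NOT bluemonday's
// Policy type; it only models several rules per element/attribute pair.
type Policy struct {
	rules map[string][]*regexp.Regexp // key: element + " " + attribute
}

// NewPolicy returns an empty policy that allows nothing.
func NewPolicy() *Policy {
	return &Policy{rules: make(map[string][]*regexp.Regexp)}
}

// Allow appends a rule instead of replacing an existing one.
func (p *Policy) Allow(element, attr string, re *regexp.Regexp) {
	key := element + " " + attr
	p.rules[key] = append(p.rules[key], re)
}

// Allowed reports whether at least one rule accepts the value.
func (p *Policy) Allowed(element, attr, value string) bool {
	for _, re := range p.rules[element+" "+attr] {
		if re.MatchString(value) {
			return true
		}
	}
	return false
}

// Renderer mirrors the AddRules(p *Policy) interface idea from the comment.
type Renderer interface {
	AddRules(p *Policy)
}

// classRenderer is a hypothetical renderer that needs one CSS class.
type classRenderer struct{ class string }

func (r classRenderer) AddRules(p *Policy) {
	p.Allow("span", "class", regexp.MustCompile("^"+regexp.QuoteMeta(r.class)+"$"))
}

// BuildPolicy lets every renderer contribute its rules to one shared policy.
func BuildPolicy(renderers ...Renderer) *Policy {
	p := NewPolicy()
	for _, r := range renderers {
		r.AddRules(p)
	}
	return p
}

func main() {
	p := BuildPolicy(classRenderer{"red"}, classRenderer{"green"})
	for _, class := range []string{"red", "green", "blue"} {
		fmt.Printf("%s allowed: %v\n", class, p.Allowed("span", "class", class))
	}
}
```

In this model both red and green are accepted and blue is rejected, regardless of the order in which the renderers register their rules.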

Or this error

buro9 commented 3 years ago

Great addition, thank you :pray: All looks good to me, will merge.

Masterlu1998 commented 2 years ago

I have run into a situation: I want to extend UGCPolicy so that only images whose URL matches a specific pattern are allowed.

p := UGCPolicy()
p.AllowAttrs("src").Matching(regexp.MustCompile(`htttt`)).OnElements("img")

rawHTML := "<img src=\"http://example.org/foo.gif\">"
fmt.Println(p.Sanitize(rawHTML))

It outputs '<img src="http://example.org/foo.gif">', but the output I want is ''.

UGCPolicy already allows all image elements, so whatever rule I add for images has no effect.

Is this expected behavior?

buro9 commented 2 years ago

Please open a new issue for new questions.

But the answer is yes, this is expected.

You've chosen to start the policy from the built-in one called UGCPolicy(), which you can find in policies.go. That policy calls two helpers from helpers.go: AllowStandardURLs(), which permits standard protocols like http, and AllowImages(), which permits those URLs in the src attribute of the img element.

All policies start out denying everything, and you then add the things you want to permit. You've started from a policy that already allows images with standard URLs, so your input is already allowed by that policy, and adding a further rule can only allow more, not less.

If you want to be more restrictive than UGCPolicy(), take a look at it, create a copy in your code base, remove the things you don't want to allow, and build a policy that matches what you want to achieve.