microcosm-cc / bluemonday

bluemonday: a fast golang HTML sanitizer (inspired by the OWASP Java HTML Sanitizer) to scrub user generated content of XSS
https://github.com/microcosm-cc/bluemonday
BSD 3-Clause "New" or "Revised" License

Prefer explicit rules over regexp #182

KN4CK3R closed 1 year ago

KN4CK3R commented 1 year ago

#175 introduced a potentially dangerous change. If a user registers the regexp .+ for scheme validation (as written in the comment) to allow all possible schemes, a link like <a href="javascript:..."> becomes valid too. Go's regexp package does not implement negative lookaheads, so you can't write "all but xyz" ((?!javascript|vbscript)).
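Both halves of that claim can be checked with the standard library alone. The following is a minimal sketch, not bluemonday code; the helper names compiles and matchesScheme are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether Go's regexp package accepts the pattern.
// (Hypothetical helper for this illustration.)
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

// matchesScheme reports whether the pattern matches the given URL scheme.
// (Hypothetical helper; it panics if the pattern does not compile.)
func matchesScheme(pattern, scheme string) bool {
	return regexp.MustCompile(pattern).MatchString(scheme)
}

func main() {
	// A catch-all pattern accepts dangerous schemes along with harmless ones.
	fmt.Println(matchesScheme(`.+`, "javascript"))
	fmt.Println(matchesScheme(`.+`, "https"))

	// Negative lookahead is not part of the RE2 syntax Go implements,
	// so this pattern is rejected at compile time.
	fmt.Println(compiles(`^(?!javascript|vbscript).+$`))
}
```

So a pattern-only API leaves no way to express "everything except javascript" in a single regexp.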

This PR moves the regexp check a little further down so that it is executed only if no explicit scheme registration was found. So now

p.AllowURLSchemesMatching(regexp.MustCompile(`.+`))
p.AllowURLSchemeWithCustomPolicy("javascript", func(*url.URL) bool {
    return false
})

will allow every scheme but javascript.
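The lookup order described here (explicit per-scheme rule first, regexp only as a fallback) can be modelled roughly as follows. This is a stdlib-only sketch under stated assumptions, not bluemonday's implementation: the schemePolicy type, its fields, and newExamplePolicy are hypothetical names, and relative URLs are simply rejected to keep the example short.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
	"strings"
)

// schemePolicy is a hypothetical stand-in for the sanitizer's scheme rules.
type schemePolicy struct {
	explicit map[string]func(*url.URL) bool // per-scheme rules, checked first
	patterns []*regexp.Regexp               // fallback patterns
}

// allows applies the explicit rule for the URL's scheme if one is registered
// and consults the regexp patterns only when no explicit rule exists.
func (p *schemePolicy) allows(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil || u.Scheme == "" {
		return false // assumption: this sketch rejects unparsable and relative URLs
	}
	scheme := strings.ToLower(u.Scheme)
	if rule, ok := p.explicit[scheme]; ok {
		return rule(u) // the explicit rule wins; patterns are not consulted
	}
	for _, re := range p.patterns {
		if re.MatchString(scheme) {
			return true
		}
	}
	return false
}

// newExamplePolicy mirrors the two registrations shown above:
// a catch-all pattern plus an explicit rule that rejects javascript.
func newExamplePolicy() *schemePolicy {
	return &schemePolicy{
		explicit: map[string]func(*url.URL) bool{
			"javascript": func(*url.URL) bool { return false },
		},
		patterns: []*regexp.Regexp{regexp.MustCompile(`.+`)},
	}
}

func main() {
	p := newExamplePolicy()
	for _, raw := range []string{
		"https://example.com/",
		"mailto:user@example.com",
		"javascript:alert(1)",
	} {
		fmt.Printf("%-26s allowed=%v\n", raw, p.allows(raw))
	}
}
```

Because the explicit map is consulted before the patterns, a deny rule for a single scheme overrides the catch-all without needing a negative lookahead.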

An alternative would be to drop AllowURLSchemesMatching again and add methods DisallowURLSchemes and DisallowURLSchemeWithCustomPolicy.

buro9 commented 1 year ago

Thank you, this is an excellent catch and a great addition.