microcosm-cc / bluemonday

bluemonday: a fast golang HTML sanitizer (inspired by the OWASP Java HTML Sanitizer) to scrub user generated content of XSS
https://github.com/microcosm-cc/bluemonday
BSD 3-Clause "New" or "Revised" License

How to add a rule to allow some tags based on `src` #120

Closed by inliquid 3 years ago

inliquid commented 3 years ago

In particular, I'd like to allow <iframe> only when its src contains a URL on the youtube.com or vimeo.com domains. Is this possible somehow? Maybe based on a regexp, if not on the domain itself?

buro9 commented 3 years ago

Today it is only possible by compromising on other things. Essentially you can allow a src attribute on an iframe and it will just work... and you can even provide a match regexp. However, it won't be fully validated / protected like URLs elsewhere, as this library is incomplete in its handling of iframes... there are a few hard-coded parts that know where URLs are present, i.e. a.href and script.src, but it turns out I didn't complete that and add iframe.src.

I can (and will) add iframe.src, but then another problem emerges... you probably only want to limit the URLs on iframes whilst allowing any a.href link. But the mechanism to do that is presently global, so I'd have to change that too and that's a far more significant piece of work.

I thought for a moment that I could use p.AllowURLSchemeWithCustomPolicy() https://github.com/microcosm-cc/bluemonday/blob/v1.0.10/policy.go#L654 and I could... except for two things: this too is global (and would affect all https links), and it's still impacted by bluemonday not knowing about iframes.

The fundamental ask though: validate URLs as URLs and also based on a custom expression... this is a good ask. I'll think about how, and if anyone reading wants to submit a PR for it I'll definitely give it serious attention.
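To make the ask concrete, here is a minimal standard-library sketch of "validate as a URL first, then apply a custom expression". It is not bluemonday's implementation. The names allowedEmbedHosts, embedPath and isAllowedEmbedURL, and the path pattern itself, are assumptions made up for illustration.

```go
package main

import (
	"net/url"
	"regexp"
	"strings"
)

// allowedEmbedHosts is a hypothetical allowlist of registrable domains.
var allowedEmbedHosts = []string{"youtube.com", "vimeo.com"}

// embedPath is a hypothetical custom expression applied to the URL path
// only after the URL has parsed and the host has been checked.
var embedPath = regexp.MustCompile(`^/(embed|video)/[A-Za-z0-9_-]+$`)

// isAllowedEmbedURL reports whether raw is an https URL whose host is one of
// the allowed domains (or a subdomain of one) and whose path matches embedPath.
func isAllowedEmbedURL(raw string) bool {
	u, err := url.Parse(strings.TrimSpace(raw))
	if err != nil || u.Scheme != "https" || u.User != nil {
		return false
	}
	host := strings.ToLower(u.Hostname())
	for _, domain := range allowedEmbedHosts {
		// Compare against the parsed host, never against the raw string,
		// so "youtube.com" appearing in a query or a longer host is ignored.
		if host == domain || strings.HasSuffix(host, "."+domain) {
			return embedPath.MatchString(u.Path)
		}
	}
	return false
}
```

Parsing before matching is the point of the design: a bare regexp over the raw attribute value can be satisfied by a URL such as https://evil.example/?u=https://www.youtube.com/embed/abc, whereas the parsed host here is evil.example and is rejected.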

buro9 commented 3 years ago

Resolved by https://github.com/microcosm-cc/bluemonday/commit/cb614698bfbcbdca4b8e177e17b9a9aaf65109d7

Please look at the tests to see how to do this... I would recommend the AllowAttrs().Matching().OnElements() chain, as this is element-specific pattern matching that retains URL checking.

It is also possible to use custom URL scheme policies, but these are global and would apply to all links on any applicable element.