OWASP / java-html-sanitizer

Takes third-party HTML and produces HTML that is safe to embed in your web application. Fast and easy to configure.

<span> elements get removed even when allowed by the policy #283

Open kocakosm opened 1 year ago

kocakosm commented 1 year ago

Hi,

<span> elements get removed by the sanitizer even when they are allowed by the policy.

For instance, I'd expect the following code:

Sanitizers.FORMATTING.sanitize("<span>Hi!</span>")

to return <span>Hi!</span> instead of Hi!.

The exact same behaviour can be observed with a custom policy:

new HtmlPolicyBuilder().allowElements("span").toFactory().sanitize("<span>Hi!</span>")

returns Hi! instead of <span>Hi!</span>.

Also, note that other HTML5 inline formatting elements (such as b, i, s, u, sup, sub, ins, del, strong, code, small and em) are not affected by this "bug".

Thanks for your help.

kocakosm commented 1 year ago

You can see this behaviour in this sample project.

csware commented 8 months ago

An empty span (that is, one without attributes) is dropped because span is part of DEFAULT_SKIP_IF_EMPTY.

You need to allow it using allowWithoutAttributes. cf. https://github.com/OWASP/java-html-sanitizer/blob/91c5fdc146a01aab1e8b0db38be449a960fe88c1/src/test/java/org/owasp/html/HtmlPolicyBuilderTest.java#L712-L723
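
A minimal sketch of that suggestion, assuming the java-html-sanitizer library is on the classpath. The class name SpanPolicyExample and the two field names are made up for illustration; allowElements, allowWithoutAttributes, toFactory and sanitize are the builder and factory calls already mentioned in this thread. The first policy reproduces the reported behaviour, the second adds the allowWithoutAttributes call:

```java
import org.owasp.html.HtmlPolicyBuilder;
import org.owasp.html.PolicyFactory;

// Hypothetical example class, not part of the library.
class SpanPolicyExample {

    // Allows <span>, but leaves the default skip-if-empty handling in place.
    public static final PolicyFactory DEFAULT_SPAN = new HtmlPolicyBuilder()
            .allowElements("span")
            .toFactory();

    // Same policy, but span may additionally appear without attributes.
    public static final PolicyFactory BARE_SPAN = new HtmlPolicyBuilder()
            .allowElements("span")
            .allowWithoutAttributes("span")
            .toFactory();

    public static void main(String[] args) {
        String input = "<span>Hi!</span>";
        // Behaviour reported in this issue: the wrapper is dropped.
        System.out.println(DEFAULT_SPAN.sanitize(input));
        // With allowWithoutAttributes the wrapper should be kept.
        System.out.println(BARE_SPAN.sanitize(input));
    }
}
```

Note that allowWithoutAttributes only lifts the attribute requirement; the element still has to be allowed with allowElements.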