OWASP / java-html-sanitizer

Takes third-party HTML and produces HTML that is safe to embed in your web application. Fast and easy to configure.

Impossible to disallow text in elements #194

Open eugine opened 4 years ago

eugine commented 4 years ago

I'm trying to exclude a tag (template in the example below) along with all of its content.
I use the disallowElements and disallowTextIn methods. The tag itself is removed, but I can't find a way to exclude the text inside it.

Is this a bug, or is something wrong with the code below?

Here is the code that I use:

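import org.owasp.html.HtmlPolicyBuilder;

// assertThat and equalTo are presumably Hamcrest helpers; the report does not say.
// Policy: drop <template> and the text inside it, allow only <h1>.
var policy = new HtmlPolicyBuilder()
        .disallowElements("template")
        .disallowTextIn("template")
        .allowElements("h1")
        .toFactory();

var html = "<html><h1>allowed text</h1><template><p>excluded-text</p></template><script>script-text</script><html>";

String result = policy.sanitize(html);

// Fails: the text nested inside <template> survives sanitization.
assertThat(result, equalTo("<h1>allowed text</h1>"));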

Result:

Expected: <h1>allowed text</h1>
Actual:   <h1>allowed text</h1>excluded-text

The HTML that is used in the example:

<html>
  <h1>allowed text</h1>
  <template><p>excluded-text</p></template>
  <script>script-text</script>
<html>

Version:

    implementation 'com.googlecode.owasp-java-html-sanitizer:owasp-java-html-sanitizer:20191001.1'
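
To make the expected behavior concrete, here is a minimal sketch of "drop an element and everything nested inside it" using a depth counter. It is only an illustration of the semantics asked for above, not the sanitizer's implementation: the class SubtreeDropSketch, the Event type, and the open/close/text helpers are all hypothetical names, the input is assumed to be a balanced stream of events, and no escaping is done.

```java
import java.util.List;
import java.util.Set;

public class SubtreeDropSketch {

    /** Hypothetical event kinds; these are not the sanitizer's own types. */
    enum Kind { OPEN, CLOSE, TEXT }

    static final class Event {
        final Kind kind;
        final String value; // element name for OPEN/CLOSE, characters for TEXT
        Event(Kind kind, String value) { this.kind = kind; this.value = value; }
    }

    static Event open(String name)  { return new Event(Kind.OPEN, name); }
    static Event close(String name) { return new Event(Kind.CLOSE, name); }
    static Event text(String chars) { return new Event(Kind.TEXT, chars); }

    /**
     * Renders the events, discarding everything between the open tag of a
     * dropped element and its matching close tag. Assumes balanced events.
     */
    static String render(List<Event> events, Set<String> dropped) {
        StringBuilder out = new StringBuilder();
        int depth = 0; // greater than zero while inside a dropped element
        for (Event e : events) {
            if (e.kind == Kind.OPEN) {
                if (depth > 0 || dropped.contains(e.value)) {
                    depth++; // entering, or nesting deeper inside, a dropped subtree
                } else {
                    out.append('<').append(e.value).append('>');
                }
            } else if (e.kind == Kind.CLOSE) {
                if (depth > 0) {
                    depth--; // leaving one level of the dropped subtree
                } else {
                    out.append("</").append(e.value).append('>');
                }
            } else if (depth == 0) {
                out.append(e.value); // text is kept only outside dropped subtrees
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<Event> events = List.of(
            open("h1"), text("allowed text"), close("h1"),
            open("template"), open("p"), text("excluded-text"), close("p"), close("template"),
            open("script"), text("script-text"), close("script"));
        System.out.println(render(events, Set.of("template", "script")));
    }
}
```

The point of the counter is that text nested in a child of the dropped element (the p inside template here) is discarded too, which is the behavior the assertion in this report expects.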

jmanico commented 4 years ago

This is important. Templates were used to evade HTML Sanitization in DOMPurify. Here is some research on the topic.

https://securityaffairs.co/wordpress/83199/hacking/google-search-xss-flaw.html

-- Jim Manico @Manicode
