OWASP / java-html-sanitizer

Takes third-party HTML and produces HTML that is safe to embed in your web application. Fast and easy to configure.

possible xss attack in StandardUrlAttributePolicy #213

Open saaspeter opened 3 years ago

saaspeter commented 3 years ago

If the user input string is:

<body>aaa bbb<a href=\"jav&#97script:alert(1)\">test1</a></body>

and the policy is defined like this:

PolicyFactory LINKS_RAW = new HtmlPolicyBuilder().allowElements("a").allowStandardUrlProtocols().allowAttributes("href", "target").onElements("a").toFactory();

When the href attribute of the "a" tag is checked, StandardUrlAttributePolicy is used to validate the href value: its method apply(String elementName, String attributeName, String value) is called. Because there is a '#' in the value, the protocol check is skipped and the input href value is returned unchanged. So even a value whose protocol is not one of http, https, or mailto is not filtered out.


Is this a bug?
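The check described above can be sketched roughly as follows. This is a simplified, standalone illustration of a scheme allow-list scan, not the library's actual source; the class name ProtocolScanSketch and the method check are hypothetical. It shows why a '#' that appears before any ':' makes the value look like a relative URL with a fragment, so no scheme is compared against the allow-list.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch of a scheme allow-list scan; not the library's code. */
final class ProtocolScanSketch {
  private static final Set<String> ALLOWED =
      new HashSet<>(Arrays.asList("http", "https", "mailto"));

  /** Returns the value if it has no scheme or an allowed one, otherwise null. */
  static String check(String value) {
    for (int i = 0; i < value.length(); i++) {
      char c = value.charAt(i);
      if (c == '/' || c == '#' || c == '?') {
        // A path, fragment, or query delimiter before any ':' means there is
        // no scheme, so the value is treated as a relative URL and accepted.
        return value;
      }
      if (c == ':') {
        // Everything before the first ':' is the scheme; compare it
        // case-insensitively against the allow-list.
        String scheme = value.substring(0, i).toLowerCase(Locale.ROOT);
        return ALLOWED.contains(scheme) ? value : null;
      }
    }
    return value; // No ':' at all, so this is also a relative URL.
  }

  public static void main(String[] args) {
    System.out.println(check("javascript:alert(1)"));
    System.out.println(check("jav&#97script:alert(1)"));
    System.out.println(check("https://example.com/"));
  }
}
```

In this sketch the first call is rejected (null) because "javascript" is not allow-listed, while the second is accepted unchanged because the scan stops at '#' before it ever reaches the ':'.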

yangbongsoo commented 3 years ago

@saaspeter Hello. I am a user of the sanitizer. I tested what you mentioned.

  @Test
  public void testIssue213() {
    // Same policy as in the report: <a> elements with href/target and standard URL protocols.
    PolicyFactory policyFactory = new HtmlPolicyBuilder()
            .allowElements("a")
            .allowStandardUrlProtocols()
            .allowAttributes("href", "target")
            .onElements("a")
            .toFactory();
    String sanitize = policyFactory.sanitize("<a href=\"jav&#97script:alert(1)\">test1");
    System.out.println(sanitize); // <a href="jav&amp;#97script:alert%281%29">test1</a>
  }

The result string is <a href="jav&amp;#97script:alert%281%29">test1</a>. Even though the href attribute value remains, that string cannot be used for an attack, because the sanitizer transforms & into &amp;.
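One way to see why the sanitized output is harmless is to look at how the resulting href parses as a URL. The sketch below assumes that java.net.URI is a reasonable stand-in for a browser's URL parser, which is only an approximation; the class name SanitizedHrefCheck and its helpers are hypothetical. After an HTML parser turns &amp; back into &, the href is the literal text jav&#97script:alert%281%29, where '#' comes before ':'.

```java
import java.net.URI;

/** Hypothetical sketch: inspects the sanitized href with java.net.URI. */
final class SanitizedHrefCheck {
  /** Undoes the single &amp; escape, as an HTML parser would for an attribute value. */
  static String decodeAmp(String attributeText) {
    return attributeText.replace("&amp;", "&");
  }

  /** Returns the URI scheme, or null when the value is a relative reference. */
  static String schemeOf(String url) {
    return URI.create(url).getScheme();
  }

  public static void main(String[] args) {
    String href = decodeAmp("jav&amp;#97script:alert%281%29");
    URI uri = URI.create(href);
    // Print the components so the relative-URL structure is visible.
    System.out.println("scheme   = " + uri.getScheme());
    System.out.println("path     = " + uri.getPath());
    System.out.println("fragment = " + uri.getRawFragment());
  }
}
```

Under that assumption the value has no scheme at all: it is a relative reference whose fragment happens to contain the text "script:alert", so there is no javascript: URL to execute.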


mikesamuel commented 3 years ago

Thanks for the report. Please do try to report bypasses as described in the issue template. Please report security vulnerabilities via OWASP's vulnerability rewards program.

I believe @yangbongsoo is correct. &#97 is missing a ; so it is not an HTML character reference, but if it were, it would be decoded to 'a' before apply is called.

saaspeter commented 3 years ago

@yangbongsoo, you are right: in my case the original string does get sanitized. However, I keep the original string when no attack is detected in it, so when StandardUrlProtocols failed to detect the problem, I used the original string :(. What's more, I don't want my string to be altered if there is no risk in the input. Anyway, I think that if the href is not a URL with a standard protocol, the StandardUrlProtocols validation should fail. For my case, I wrote a new, stricter policy to replace StandardUrlProtocols.
Thanks.
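The replacement policy itself was not posted, so the following is only a guess at what a stricter check might look like. It is a standalone sketch, and the class name StrictSchemeCheck is hypothetical. Unlike the default behaviour discussed above, it rejects every value that does not start with an explicit allow-listed scheme, which also rejects relative URLs.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stricter check: an explicit, allow-listed scheme is required. */
final class StrictSchemeCheck {
  private static final Set<String> ALLOWED =
      new HashSet<>(Arrays.asList("http", "https", "mailto"));

  // URI scheme syntax: a letter followed by letters, digits, '+', '.', or '-'.
  private static final Pattern SCHEME =
      Pattern.compile("^([A-Za-z][A-Za-z0-9+.-]*):");

  /** Returns the trimmed value if it starts with an allowed scheme, otherwise null. */
  static String apply(String value) {
    String trimmed = value.trim();
    Matcher m = SCHEME.matcher(trimmed);
    if (!m.find()) {
      return null; // No syntactically valid scheme (includes relative URLs).
    }
    String scheme = m.group(1).toLowerCase(Locale.ROOT);
    return ALLOWED.contains(scheme) ? trimmed : null;
  }

  public static void main(String[] args) {
    System.out.println(apply("jav&#97script:alert(1)"));
    System.out.println(apply("https://example.com/"));
  }
}
```

With the library on the classpath, the same logic could presumably be wrapped in an org.owasp.html.AttributePolicy and attached with allowAttributes("href").matching(policy).onElements("a"). Note that, as the replies above point out, the safe value is the sanitizer's output, not the original input string.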