cure53 / DOMPurify

DOMPurify - a DOM-only, super-fast, uber-tolerant XSS sanitizer for HTML, MathML and SVG. DOMPurify works with a secure default, but offers a lot of configurability and hooks. Demo:
https://cure53.de/purify
Other
13.67k stars 698 forks source link

DOMPurify.sanitize() leaves executable code #986

Closed pdelancie closed 3 weeks ago

pdelancie commented 3 weeks ago

Bug: an can action to be passed via a sanitized string to a tag that is built on the fly using that string.

Background & Context

We use DOMPurify.sanitize() to protect an input field used for search. Once the input has been sanitized it is displayed in a lozenge beneath the search field. In a recent pen test it was discovered that despite the input string being sanitized it was possible to pass a script into the tag for this lozenge element.

Bug

Input

<<qript>vAr onmouseover=alert(“SEVN-XSS”)>xss

Given output

Screenshot 2024-08-19 at 1 14 31 PM

<vAr onmouseover=alert(“SEVN-XSS”)>xss In the above the "<" and ">" are actually html entities (less than/greater than) but GitHub is not rendering them here.

In the rendered page this "var" tag becomes part of the following div, whose innerHTML should only be the string "xss" : Screenshot 2024-08-19 at 12 58 57 PM

Hovering over the div results in an alert whose text is "SEVN-XSS"

Expected output

xss

Feature

N.A.

cure53 commented 3 weeks ago

Hmm, this does not seem to be an issue on our end to be frank. Not sure what we are expected to fix, it feels like the implementation our your end is flawed.

Can you share more details what happens to that string?

pdelancie commented 3 weeks ago

Hi. Thanks for your response. I'm confused, though, because I thought that the point of DomPurify is to remove tags and code from a string. That does not seem to be happening in this case.

To answer your question, the string is used for the text -- "xss" -- in the lozenge pictured here:

Screenshot 2024-08-21 at 8 32 08 AM

And since DomPurify did not remove the tag surrounding the string, which includes an "onmouseover" attribute, when you hover on the text in the lozenge the browser opens an alert (as called for in the code that was not stripped out), as shown here:

Screenshot 2024-08-21 at 8 32 26 AM

pdelancie commented 3 weeks ago

@cure53 Could you please explain why this issue was closed without a response to the information above provided at your request? Thanks

cure53 commented 3 weeks ago

What DOMPurify correctly produces is this:

&lt;vAr onmouseover=alert(“SEVN-XSS”)&gt;xss

The application must then use it in such way that the entities get decoded, which causes the XSS. As far as I can see, there is nothing DOMPurify does wrong - but rather the application utilizing the sanitized result in an unsafe manner.

pdelancie commented 3 weeks ago

No, it's not the application. It's the browser, Firefox, that is interpreting the purified output and rendering it as "<vAr onmouseover=alert(“SEVN-XSS”)>xss<var>", so when you hover on that element the onmouseover javascript is executed. Since all browsers will do this there is nothing unique to my application about this issue.

Gigabyte5671 commented 3 weeks ago

Hi @pdelancie

I'm also unable to reproduce the issue. DOMPurify is successfully sanitizing the specified input.

How have you configured DOMPurify?
How are you adding the sanitized string to your DOM?

pdelancie commented 3 weeks ago

To reproduce, the string entered in your input field would have to be exactly as shown in my original issue submission (a) in the Input field, and (b) in the searchText variable in the Watch Expressions part of the first screenshot: <<qript>vAr onmouseover=alert(“SEVN-XSS”)>xss

From this string, DomPurify is removing only <qript>. The purified string still contains a "var" opening tag -- defined with &lt; and &gt; -- that includes a mouseover attribute that has JS as it's value (&lt;vAr onmouseover=alert(“SEVN-XSS”)&gt;xss). The browser is then auto-completing this with a closing tag that makes the mouseover value executable.

This is the vulnerability that I am bringing up to you: a string that is not reconized by DOMPurify as including code can slip through because the browser will automatically turn it into something executable. Ideally after being purified the remaining string would instead be only "xss".

DOMPurify is in default config. The specific line including the purified string -- as the variable searchTerm -- into the DOM is as follows: let termDiv = '<div class="TermDiv" id="Term' + stringNum + '" onclick="Search_SetCurrent(\'' + stringNum + '\')">' + searchTerm + '</div>';

Overall, the purified string is included in element as follows (pseudocode): // create a new div for the tag // add the div into div StringTags // set tag attributes and values // put tag parts into tag <-- includes the termDiv div created in code above

cure53 commented 3 weeks ago

Either way, this in very likely not a DOMPurify bug :smile:

It seems that what you are doing in your code example is concatenating HTML snippets to a full HTML string. And you sanitize snippets of that HTML string. DOMPurify sanitizes HTML, not fragments of strings that might later end up in HTML.

What you want to do here is encode or escape, but not sanitize, me thinks.

pdelancie commented 3 weeks ago

BTW, in case it helps, the below is what I'm doing to address this: stripping out of the purified string anything that the browser could still recognize as an opening tag so that it won't auto-complete it and create a tag in the rendered DOM.

    // find and remove tags that are defined using characters
    while(rawString.indexOf('<') > -1 && rawString.indexOf('>') > -1 && rawString.indexOf('<') < rawString.indexOf('>')) {
        //remove the first tag in the string
        rawString = rawString.replace(rawString.slice(rawString.indexOf('<'), rawString.indexOf('>') + 1), '') ;
    } // loop
    // find and remove tags that are defined using html entities
    while(rawString.indexOf('&lt;') > -1 && rawString.indexOf('&gt;') > -1 && rawString.indexOf('&lt;') < rawString.indexOf('&gt;')) {
        //remove the first tag in the string
        rawString = rawString.replace(rawString.slice(rawString.indexOf('&lt;'), rawString.indexOf('&gt;') + 4), '') ;
    } // loop
    while(rawString.indexOf('&#60;') > -1 && rawString.indexOf('&#62;') > -1 && rawString.indexOf('&#60;') < rawString.indexOf('&#62;')) {
        //remove the first tag in the string
        rawString = rawString.replace(rawString.slice(rawString.indexOf('&#60;'), rawString.indexOf('&#62;') + 5), '') ;
    } // loop
    while(rawString.indexOf('&#x3c;') > -1 && rawString.indexOf('&#x3e;') > -1 && rawString.indexOf('&#x3c;') < rawString.indexOf('&#x3e;')) {
        //remove the first tag in the string
        rawString = rawString.replace(rawString.slice(rawString.indexOf('&#x3ec'), rawString.indexOf('&#x3e;') + 6), '') ;
    } // loop
pdelancie commented 3 weeks ago

Are you saying that DOMPurify is not intended to be used to protect against an input field being used to introduce executable code? I thought that was the point of it.

Also, isn't it always the case that a purified string undergoes some manipulation in order to present it in the page? That's all I'm doing, not changing the purified string at all, just wrapping it and inserting it. It's the browsers that are auto-completing it, turning it into a complete tag because it includes an opening tag that hasn't been removed by DOMPurify.

My point is that if the purified string below is being recognized by a browser as html then effectively DOMPurify is not removing all executable code from its output: &lt;vAr onmouseover=alert(“SEVN-XSS”)&gt;xss

cure53 commented 3 weeks ago

Are you saying that DOMPurify is not intended to be used to protect against an input field being used to introduce executable code?

Yep. We sanitize actual HTML, not string snippets that might arrive in HTML later on. Our job is to receive a string of potentially dirty HTML and then clean it from all things that might cause harm.

pdelancie commented 3 weeks ago

I guess we might have different understandings of what is "actual HTML." But -- assuming that the HTML tag represented by termDiv in the code below, in which searchTerm would be the entered string, would be considered as actual HTML -- I'll try purifying that, and see what happens. let termDiv = '<div class="TermDiv" id="Term' + stringNum + '" onclick="Search_SetCurrent(\'' + stringNum + '\')">' + searchTerm + '</div>';

cure53 commented 3 weeks ago

I think the most interesting question here is, how and why does the application transform the entities into their canonical form again, because that is what causes the XSS.

DOMPurify does what it is supposed to:

It finds a weird element, it removes it, what stays over is technically a safe string, with properly encoded special chars even so (the browser does that) and then, from that, a new attribute is being injected by the vulnerable application.