cure53 / DOMPurify

DOMPurify - a DOM-only, super-fast, uber-tolerant XSS sanitizer for HTML, MathML and SVG. DOMPurify works with a secure default, but offers a lot of configurability and hooks. Demo:
https://cure53.de/purify
Other
13.87k stars 713 forks source link

Sanitize method hang the browser while removing the unwanted tags. #365

Closed schandra-rpx closed 5 years ago

schandra-rpx commented 5 years ago

This issue proposes a bug which...

Background & Context

I have a problem while cleaning large externally sourced HTML code into the web application into wyswig editor, I use the sanitize method to clear unwanted scripts and tags.

Bug

sanitize method hang the browser while removing the unwanted tag.

Input

example 1: var text = <table><tbody><tr><td >test</td></tr></tbody></table> DOMPurify.sanitize(text, {FORBID_TAGS: ['tbody']})

example 2: var text = <table><colgroup><col></col></colgroup><tbody><tr><td >test</td></tr></tbody></table>

DOMPurify.sanitize(text, { ALLOWED_TAGS: [ 'b', 'strong', 'i', 'italic', 'div', 'p', 'span', 'ul', 'li', 'ol', 'a', 'img', 'br', 'tr', 'td', 'th', 'table', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6'], ALLOW_DATA_ATTR: false, ALLOWED_ATTR: ['src', 'class', 'target', 'href'] });

https://jsfiddle.net/02pveyhq/

Given output

browser hangs.

Expected output

for example 1: expected HTML output: <table><tr><td >test</td></tr></table>

for example 2: expected HTML output: <table><tr><td >test</td></tr></table>

cure53 commented 5 years ago

Heya, thanks for reporting.

The problem here is that the browser will automatically add the TBODY. So by forbidding it, DOMPurify will first remove it and then the browser will add it again - and so on and so on.

The problem was discussed in #286 already.

What we cannot do: Force the browser to remove the TBODY and be fine with it. What we can do: Add some code that prevents the loop, but TBODY would still be there.

What do you think is best here?

schandra-rpx commented 5 years ago

Hi, thanks for the reply. infinite loop is the actual issue I'm concern about. It will be okay if the sanitize method responses back with legal tags, also please take a look at example 2 as well. in which I believe colgroup tag is also leading to an infinite loop

cure53 commented 5 years ago

Yup, both problems have the same root cause. I have a fix that addresses them both, it basically looks like this:

    /* Add tbody to ALLOWED_TAGS in case tables are permitted, see #286 */
    if (ALLOWED_TAGS.table) {
      addToSet(ALLOWED_TAGS, ['tbody']);
      delete FORBID_TAGS.tbody;
    }

    /* Add colgroup to ALLOWED_TAGS in case cols are permitted, see #286 */
    if (ALLOWED_TAGS.col) {
      addToSet(ALLOWED_TAGS, ['colgroup']);
      delete FORBID_TAGS.colgroup;
    }

If tables are allowed, TBODY must be allowed too - we had that code before. But, if TBODY is forbidden, we would end up in a loop - so we need to remove it from the forbidden tags.

Same for COL and COLGROUP. It eliminates the DoS but it adds behavior that is unexpected - forbid something but then it comes back.. reasonable?

schandra-rpx commented 5 years ago

I think it is something else, I tried with a fix which you have provided, for example 2, 'col' is not listed in allowed tags, so the second if-condition (ALLOWED_TAGS.col) is false, so it is leading to infinite loop again.

later I tried like below which actually worked.

      /* Add tbody to ALLOWED_TAGS in case tables are permitted, see #286 */
      if (ALLOWED_TAGS.table) {
        addToSet(ALLOWED_TAGS, ['tbody']);
        addToSet(ALLOWED_TAGS, ['colgroup']);
        addToSet(ALLOWED_TAGS, ['col']);
        delete FORBID_TAGS.tbody;
        delete FORBID_TAGS.colgroup;
        delete FORBID_TAGS.col;
      }

also i have a feeling, this issue might rise again for some other tags as well, I will be gratefull if you can suggest/provide the permanent solution for the infinite loop.

and, kindly provide a new release, after the fix is available

cure53 commented 5 years ago

I think I did fond a solution that prevents the DoS and isn't as agressive as the proposal you show above (we cannot override a user's config that heavily).

My test cases yield no DoS any longer, please give it a try.

I will be gratefull if you can suggest/provide the permanent solution for the infinite loop.

Sadly, that is not possible. There might always be some crazy config options that overlap with browser behavior that might lead to a loop.

cure53 commented 5 years ago

Closing as assumed fixed