Consider not using dangerouslySetInnerHTML

sebranly commented 4 years ago

Thank you for this package, I came across it following this issue: https://github.com/google/react-schemaorg/issues/9, as I'm still learning about dangerouslySetInnerHTML and XSS.

I noticed that this package uses dangerouslySetInnerHTML:

      <script
        type="application/ld+json"
        dangerouslySetInnerHTML={{
          __html: JSON.stringify(this.props.item, safeJsonLdReplacer, this.props.space)
        }}
      />

In order to modify head on the website I'm working on, I use https://github.com/staylor/react-helmet-async which is an improved fork of https://github.com/nfl/react-helmet, for React 16+.

I noticed https://github.com/nfl/react-helmet readme has an example for JSON-LD which doesn't use dangerouslySetInnerHTML:

<Helmet>
    {/* inline script elements */}
    <script type="application/ld+json">{`
        {
            "@context": "http://schema.org"
        }
    `}</script>
</Helmet>

(please note that I added the first and last lines to make the example shorter)

I suppose in the case of https://github.com/google/react-schemaorg, such a syntax (not using dangerouslySetInnerHTML) wasn't followed because it has constraints:

it requires to call renderStatic on the server for https://github.com/nfl/react-helmet, or
it requires to provide HelmetProvider for https://github.com/staylor/react-helmet-async

I suppose https://github.com/google/react-schemaorg is aiming for simplicity and not using any Helmet package for this reason.

If within a project we have the choice then what would you recommend please?

Option a)

      <script
        type="application/ld+json"
        dangerouslySetInnerHTML={{
          __html: JSON.stringify(this.props.item, safeJsonLdReplacer, this.props.space)
        }}

Option b)

    <Helmet>
      <script type="application/ld+json">{JSON.stringify(this.props.item, safeJsonLdReplacer, this.props.space)}</script>
    </Helmet>

Thank you in advance for your time!

sebranly commented 4 years ago

Please note that I got some inspiration from this other issue: https://github.com/nfl/react-helmet/issues/333#issuecomment-616871673

Eyas commented 4 years ago

Yeah this package isn't particularly compatible with Helmet.

We need dangerouslySetInnerHtml for our use case because react by default parses the contents of <script> as text, meaning that characters like { and } will be escaped.

Helmet goes around that, which is why this works for them, but not for vanilla react.

It seems like part of your ask is to export safeJsonLdParser so that someone using Helmet can use it?

Eyas commented 4 years ago

For helmet, I often found that using its attributes section is easier, e.g.:

<Helmet
  htmlAttributes={{ lang: "en" }}  //Example,
  title="..."
  script={[
    {
      type: "application/ld+json",
      innerHTML: JSON.stringify(graph)
    },
   // ...
  ]}
/>

Only thing is that you'd need to use a safe JsonLD replacer in that JSON.stringify as well.

It might make sense to have a flavor of react-schemaorg that returns a JSX.IntrinsicElements["script"] here.

sebranly commented 4 years ago

Thank you for the quick answers! I was writing this issue while trying to understand XSS vulnerabilities more in-depth in the context of LD-JSON script tags. I think part of my ask would become exporting safeJsonLdParser yes, unless this package you're maintaining offers a prop to customize either using the current solution (dangerouslySetInnerHTML) or the Helmet solution. But maybe this package has to stay vanilla react because of size of dependencies or other factors?

I'm also curious whether there was a reason for using safeJonLdParser rather than using a package such as https://github.com/cure53/DOMPurify which exposes a sanitize function please.

In your latest example, are you suggesting using a replacer as well because innerHTML is dangerous such as dangerouslySetInnerHTML? (https://zhenyong.github.io/react/tips/dangerously-set-inner-html.html)

Eyas commented 4 years ago

Re: DOM Sanitize (and I think it also implicitly answers some of the other questions): content inside of these script tags is JSON-LD which by spec has its own definition of valid characters and what sanitization looks like. You can see this in the text of the commit message that implemented the XSS fix.

We're not really escaping HTML entities or sanitizing actually HTML/DOM content per se.

But yes, if there's another library just to do this, we can use it. That said, the spec is small enough that it's fairly reasonable to implement inline.

Eyas commented 4 years ago

Check out the readme, out of 1.1.0 there's now native support for Helmet via the script={[...]} props.

google / react-schemaorg

Consider not using dangerouslySetInnerHTML #16