HubSpot / draft-convert

Extensibly serialize & deserialize Draft.js ContentState with HTML.
Apache License 2.0
483 stars 94 forks source link

Fix double escaping of entities #92

Open amannn opened 7 years ago

amannn commented 7 years ago

Fixes #47.

The double escaping occurs, because there's a call to encodeBlock at the beginning followed by a call to renderToStaticMarkup if a element is provided that has children.

So basically there are two options to fix this:

  1. Only escape text that is not passed to to a renderToStaticMarkup call.
  2. Unescape the resulting markup from renderToStaticMarkup.

To me the first option would have been a bit cleaner, but surfaced some issues for me:

So a proper way to go with the first option might have led to a bigger change to the code base which I wanted to avoid. Therefore I went for the second option, which seems to work pretty well. The unescape helper is pretty small (1.2K minified, non-gzipped) so I hope this is ok for you.

I encountered this bug recently, so I thought I'll provide a fix for it. Let me know what you think! 🙂

benbriggs commented 7 years ago

Thank you for the thoughtful PR and description! I think in general we're going to need to do something like this. My main concern is how this affects the the rest of the HTML output such as attributes. For example, in one of your tests if you added an attribute like this (either intentionally or via malicious user input):

const Link = ({ children }) => <a href="&quot;&gt;&lt;/a&gt;&lt;script&gt;alert(&quot;hi&quot;)&lt;/script&gt;&lt;a style=&quot;">{children}</a>;

this would get unescaped out to be regular HTML characters:

<a href=""></a><script>alert("hi")</script><a style="">Led Zeppelin - Since I&#x27;ve Been Loving You</a>

I'm not quite sure how to get around this outside of re-parsing the HTML to DOM elements and doing the unescaping on each text node. I imagine there'd be a performance hit with that strategy. Does this make sense?

amannn commented 7 years ago

Thanks for the quick reply!

Oh, you're right – that's indeed an issue. Hmm, I think your proposed solution would work, but as you mentioned this could decrease performance a bit and potentially also increase complexity for a rather edge case I guess.

Is there from your perspective a good way of only escaping text that doesn't get passed to a renderToStaticMarkup call (solution 1 from above)?