ohader closed this 2 years ago
@ohader thanks for this. Do you have any concerns with leaving the escaped <img> in place? It's something that I was struggling with beforehand.
Escaped text is safe. If it weren't, you'd have to remove or sanitize all text nodes.
Serializing CDATA sections as CDATA sections is not safe in HTML since it's sometimes parsed as a bogus comment.
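As a small illustration of the "escaped text is safe" point, here is a sketch in Python (not the language of the library under discussion; chosen only for brevity). It assumes the standard-library `html.escape` helper and reuses the `<img>` payload from this thread. Once `<` and `>` are entity-encoded, an HTML parser sees plain text rather than a tag.

```python
from html import escape

# Payload taken from the example discussed in this thread.
payload = "<img src onerror=alert(3)>"

# quote=False encodes only &, < and >, leaving quotes untouched.
escaped = escape(payload, quote=False)
print(escaped)
```

With `quote=False`, double and single quotes pass through unchanged, which is sufficient for text nodes; attribute values would need quotes encoded as well.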
@zcorpan Thanks for pointing that out. I just discovered your remarks in https://github.com/whatwg/html/issues/4016 from 2018 as well... 😳
So basically CDATA content would have to be sanitized from HTML tags (removing non-SVG tags), e.g. by
- parsing CDATA content separately as a fragment and removing non-SVG tags, or
- encoding < and > to &lt; and &gt; entities, keeping quotes untouched here

CDATA followed by an element not in the HTML namespace:
<defs>
<style type="text/css"><![CDATA[
circle { fill: gold; }
]]></style>
</defs>
- <![CDATA[ .. ]]> seems to be fine

CDATA followed by an element in the HTML namespace:
<p/><![CDATA[ ><img src onerror=alert(3)> just-harmless-text]]>
- <img> is a non-SVG element which would either have to be removed or encoded to entities
- <![CDATA[ .. ]]> section

CDATA sections are now converted to text nodes, containing encoded content only... and basically that's it.
This approach is much simpler than trying to infer context and parser state, just for the sake of keeping those <![CDATA[ literals.
This text node conversion now is more explicit.
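The CDATA-to-text conversion described above can be sketched as follows. This is an illustrative Python version using the standard-library `xml.dom.minidom`, not the actual change in this PR; the helper name `cdata_to_text` is hypothetical. It relies on minidom keeping CDATA sections as separate nodes when parsing and entity-encoding text nodes when serializing. minidom is not a hardened parser, so treat this purely as a demonstration.

```python
from xml.dom import minidom, Node


def cdata_to_text(doc, node):
    """Recursively replace every CDATA section node with a plain text node."""
    for child in list(node.childNodes):  # copy: we mutate while iterating
        if child.nodeType == Node.CDATA_SECTION_NODE:
            node.replaceChild(doc.createTextNode(child.data), child)
        else:
            cdata_to_text(doc, child)


# Input based on the problematic example from this thread.
svg = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    "<p/><![CDATA[ ><img src onerror=alert(3)> just-harmless-text]]>"
    "</svg>"
)

doc = minidom.parseString(svg)
cdata_to_text(doc, doc.documentElement)

# Text nodes are serialized with <, > and & encoded as entities.
out = doc.documentElement.toxml()
print(out)
```

After the conversion, the serialized output contains no `<![CDATA[` literal, and the former `<img ...>` payload appears only in entity-encoded form.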
Yeah, always replacing CDATA section nodes with text nodes is what I would suggest.
@darylldoyle Can you merge this PR please and tag a new 0.15.4 version? Thanks in advance!
@darylldoyle Awesome! Thanks a bunch! 👍
A recent change disallowed CDATA sections; however, the actual fix would have been to disallow non-SVG elements when the SVG is used inline in an HTML context.
Resolves: #70
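For illustration only, disallowing non-SVG elements could look roughly like the sketch below. It is written in Python with `xml.dom.minidom` and is not the library's implementation; the `ALLOWED` set is a deliberately tiny, hypothetical allowlist, and `drop_disallowed` is a made-up helper name. Filtering by element name (rather than by XML namespace) is used here because an element such as `<img>` would be treated as HTML once the SVG is inlined into an HTML document.

```python
from xml.dom import minidom, Node

# Hypothetical, intentionally incomplete allowlist of SVG element names.
ALLOWED = {"svg", "defs", "style", "circle", "g", "path"}


def drop_disallowed(node):
    """Remove every descendant element whose local name is not allowlisted."""
    for child in list(node.childNodes):  # copy: we mutate while iterating
        if child.nodeType != Node.ELEMENT_NODE:
            continue
        if child.localName not in ALLOWED:
            node.removeChild(child)
        else:
            drop_disallowed(child)


svg = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<circle r="1"/><p/><img src="x"/>'
    "</svg>"
)

doc = minidom.parseString(svg)
drop_disallowed(doc.documentElement)
cleaned = doc.documentElement.toxml()
print(cleaned)
```

A real sanitizer would need a complete element and attribute allowlist; this sketch only shows the removal step.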