darylldoyle / svg-sanitizer

A PHP SVG/XML Sanitizer
GNU General Public License v2.0
454 stars 67 forks source link

Requesting more details on GHSA-fqx8-v33p-4qcc (CVE-2022-23638) #71

Closed ohader closed 1 year ago

ohader commented 2 years ago

It seems tag 0.15.0 addressed a security vulnerability, see corresponding advisory https://github.com/darylldoyle/svg-sanitizer/security/advisories/GHSA-fqx8-v33p-4qcc (CVE-2022-23638)

Corresponding commit at https://github.com/darylldoyle/svg-sanitizer/commit/17e12ba9c2881caa6b167d0fbea555c11207fbb0 contains a new test case tests/data/htmlTest.svg.

Invoked as svg.svg in browser, mime-type image/svg+xml

<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns="http://www.w3.org/2000/svg" viewBox="-1 -1 2 2">
    <!--><img src onerror=alert(1)><!-->
    <?x ><img src onerror=alert(1)><?x?>
    <p/><![CDATA[ ><img src onerror=alert(1)> ]]>
    <font face=""/><![CDATA[ ><img src onerror=alert(1)> ]]>
</svg> 

→ no problem since <img> is not a SVG element -> not a vulnerability

Invoked as svg.html in browser, mime-type text/htm

<html>
<body>
<div>
    <svg xmlns="http://www.w3.org/2000/svg" viewBox="-1 -1 2 2">
        <!--><img src onerror=alert(1)><!-->
        <?x ><img src onerror=alert(1)><?x?>
        <p/><![CDATA[ ><img src onerror=alert(1)> ]]>
        <font face=""/><![CDATA[ ><img src onerror=alert(1)> ]]>
    </svg>
</div>
</body>
</html>

→ valid concern, since HTML is used in inline SVG → scripts are executed in browser → cross-site scripting vulnerability

Conclusion & Post-review

Request

darylldoyle commented 2 years ago

@ohader you're correct. The concern here was that any of the following tags will cause the HTML parser to break out of foreign content and back into HTML.

A start tag whose tag name is one of: "b", "big", "blockquote", "body", "br", "center", "code", "dd", "div", "dl", "dt", "em", "embed", "h1", "h2", "h3", "h4", "h5", "h6", "head", "hr", "i", "img", "li", "listing", "menu", "meta", "nobr", "ol", "p", "pre", "ruby", "s", "small", "span", "strong", "strike", "sub", "sup", "table", "tt", "u", "ul", "var" A start tag whose tag name is "font", if the token has any attributes named "color", "face", or "size" An end tag whose tag name is "br", "p"

As per: https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inforeign

Specifically this line was causing me issues:

<p/><![CDATA[ ><img src onerror=alert(1)> ]]>

DOMDocument was picking the img tag up as a DOMComment node with the following content, which led me to remove comments and CDATA so that this got stripped.

><img src onerror=alert(1)>

I'm happy to consider adding CDATA back in, but we need to be sure that there's no easy bypass. Maybe we should consider extending the tests in this area.