darylldoyle / svg-sanitizer

A PHP SVG/XML Sanitizer
GNU General Public License v2.0
464 stars 68 forks source link

Requesting more details on GHSA-xrqq-wqh4-5hg2 (CVE-2023-28426) #88

Closed ohader closed 1 year ago

ohader commented 1 year ago

It seems that tag 0.16.0 did not actually fix a real security vulnerability, see corresponding advisory GHSA-xrqq-wqh4-5hg2 (CVE-2023-28426).

The provided new test SVG files in commit cce18bc237c05c6e093e9672db7926788da9b322 still seem to be fine (encoded correctly) even when being processed with the previous version 0.15.4 of this library.

Invoked as svg.svg in browser, mime-type image/svg+xml

<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" xml:space="preserve">
    <form action="javascript:alert('1')">
        <input type="submit"></input>
    </form>

        &lt;form action="javascript:alert('1')"&gt;
            &lt;input type="submit" onclick="javascript:alert('1')"/&gt;
        &lt;/form&gt;

</svg>

→ the <form> tag does not look nice, but is without any functionality inside an SVG context → the nested <form> tag in a CDATA section is correctly encoded (this is what the security advisory is referring to) → not a vulnerability

Invoked as svg.html in browser, mime-type text/htm

<html>
<body>
<div>
    <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" xml:space="preserve">
        <form action="javascript:alert('1')">
            <input type="submit"></input>
        </form>

            &lt;form action="javascript:alert('1')"&gt;
                &lt;input type="submit" onclick="javascript:alert('1')"/&gt;
            &lt;/form&gt;    

    </svg>
</div>
</body>
</html>

→ the <form> tag does not look nice, but is without any functionality inside an SVG context → the nested <form> tag in a CDATA section is correctly encoded (this is what the security advisory is referring to) → not a vulnerability

Conclusion & Post-review

Disclaimer & Call for Action

darylldoyle commented 1 year ago

@ohader I can see your point, the <form> issue may not be able to be executed but the background CSS issue does still stand.

If you run the following through 0.15.4, with the following settings, the JS will get executed.

<?xml version="1.0" standalone="yes"?>
<svg width="128px" height="128px" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1">
    <style type="text/css">
        <![CDATA[
            s {
                background : "<textarea></textarea><iframe/srcdoc=&lt;script/src=data:,try{parent.document.write(&#x27;\x3cb\x3eLocation:\x3c/b\x3e&#x27;+parent.location+&#x27;\x3cbr\x3e\x3cb\x3eUserAgent:\x3c/b\x3e&#x27;+navigator.userAgent+&#x27;\x3cbr\x3e\x3cb\x3eCookie:\x3c/b\x3e&#x27;+document.cookie+&#x27;\x3cbr\x3e\x3cb\x3eLocalStorage:\x3c/b\x3e&#x27;+JSON.stringify(localStorage)+&#x27;\x3cbr\x3e&#x27;)}catch(e){parent.document.write(e.message)}&gt;&lt;/script&gt;321></iframe>";
            }
        ]]>
    </style>
</svg>

Attempt to sanitize it with the following settings:

<?php
$sanitizer = new enshrined\svgSanitize\Sanitizer();
$sanitizer->removeRemoteReferences(true);
$sanitizer->minify(false);
?>
<html>
    <body>
        <?php echo $sanitizer->sanitize(file_get_contents('svg.svg')); ?>
    </body>
</html>

This is similar to what I was running previously on https://svg.enshrined.co.uk/ where the vulnerability was reported.

ohader commented 1 year ago

@darylldoyle Thanks for your feedback.

I'm referring to the green tests in PR #89 and https://github.com/darylldoyle/svg-sanitizer/pull/89/files#diff-a24025b3103aa5c0c40acacec37ad73bfa7621ff439a9984f7bc0e97ce78dfbf (tests/data/cdataTwoClean.svg) as shown in the screenshot (I cannot link the source diff view of a PR directly on GitHub):

Screenshot 2023-03-21 at 16 44 17

I observed the same behavior with our TYPO3 CI tests (which is still using v0.15.4 of the library): https://review.typo3.org/c/Packages/TYPO3.CMS/+/78193/2/typo3/sysext/core/Tests/Functional/Resource/Fixtures/CleanSVG/cdataTwoTest.svg

The process at https://svg.enshrined.co.uk/ is wrapping the SVG result in a <textarea> block, where the correctly encoded SVG contents are visualized differently, but are not executed at all - example:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Textarea</title>
</head>
<body>
<textarea rows="30" cols="80">

<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="128px" height="128px" version="1.1">
  <style type="text/css">
            s {
                background : "";
                background : "&lt;textarea&gt;&lt;/textarea&gt;&lt;iframe/srcdoc=&amp;lt;script/src=data:,try{parent.document.write(&amp;#x27;\x3cb\x3eLocation:\x3c/b\x3e&amp;#x27;+parent.location+&amp;#x27;\x3cbr\x3e\x3cb\x3eUserAgent:\x3c/b\x3e&amp;#x27;+navigator.userAgent+&amp;#x27;\x3cbr\x3e\x3cb\x3eCookie:\x3c/b\x3e&amp;#x27;+document.cookie+&amp;#x27;\x3cbr\x3e\x3cb\x3eLocalStorage:\x3c/b\x3e&amp;#x27;+JSON.stringify(localStorage)+&amp;#x27;\x3cbr\x3e&amp;#x27;)}catch(e){parent.document.write(e.message)}&amp;gt;&amp;lt;/script&amp;gt;321&gt;&lt;/iframe&gt;";
            }
        </style>
</svg>

</textarea>
</body>
</html>

Visualized like in the screenshot below, but that's just the text-content representation - no IFRAME instance is generated in the DOM (see the document.querySelector('iframe') which resolves to null):

Screenshot 2023-03-21 at 17 07 03
darylldoyle commented 1 year ago

@ohader yep, I see what you mean. The issue with https://svg.enshrined.co.uk/ was that the <textarea></textarea> at the start of the background definition was breaking out of the TextArea that the clean SVG contents were output into, and that was then executing the <iframe> script within the HTML context of the page.

This does look like it's a false positive, and actually, there would be no issue reverting the HTMLPurifier addition.

Is the goal here just to reject the CVE, or is there a more significant reason to revert the HTMLPurifier addition? The library is pretty small.

ohader commented 1 year ago

In case we can guarantee that CDATA sections are correctly encoded - and I think they are, due to the changes of PR #72 which also had been confirmed back then by the reporter of CVE-2022-23638 - then using ezyang/htmlpurifier seems to be superfluous to me.

This library also has some side effects when trying to generate caches in read-only file systems, e.g. Warning: Directory ~/typo3/main/vendor/ezyang/htmlpurifier/library/HTMLPurifier/DefinitionCache/Serializer not writable, please chmod to 777 in ~/typo3/main/vendor/ezyang/htmlpurifier/library/HTMLPurifier/DefinitionCache/Serializer.php on line 297.

Rejecting the CVE helps those that currently are "blocked" by e.g. roave/security-advisories since they are still on v0.15.4 of this library. Thus, marking the CVE as invalid should also revert advisories like https://packagist.org/packages/enshrined/svg-sanitize/advisories?version=5989941 or https://github.com/advisories/GHSA-xrqq-wqh4-5hg2 - which are used as source for GitHub's Dependabot and similar tools as well.

Thus, for me the desired approach really would be like implemented in PR #89:

However, I'm also curious to see a short statement from the reporter @Cyxow concerning these findings.

ohader commented 1 year ago

A web archive search for https://svg.enshrined.co.uk/ shows a version from November 2022. It seems the <textarea> containing the "Dirty SVG" was not properly encoded back then and has been adjusted recently on the website. Most probably that was the XSS that has been discovered - thus, correct, there was a XSS on the website(!) in the "Dirty SVG" - however the output of "Clean SVG" passed through the svg-sanitizer was fine.

Open https://web.archive.org/web/20221104235950/https://svg.enshrined.co.uk/ and check the HTML source of the "Dirty SVG" section, it was not encoded back then. That was the trigger... 😉

Cyxow commented 1 year ago

Hello, @ohader! I agree, it was my mistake, I really see that the vulnerability is really only on the site. Thank you for your attention to this report. Also, I also think that it would be right to refuse CVE.

ohader commented 1 year ago

Thanks for all of your feedback, I've requested an update to the GitHub advisory database and to reject the assigned CVE.

zcorpan commented 1 year ago

The test probably should be something like this:

<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" xml:space="preserve">
  <desc>
    <form xmlns="http://www.w3.org/1999/xhtml" action="javascript:alert('1')">
        <input type="submit"/>
    </form>
  </desc>
</svg>

This would have the form in the DOM that, if submitted, would execute the script in action. Since desc is not rendered, the user can't access the button. However, depending on what the page is doing where this is injected, you can still make the form be submitted by clobbering an id. Let's say the victim page is a discussion board where posting a new comment involves clicking a "Comment" button:

<form id="commentform"></form>
...
<button form="commentform" type="submit">Comment</button>

or

<form id="commentform">
...
<a href="#" onclick="commentform.submit()" class="button">Comment</a>
</form>

If the injected SVG is earlier in the DOM and the form also has id="commentform", the user clicking "Comment" will submit the injected form and execute the attacker's script.

For CDATA, that's not an issue in XML context but can still be an issue in HTML context since the parsing rules are different in different places in HTML, which makes it possible to bypass. For example:

<svg xmlns="http://www.w3.org/2000/svg">
  <desc>
    <style><![CDATA[</style><img src onerror="alert(1)">]]></style>
  </desc>
</svg>

desc is an HTML integration point, so <style> above is parsed as an HTML style element (where CDATA sections are not supported).

zcorpan commented 1 year ago

Oh for CDATA you previously replaced with a text node. That should be safe also in HTML, no need to sanitize the text node.

ohader commented 1 year ago

Oh for CDATA you previously replaced with a text node. That should be safe also in HTML, no need to sanitize the text node.

Yepp, I also think it's fine. It has been changed in PR #72 and you've been commenting there as well, back then... 😄