darylldoyle / svg-sanitizer

A PHP SVG/XML Sanitizer
GNU General Public License v2.0

Removing DOCTYPE breaks entities #30

Closed alex40724 closed 4 years ago

alex40724 commented 4 years ago

Hi,

not sure if I am doing anything wrong here. The sanitizer removes the DOCTYPE, which breaks the entities declared in it, e.g. in this Adobe export file. After sanitizing the file and opening it directly in a browser, it produces errors like "Entity 'ns_extend' not defined".

before

<?xml version="1.0" encoding="utf-8"?>
<!-- Generator: Adobe Illustrator 17.0.0, SVG Export Plug-In . SVG Version: 6.00 Build 0)  -->
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd" [
    <!ENTITY ns_extend "http://ns.adobe.com/Extensibility/1.0/">
    <!ENTITY ns_ai "http://ns.adobe.com/AdobeIllustrator/10.0/">
    <!ENTITY ns_graphs "http://ns.adobe.com/Graphs/1.0/">
    <!ENTITY ns_vars "http://ns.adobe.com/Variables/1.0/">
    <!ENTITY ns_imrep "http://ns.adobe.com/ImageReplacement/1.0/">
    <!ENTITY ns_sfw "http://ns.adobe.com/SaveForWeb/1.0/">
    <!ENTITY ns_custom "http://ns.adobe.com/GenericCustomNamespace/1.0/">
    <!ENTITY ns_adobe_xpath "http://ns.adobe.com/XPath/1.0/">
]>
<svg version="1.1" id="Layer_1" xmlns:x="&ns_extend;" xmlns:i="&ns_ai;" xmlns:graph="&ns_graphs;"
     xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" x="0px" y="0px" width="32px" height="32px"
     viewBox="0 0 32 32" enable-background="new 0 0 32 32" xml:space="preserve">
...
</svg>

after

<?xml version="1.0" encoding="utf-8"?>
<!-- Generator: Adobe Illustrator 17.0.0, SVG Export Plug-In . SVG Version: 6.00 Build 0)  -->
<svg version="1.1" id="Layer_1" xmlns:x="&ns_extend;" xmlns:i="&ns_ai;" xmlns:graph="&ns_graphs;"
     xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" x="0px" y="0px" width="32px" height="32px"
     viewBox="0 0 32 32" enable-background="new 0 0 32 32" xml:space="preserve">
...
</svg>

darylldoyle commented 4 years ago

Hi @alex40724,

I understand that this is an issue with entities, but unfortunately removing the doctype is the only surefire way to protect against a lot of XML attacks, including XML entity expansion attacks. I therefore have no resolution for this issue.

I'm sorry that's not much help to you, but it's the only way I can see to do this.
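
For readers unfamiliar with the entity expansion attacks mentioned above, the following Python sketch estimates how large a set of nested entity declarations would become if a parser expanded them. It is an illustration only, not code from svg-sanitizer; the function name `expanded_length` and the `entities` dictionary layout are assumptions made for this example.

```python
import re

# Matches a general entity reference such as &lol1;
REF = re.compile(r"&([A-Za-z_][\w.-]*);")


def expanded_length(entities: dict, name: str) -> int:
    """Return the length of entity `name` after full expansion.

    The text is never actually expanded, so this stays cheap even
    when the expanded result would be gigabytes in size.
    """
    cache = {}

    def length(current: str, stack: frozenset) -> int:
        if current in stack:
            raise ValueError(f"recursive entity: {current}")
        if current in cache:
            return cache[current]
        value = entities[current]
        # Characters that are not part of any entity reference.
        literal = len(REF.sub("", value))
        # Add the expanded size of every referenced entity.
        nested = sum(
            length(ref, stack | {current}) for ref in REF.findall(value)
        )
        cache[current] = literal + nested
        return cache[current]

    return length(name, frozenset())


# Hypothetical payload: ten levels, each referencing the previous one ten times.
payload = {"lol": "lol"}
payload["lol1"] = "&lol;" * 10
for level in range(2, 10):
    payload[f"lol{level}"] = f"&lol{level - 1};" * 10

if __name__ == "__main__":
    print(expanded_length(payload, "lol9"))
```

Each level multiplies the size by ten, so a handful of short declarations in a DOCTYPE can expand to billions of characters. Dropping the DOCTYPE removes the declarations and with them the attack surface, at the cost described in this issue.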

alex40724 commented 4 years ago

Hi @darylldoyle,

thanks for the answer.

I wonder why this is not an issue for others. Adobe products are widely used, and embedding SVG in HTML pages should be a common use case, too.

There should be ways to remove the DOCTYPE and keep the file valid by resolving the entities / replacing the references in the attributes.
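
A rough sketch of that idea in Python, meant as a preprocessing step before sanitizing: read the simple entity declarations from the internal DOCTYPE subset, substitute their values for the references, then drop the DOCTYPE. This is not part of svg-sanitizer. The function name `inline_entities_and_strip_doctype` is hypothetical, and the regular expressions assume Illustrator-style declarations like the ones above (plain quoted strings, no `>` inside the public or system identifiers).

```python
import re

# <!DOCTYPE ...> with an optional internal subset in [ ... ]
DOCTYPE = re.compile(r"<!DOCTYPE\s[^\[>]*(?:\[(.*?)\]\s*)?>", re.S)
# Simple internal general entities only: <!ENTITY name "value">
# External (SYSTEM/PUBLIC) and parameter (%) entities do not match.
ENTITY_DECL = re.compile(
    r"<!ENTITY\s+([A-Za-z_][\w.-]*)\s+([\"'])(.*?)\2\s*>", re.S
)
ENTITY_REF = re.compile(r"&([A-Za-z_][\w.-]*);")
PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def inline_entities_and_strip_doctype(svg: str) -> str:
    match = DOCTYPE.search(svg)
    if match is None:
        return svg

    entities = {}
    for name, _quote, value in ENTITY_DECL.findall(match.group(1) or ""):
        # Skip values that reference other entities or contain markup or
        # quotes, so substitution can neither expand recursively nor
        # break out of an attribute value.
        if any(ch in value for ch in "&<%\"'"):
            continue
        entities.setdefault(name, value)  # first declaration wins

    body = svg[: match.start()] + svg[match.end():]

    def substitute(ref: re.Match) -> str:
        name = ref.group(1)
        if name in PREDEFINED or name not in entities:
            return ref.group(0)  # leave unknown references untouched
        return entities[name]

    return ENTITY_REF.sub(substitute, body)


if __name__ == "__main__":
    sample = (
        '<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" '
        '"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd" [\n'
        '    <!ENTITY ns_extend "http://ns.adobe.com/Extensibility/1.0/">\n'
        "]>\n"
        '<svg xmlns:x="&ns_extend;" xmlns="http://www.w3.org/2000/svg"></svg>'
    )
    print(inline_entities_and_strip_doctype(sample))
```

Because each value is substituted exactly once and values containing further references are rejected, the output can grow by at most the number of references times the longest value, which sidesteps the expansion problem. References to entities the sketch refuses to resolve are left in place and would still fail in a browser.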

Currently I do not have the time to provide a PR for this, maybe later...

LetsRumpel commented 2 years ago

Same issue in a large corporate website, causing serious problems. Killing all dogs helps against rabid dogs ... But what about the sled dogs?