earthoutreach / pykml

Automatically exported from code.google.com/p/pykml
BSD 3-Clause "New" or "Revised" License

[security] Potential for XXE-type exploits when parsing untrusted documents #37

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?

It may be possible to leak local files or make network requests when processing 
a malicious document, per 
https://www.owasp.org/index.php/XML_External_Entity_%28XXE%29_Processing

The lxml parser's default options are not suitable for untrusted input, and 
pykml.parser does not expose them for reconfiguration.

See PoC/example below.

What is the expected output? What do you see instead?

parser.parse() should probably default to using an lxml.etree.XMLParser 
instance with resolve_entities=False (and maybe no_network=True) to avoid 
malicious entity expansion.

http://lxml.de/parsing.html#parser-options details the available options.
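
To illustrate the suggested options, here is a minimal sketch that uses lxml directly rather than pykml. The names MALICIOUS and safe_parser are only for illustration, and this is not how pykml currently builds its parser. With resolve_entities=False the entity reference should stay unexpanded in the tree instead of being replaced by the file contents.

```python
from lxml import etree

# Same payload as the OWASP sample, written as bytes so the encoding
# declaration is accepted on both Python 2 and Python 3.
MALICIOUS = (b'<?xml version="1.0" encoding="UTF-8"?>'
             b'<!DOCTYPE foo [ <!ELEMENT foo ANY > '
             b'<!ENTITY xxe SYSTEM "file:///etc/passwd" >]>'
             b'<foo>&xxe;</foo>')

# Assumption: these two options are enough for this particular payload;
# other parser options may be worth reviewing for untrusted input.
safe_parser = etree.XMLParser(resolve_entities=False, no_network=True)

root = etree.fromstring(MALICIOUS, parser=safe_parser)
serialized = etree.tostring(root)

# The entity reference is serialized back as a reference, not expanded.
print(serialized)
```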

What version of the product are you using? On what operating system?

OSX 10.9.1
Python 2.7.6 (MacPorts)
pykml==0.1.0

Please provide any additional information below.

# Simple PoC from OWASP sample document.
from lxml import etree
from pykml import parser
doc = parser.fromstring('<?xml version="1.0" encoding="UTF-8"?>'
                        '<!DOCTYPE foo [ <!ELEMENT foo ANY > '
                        '<!ENTITY xxe SYSTEM "file:///etc/passwd" >]>'
                        '<foo>&xxe;</foo>')
print etree.tostring(doc, pretty_print=True)

---

Mitigation:

I'm unaware of any KML documents that actually use entities, so just modifying 
the parse/fromstring functions to use a parser with resolve_entities disabled 
may be sufficient. 

An alternative may be to add new methods such as parse_safe(), or to add 
optional kwargs to the existing methods so that users can provide their own 
parser object or set options on it.
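
A rough sketch of what that alternative could look like, written against plain lxml.etree for brevity (pykml itself builds on lxml.objectify). The function names make_safe_parser() and fromstring_safe() are hypothetical and do not exist in pykml; the optional parser argument is likewise an assumption about how such an API might be shaped.

```python
from lxml import etree


def make_safe_parser(**overrides):
    """Hypothetical helper: an XMLParser with entity resolution and
    network access turned off unless the caller overrides them."""
    options = {'resolve_entities': False, 'no_network': True}
    options.update(overrides)
    return etree.XMLParser(**options)


def fromstring_safe(text, parser=None):
    """Hypothetical counterpart to parser.fromstring() that falls back
    to the hardened parser when the caller does not supply one."""
    if parser is None:
        parser = make_safe_parser()
    return etree.fromstring(text, parser=parser)


# Ordinary KML without entities parses as usual.
KML_NS = 'http://www.opengis.net/kml/2.2'
kml = (b'<kml xmlns="http://www.opengis.net/kml/2.2">'
       b'<Placemark><name>Test</name></Placemark></kml>')
root = fromstring_safe(kml)
name = root.find('.//{%s}name' % KML_NS).text

# Callers can still pass their own parser with extra options.
custom = fromstring_safe(kml, parser=make_safe_parser(remove_blank_text=True))
```

Defaulting to the hardened parser while still accepting a caller-supplied one keeps existing call sites working and leaves an escape hatch for anyone who really does need entity expansion.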

If anything other than the first option is chosen, the docs should be updated 
with a prominent warning about the risks of handling untrusted input without 
precautions.

An example kml input that also passes kml22gx.xsd schema validation is attached.

Original issue reported on code.google.com by shab...@gmail.com on 8 Feb 2014 at 2:26
