Open slarse opened 4 years ago
After thinking this through over the weekend, I think I may have been overthinking it. It should be fully sufficient to replace every occurrence of DocumentBuilderFactory.newInstance() with createDocumentBuilderFactory(). We don't have to worry about whether or not fields are involved.
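A minimal sketch of what such a createDocumentBuilderFactory() method might look like. Only the method name comes from this thread; the enclosing class name and the choice of attributes (ACCESS_EXTERNAL_DTD and ACCESS_EXTERNAL_SCHEMA set to the empty string, which forbids all protocols for external access) are assumptions and not necessarily what the processor generates.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical holder class; the name is only for illustration.
final class SafeDocumentBuilderFactories {
    private SafeDocumentBuilderFactories() {}

    // Drop-in replacement for DocumentBuilderFactory.newInstance().
    static DocumentBuilderFactory createDocumentBuilderFactory() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Assumption: an empty protocol list blocks external DTD and schema access.
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
        return factory;
    }
}
```

Because the helper returns a DocumentBuilderFactory, the substitution works the same way in a local variable, a chained call, or a field initializer.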
As it turns out, the "major variations" outlined in the opening comment in this issue don't really matter. As explained in #201, all violations related to DocumentBuilderFactory.newInstance() can be solved by replacing the call to newInstance() with a call to a method that creates a "safe" factory. With #202, I've shown that it generalizes easily to at least TransformerFactory, and with some small tweaks it should work for most of the other libraries.
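To illustrate how the same idea might carry over to TransformerFactory, here is a hedged sketch. The class and method names are hypothetical, and the use of ACCESS_EXTERNAL_STYLESHEET in place of ACCESS_EXTERNAL_SCHEMA is an assumption about which attributes are appropriate, not a description of what #202 actually does.

```java
import javax.xml.XMLConstants;
import javax.xml.transform.TransformerFactory;

// Hypothetical holder class; the name is only for illustration.
final class SafeTransformerFactories {
    private SafeTransformerFactories() {}

    // Drop-in replacement for TransformerFactory.newInstance().
    static TransformerFactory createTransformerFactory() {
        TransformerFactory factory = TransformerFactory.newInstance();
        // Assumption: an empty protocol list blocks external DTD and stylesheet access.
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
        return factory;
    }
}
```

The shape is identical to the DocumentBuilderFactory case; only the factory type and one attribute constant differ, which is why the approach generalizes with small tweaks.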
Fields, however, don't appear to work. The solution of method invocation substitution applies to fields as well, but Sonar does not flag them as violations, and so we can't detect them.
Here's the roadmap for implementing support, in the order in which the libraries are listed in rule 2755:
setAttribute(String, Object) to set attributes
setProperty(String, Object) to set attributes
setProperty(String, Object) to set attributes
ACCESS_EXTERNAL_STYLESHEET instead of ACCESS_EXTERNAL_SCHEMA to be set
setAttribute(String, Object) to set attributes
setProperty(String, Object) to set attributes
SchemaFactory
setProperty(String, Object) to set attributes
newInstance(). Not sure if it matters.

Depends on https://github.com/INRIA/spoon/pull/3702, which will be merged soon.
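Several roadmap entries use setProperty(String, Object) rather than setAttribute(String, Object). As a rough sketch of that variant, here is how it might look for SchemaFactory. The class and method names are hypothetical, the chosen properties are an assumption, and wrapping the checked SAX exceptions in an unchecked one is a design choice made only to keep the helper usable as a drop-in expression.

```java
import javax.xml.XMLConstants;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical holder class; the name is only for illustration.
final class SafeSchemaFactories {
    private SafeSchemaFactories() {}

    static SchemaFactory createSchemaFactory() {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            // Assumption: an empty protocol list blocks external DTD and schema access.
            factory.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
            factory.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
        } catch (SAXException e) {
            // setProperty declares SAXNotRecognizedException and SAXNotSupportedException.
            throw new IllegalStateException("XML implementation rejected a security property", e);
        }
        return factory;
    }
}
```

Unlike setAttribute, setProperty declares checked exceptions, so a generated helper has to either catch them or declare them.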
FYI, for the big picture about XML and security:
Coordinated disclosure of XML round-trip vulnerabilities in Go’s standard library https://mattermost.com/blog/coordinated-disclosure-go-xml-vulnerabilities/
Thanks @monperrus, I'll have a look
This is related to introducing a processor for the rule "XML parsers should not be vulnerable to XXE attacks". It's a rather complicated processor, so I'm introducing it gradually. As can be noted, the rule covers a variety of libraries, but there are only a couple of major variations in usage.
I'm initially targeting only the DocumentBuilderFactory use cases to see if this is even feasible to repair adequately. I've found 6 major variations of what it can look like. Here's an example repair of the simplest case:

The cases I've found are:
1. DocumentBuilderFactory is declared and initialized. See example transformation above.
2. DocumentBuilder is created by chaining DocumentBuilderFactory.newInstance().newDocumentBuilder().
3. DocumentBuilderFactory.newInstance().newDocumentBuilder().parse("xmlfile.xml")
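For the simplest case, where a DocumentBuilderFactory is declared and initialized, a repair by adding statements might look roughly like the following. This is a sketch under assumptions: the class and method names are hypothetical, and the two attributes shown are one plausible way to restrict external access, not necessarily the exact statements the processor inserts.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical class contrasting the code before and after the repair.
final class LocalFactoryRepair {
    private LocalFactoryRepair() {}

    // Before: the factory is used with its default, permissive configuration.
    static DocumentBuilderFactory before() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        return factory;
    }

    // After: two statements are added directly after the declaration.
    static DocumentBuilderFactory after() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
        return factory;
    }
}
```

This only works when there is a named variable to attach the extra statements to, which is why the chained cases need a different strategy.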
Currently, I've solved case 1 with #191. It's rather simple and requires only the addition of a few statements. Cases 2-4 require rewriting the existing code. I think the simplest way to do this is to replace the statements with a method invocation. For example, for cases 2 and 3, one could replace DocumentBuilderFactory.newInstance().newDocumentBuilder() with createDocumentBuilder(), which is defined like so.

Cases 4-6 can all be solved by creating such a method, and replacing all initializations of the field with an invocation of said method.
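A possible shape for createDocumentBuilder(), together with a field initialized through it, is sketched below. Only the method name createDocumentBuilder() comes from this thread. The class name, the accessor, the attributes set, and the decision to wrap ParserConfigurationException in an unchecked exception (so the call can appear in a field initializer) are all assumptions.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

// Hypothetical class that owns a DocumentBuilder field.
final class XmlLoader {
    // Field initialization replaced by an invocation of the helper method.
    private final DocumentBuilder builder = createDocumentBuilder();

    DocumentBuilder builder() {
        return builder;
    }

    // Stands in for DocumentBuilderFactory.newInstance().newDocumentBuilder().
    static DocumentBuilder createDocumentBuilder() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Assumption: an empty protocol list blocks external DTD and schema access.
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
        try {
            return factory.newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            // Wrapped so the method can be called where checked exceptions are not allowed.
            throw new IllegalStateException("Could not create DocumentBuilder", e);
        }
    }
}
```

With this helper, a chained call such as createDocumentBuilder().parse("xmlfile.xml") keeps the structure of the original expression while going through the hardened factory.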