Closed ericsun95 closed 2 years ago
Yes, it's needed to parse XSDs, IIRC. I believe DTD parsing is disabled. Can you be more specific about the threat here, ideally proposing some change that would lock down parsing more?
Take this as an example, https://github.com/apache/ws-xmlschema/blob/f08d3a339fd546878b6848c8987fb9cb05fbb1c8/xmlschema-core/src/main/java/org/apache/ws/commons/schema/XmlSchema.java#L885-L886, it's better to specify attributes like these to prevent XML External Entity (XXE).
trFac.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
trFac.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
Similarly for the DocumentBuilderFactory
etc. For the xmlInputFactory
we are using, probably do something like this?
factory.setProperty(XMLInputFactory.SUPPORT_DTD, false); // disable external entities
factory.setProperty("javax.xml.stream.isSupportingExternalEntities", false);
In this project, XSDs are only used for validation against the schema rather than transforming XML. It looks like the two properties you mention can be set on SchemaFactory. Hm, I'm wondering whether we should set those, because I could imagine legitimately wanted to let an XSD Schema refer to external resources in this use case. An XSD Schema in this case would be controlled by the 'caller', not from a bunch of XML some application is ingesting, so I think the risk of some issue is smaller - if you can corrupt an XSD Schema file somewhere internally you can probably do quite a bit more.
For parsing of XML I agree this is an issue because the XML passing through some job using this to read it might be from untrusted sources. Both of the properties you suggest are already set.
Yeah, I see a similar thread for this. Would you be able to release a new version for that?
Eventually, yes, usually every 3-6 months. If there's no other action for the XSD transformer I'll close this, but open to arguments about changing that too.
After version 0.10.0 we introduce a new dependency https://mvnrepository.com/artifact/org.apache.ws.xmlschema/xmlschema-core to the project. However, that package use
TransformerFactory
,DocumentBuilderFactory
etc in an unsafe way that is vulnerable to be attached. Those weakly configured XML Parser will allow attacker injected untrusted data into application. The safest way to prevent XXE is always to disable DTDs (External Entities) completely. Do we truly need to rely on that dependency? Do we have alternatives?