Open evlist opened 10 years ago
The default configuration is currently defined in XMLUtils.java as a static final variable:
public static final ParserConfiguration PLAIN = new ParserConfiguration(false, false, false);
This variable is used to parse XML documents before the resource manager has been set and properties.xml can be read. Until properties.xml has been processed, calls to org.orbeon.oxf.properties.Properties.instance().getPropertySet() fail.
A solution is to replace this variable with a static method that tries to read these properties on each call, returns a hard-coded default configuration for as long as that fails, and caches the configuration once the properties can be read, for instance:
    public static final String DEFAULT_XMLPARSER_VALIDATING_PROPERTY = "oxf.xml-parser.validating";
    public static final String DEFAULT_XMLPARSER_HANDLE_XINCLUDE_PROPERTY = "oxf.xml-parser.handle-xinclude";
    public static final String DEFAULT_XMLPARSER_EXTERNAL_ENTITIES_PROPERTY = "oxf.xml-parser.external-entities";

    private static ParserConfiguration defaultConfiguration = null;

    public static ParserConfiguration getDefault() {
        if (defaultConfiguration != null) {
            return defaultConfiguration;
        }
        ParserConfiguration config;
        try {
            config = new ParserConfiguration(
                org.orbeon.oxf.properties.Properties.instance().getPropertySet().getBoolean(DEFAULT_XMLPARSER_VALIDATING_PROPERTY),
                org.orbeon.oxf.properties.Properties.instance().getPropertySet().getBoolean(DEFAULT_XMLPARSER_HANDLE_XINCLUDE_PROPERTY),
                org.orbeon.oxf.properties.Properties.instance().getPropertySet().getBoolean(DEFAULT_XMLPARSER_EXTERNAL_ENTITIES_PROPERTY));
        } catch (Exception e) {
            return new ParserConfiguration(false, false, false);
        }
        defaultConfiguration = config;
        return config;
    }
The corresponding properties can then be set in properties-local.xml, for instance:
<property as="xs:boolean" name="oxf.xml-parser.external-entities" value="true"/>
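For anyone who wants to try the fallback-then-cache behavior of getDefault() outside Orbeon, here is a minimal standalone sketch. Everything except the three property names is an assumption made for illustration: `Config` stands in for ParserConfiguration, and the `Supplier` stands in for the real property set (it returns null until the properties are "loaded"). The `volatile` field is an addition for safe publication across threads, not something the proposed patch does.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

final class LazyParserDefaults {
    static final String VALIDATING = "oxf.xml-parser.validating";
    static final String HANDLE_XINCLUDE = "oxf.xml-parser.handle-xinclude";
    static final String EXTERNAL_ENTITIES = "oxf.xml-parser.external-entities";

    /** Hypothetical stand-in for ParserConfiguration: three boolean switches. */
    static final class Config {
        final boolean validating, handleXInclude, externalEntities;
        Config(boolean validating, boolean handleXInclude, boolean externalEntities) {
            this.validating = validating;
            this.handleXInclude = handleXInclude;
            this.externalEntities = externalEntities;
        }
    }

    static final Config PLAIN = new Config(false, false, false);

    // Hypothetical property source: yields null (or throws) until properties are available.
    private final Supplier<Map<String, Boolean>> properties;
    private volatile Config cached; // set once, after the first successful read

    LazyParserDefaults(Supplier<Map<String, Boolean>> properties) {
        this.properties = properties;
    }

    Config getDefault() {
        Config c = cached;
        if (c != null) return c;
        Map<String, Boolean> p;
        try {
            p = properties.get();
        } catch (RuntimeException e) {
            return PLAIN; // properties not readable yet: fall back, do not cache
        }
        if (p == null) return PLAIN;
        c = new Config(
            p.getOrDefault(VALIDATING, false),
            p.getOrDefault(HANDLE_XINCLUDE, false),
            p.getOrDefault(EXTERNAL_ENTITIES, false));
        cached = c;
        return c;
    }

    public static void main(String[] args) {
        AtomicReference<Map<String, Boolean>> source = new AtomicReference<>();
        LazyParserDefaults defaults = new LazyParserDefaults(source::get);
        // Properties not loaded yet: the fallback configuration is used.
        System.out.println(defaults.getDefault().externalEntities);
        source.set(Collections.singletonMap(EXTERNAL_ENTITIES, true));
        // Properties loaded: the value is read once and cached from here on.
        System.out.println(defaults.getDefault().externalEntities);
    }
}
```

One consequence of caching is that later changes to the properties are not picked up without a restart, which may or may not be acceptable here.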
I moved this to a branch default-parser-configuration-1614 as the commit broke the build.
af5be53 should fix issue with the build, sorry!
Now the question is whether we should do it this way ;)
One drawback I see is that if you change the default configuration for the whole platform, you expose, for example, Ajax requests to potential security issues. In fact, that's why external entities were disabled by default.
So wouldn't it be better to be able to enable this more selectively?
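To make the concern concrete, here is a small sketch using plain JAXP rather than Orbeon's own parser setup. The two feature URIs are the standard SAX identifiers; the class and method names are made up for the example. With the switch off, a reference to an external entity is skipped instead of being fetched, which is what protects a request handler from documents that point at local files or internal URLs.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class ExternalEntitySwitch {
    // Standard SAX feature identifiers, recognized by Xerces-based JAXP parsers.
    static final String GENERAL = "http://xml.org/sax/features/external-general-entities";
    static final String PARAMETER = "http://xml.org/sax/features/external-parameter-entities";

    /** Parses xml and returns the text content of the root element. */
    static String rootText(String xml, boolean externalEntities) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature(GENERAL, externalEntities);
            factory.setFeature(PARAMETER, externalEntities);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
    }

    public static void main(String[] args) {
        // The entity points at a file chosen by whoever wrote the document.
        String xml = "<!DOCTYPE r [<!ENTITY e SYSTEM \"file:///nonexistent-entity-for-demo.txt\">]>"
            + "<r>a&e;b</r>";
        // With external entities disabled, the file is never opened.
        System.out.println(rootText(xml, false));
    }
}
```

Calling `rootText(xml, true)` on the same input makes the parser try to open the referenced file, which is exactly the behavior a platform-wide default would turn on for every request.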
Yes, ideally that should be switchable for a specific task such as an XSLT transformation.
An option would be to add new attributes used by XSLTTransformer to choose a different parser configuration when it creates its URIResolver.
Do you want me to try that?
Sure! TransformerURIResolver is already passed a configuration, including from XSLTTransformer.
This could be done in two ways:
Similarly to what we do for other fairly global configurations:
<property
as="xs:string"
processor-name="oxf:builtin-saxon"
name="location-mode" value="none"/>
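A rough sketch of how a processor-scoped lookup like this could work for parser settings: look for a value scoped to the processor first, then fall back to a global value, then to a hard-coded default. `ScopedProperties`, the `external-entities` property name and the `oxf:xslt` scope are all hypothetical, used only to illustrate the lookup order. This is not Orbeon's actual property API.

```java
import java.util.HashMap;
import java.util.Map;

final class ScopedProperties {
    private final Map<String, String> values = new HashMap<>();

    // A null processor name denotes a global (unscoped) property.
    private static String key(String processorName, String name) {
        return (processorName == null ? "" : processorName) + "|" + name;
    }

    void set(String processorName, String name, String value) {
        values.put(key(processorName, name), value);
    }

    /** Processor-scoped value first, then the global value, then the fallback. */
    boolean getBoolean(String processorName, String name, boolean fallback) {
        String v = values.get(key(processorName, name));
        if (v == null) v = values.get(key(null, name));
        return v == null ? fallback : Boolean.parseBoolean(v);
    }

    public static void main(String[] args) {
        ScopedProperties props = new ScopedProperties();
        props.set(null, "external-entities", "false");      // platform-wide default stays safe
        props.set("oxf:xslt", "external-entities", "true"); // opt in for XSLT only
        System.out.println(props.getBoolean("oxf:xslt", "external-entities", false));
        System.out.println(props.getBoolean("oxf:url-generator", "external-entities", false));
    }
}
```

With this shape, the DocBook case can enable external entities for the XSLT processor while Ajax requests keep the restrictive default.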
The default parser configuration currently ignores XInclude and external entities and does not attempt to perform validation.
This is an issue for cases where this behavior is not expected and cannot be overridden by an explicit URL Generator config.
An example is when using the DocBook XSL Stylesheets to convert a document into PDF: some of the imported stylesheets use external entities and the transformation cannot succeed if external entities are switched off.
To solve this issue, a set of properties should be added to define the default values of parser configurations.