orbeon / orbeon-forms

Orbeon Forms is an open source web forms solution. It includes an XForms engine, the Form Builder web-based form editor, and the Form Runner runtime.
http://www.orbeon.com/
GNU Lesser General Public License v2.1

Add properties for the default parser configuration #1614

Open evlist opened 10 years ago

evlist commented 10 years ago

The default parser configuration currently ignores XInclude and external entities and does not attempt to perform validation.
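
To make the three settings concrete, here is a minimal sketch of how they could map onto a standard JAXP SAX parser. This is not Orbeon's code: `PlainParserSketch`, `newParser`, and `textOf` are hypothetical names chosen for illustration. Only the JAXP calls and the two SAX feature URIs are standard.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Orbeon Forms.
final class PlainParserSketch {

    /** Builds a SAX parser from the three flags discussed in this issue. */
    static SAXParser newParser(boolean validating, boolean handleXInclude, boolean externalEntities)
            throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setValidating(validating);          // DTD validation on or off
        factory.setXIncludeAware(handleXInclude);   // XInclude processing on or off
        // Standard SAX2 features controlling whether external entities are loaded.
        factory.setFeature("http://xml.org/sax/features/external-general-entities", externalEntities);
        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", externalEntities);
        return factory.newSAXParser();
    }

    /** Parses the document and returns its concatenated character data. */
    static String textOf(String xml, SAXParser parser) throws Exception {
        StringBuilder text = new StringBuilder();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        });
        return text.toString();
    }
}
```

Calling `PlainParserSketch.newParser(false, false, false)` corresponds to the current default described above.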

This is an issue for cases where this behavior is not expected and cannot be overridden by an explicit URL Generator config.

An example is when using the DocBook XSL Stylesheets to convert a document into PDF: some of the imported stylesheets use external entities and the transformation cannot succeed if external entities are switched off.

To solve this issue, a set of properties should be added to define the default values of parser configurations.

evlist commented 10 years ago

The default configuration is currently defined in XMLUtils.java as a static final variable:

public static final ParserConfiguration PLAIN = new ParserConfiguration(false, false, false);

This variable is used to parse XML documents before the resource manager has been set and properties.xml can be read. Until properties.xml has been processed, calls to org.orbeon.oxf.properties.Properties.instance().getPropertySet() fail.

A solution is to replace this variable with a static method that tries to read these properties on each call until it succeeds, and returns a hard-coded default configuration until then, for instance:

public static final String DEFAULT_XMLPARSER_VALIDATING_PROPERTY = "oxf.xml-parser.validating";
public static final String DEFAULT_XMLPARSER_HANDLE_XINCLUDE_PROPERTY = "oxf.xml-parser.handle-xinclude";
public static final String DEFAULT_XMLPARSER_EXTERNAL_ENTITIES_PROPERTY = "oxf.xml-parser.external-entities";

private static ParserConfiguration defaultConfiguration = null;

public static ParserConfiguration getDefault() {
    if (defaultConfiguration != null) {
        return defaultConfiguration;
    }
    ParserConfiguration config;
    try {
        config = new ParserConfiguration(
                org.orbeon.oxf.properties.Properties.instance().getPropertySet().getBoolean(DEFAULT_XMLPARSER_VALIDATING_PROPERTY),
                org.orbeon.oxf.properties.Properties.instance().getPropertySet().getBoolean(DEFAULT_XMLPARSER_HANDLE_XINCLUDE_PROPERTY),
                org.orbeon.oxf.properties.Properties.instance().getPropertySet().getBoolean(DEFAULT_XMLPARSER_EXTERNAL_ENTITIES_PROPERTY));
    } catch (Exception e) {
        return new ParserConfiguration(false, false, false);
    }
    defaultConfiguration = config;
    return config;
}

The corresponding properties can then be set in properties-local.xml, for instance:

<property as="xs:boolean" name="oxf.xml-parser.external-entities" value="true"/>

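The fallback-then-cache behavior of getDefault() can also be shown in isolation. The sketch below is a self-contained approximation, not the committed code: `DefaultParserConfigSketch`, `Flags`, and the `lookup` function are hypothetical stand-ins for XMLUtils, ParserConfiguration, and the Orbeon property set. It assumes an unset property means false, and it adds a `volatile` field that the snippet above does not have.

```java
import java.util.function.Function;

// Hypothetical stand-in for the getDefault() logic; not Orbeon code.
final class DefaultParserConfigSketch {

    /** Stand-in for ParserConfiguration: three immutable flags. */
    static final class Flags {
        final boolean validating;
        final boolean handleXInclude;
        final boolean externalEntities;

        Flags(boolean validating, boolean handleXInclude, boolean externalEntities) {
            this.validating = validating;
            this.handleXInclude = handleXInclude;
            this.externalEntities = externalEntities;
        }
    }

    /** Hard-coded configuration used while properties cannot be read. */
    static final Flags PLAIN = new Flags(false, false, false);

    // Assumption: the lookup throws a RuntimeException until properties are loaded.
    private final Function<String, Boolean> lookup;
    private volatile Flags cached;

    DefaultParserConfigSketch(Function<String, Boolean> lookup) {
        this.lookup = lookup;
    }

    Flags getDefault() {
        Flags current = cached;
        if (current != null) {
            return current; // properties were read successfully earlier
        }
        try {
            Flags loaded = new Flags(
                    flag("oxf.xml-parser.validating"),
                    flag("oxf.xml-parser.handle-xinclude"),
                    flag("oxf.xml-parser.external-entities"));
            cached = loaded; // cache only after a successful read
            return loaded;
        } catch (RuntimeException e) {
            return PLAIN; // not readable yet: fall back without caching
        }
    }

    private boolean flag(String name) {
        Boolean value = lookup.apply(name);
        return value != null && value; // assumption: unset property means false
    }
}
```

Because the fallback result is never cached, the first call made after properties.xml becomes readable picks up the configured values.
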
ebruchez commented 10 years ago

I moved this to a branch default-parser-configuration-1614 as the commit broke the build.

evlist commented 10 years ago

af5be53 should fix the issue with the build, sorry!

ebruchez commented 10 years ago

Now the question is whether we should do it this way ;)

One drawback I see is that if you change the default configuration for the whole platform, you expose, for example, Ajax requests to potential security issues. In fact, that's why external entities were disabled by default.

So wouldn't it be better to be able to enable this more selectively?

evlist commented 10 years ago

Yes, ideally that should be switchable for a specific task such as an XSLT transformation.

An option would be to add new attributes used by XSLTTransformer to choose a different parser configuration when it creates its URIResolver.
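
As a rough illustration of that idea, here is a sketch of a standard `javax.xml.transform.URIResolver` whose parser settings are chosen by the caller. It is not Orbeon's TransformerURIResolver: the class name `ConfigurableResolverSketch` and its single `externalEntities` flag are assumptions made for the example.

```java
import java.net.URI;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.sax.SAXSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical resolver, not Orbeon's TransformerURIResolver.
final class ConfigurableResolverSketch implements URIResolver {

    private final boolean externalEntities;

    ConfigurableResolverSketch(boolean externalEntities) {
        this.externalEntities = externalEntities;
    }

    @Override
    public Source resolve(String href, String base) throws TransformerException {
        try {
            // Resolve the imported or included URI against the stylesheet's base URI.
            String systemId = (base == null || base.isEmpty())
                    ? href
                    : URI.create(base).resolve(href).toString();

            // Configure the reader that will parse this one resource.
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setFeature("http://xml.org/sax/features/external-general-entities", externalEntities);
            factory.setFeature("http://xml.org/sax/features/external-parameter-entities", externalEntities);
            XMLReader reader = factory.newSAXParser().getXMLReader();

            // The transformer parses the resource later, using this reader.
            return new SAXSource(reader, new InputSource(systemId));
        } catch (Exception e) {
            throw new TransformerException(e);
        }
    }
}
```

A transformation that needs external entities could install it with `transformerFactory.setURIResolver(new ConfigurableResolverSketch(true))`, leaving the platform-wide default untouched.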

Do you want me to try that?

ebruchez commented 10 years ago

Sure! TransformerURIResolver is already passed a configuration, including from XSLTTransformer.

This could be done in two ways:

  1. With a processor-scoped property, similarly to what we do for other fairly global configurations:

    <property
       as="xs:string"
       processor-name="oxf:builtin-saxon"
       name="location-mode" value="none"/>
  2. And/or with extra parameters to the processor configuration.
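
The first option amounts to a lookup that prefers a value scoped to a processor and otherwise falls back to a safe default. A minimal sketch of that resolution order follows. `ScopedPropertySketch` and the property name `external-entities` are hypothetical. The real lookup would go through Orbeon's property set, whose API is not shown in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical property store illustrating processor-scoped lookup; not Orbeon code.
final class ScopedPropertySketch {

    private final Map<String, Boolean> values = new HashMap<>();

    private static String key(String processorName, String name) {
        // A null processor name denotes a global (unscoped) property.
        return (processorName == null ? "" : processorName) + "|" + name;
    }

    void set(String processorName, String name, boolean value) {
        values.put(key(processorName, name), value);
    }

    /** Returns the processor-scoped value, else the global value, else false. */
    boolean get(String processorName, String name) {
        Boolean scoped = values.get(key(processorName, name));
        if (scoped != null) {
            return scoped;
        }
        Boolean global = values.get(key(null, name));
        return global != null && global;
    }
}
```

With only `set("oxf:builtin-saxon", "external-entities", true)` defined, that processor sees `true` while every other processor still gets the restrictive default.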