SAXParseException reading template document with Freemarker angle brackets

matthiasbasler commented 4 years ago

We are using XDocReport 2.0.1 to parse "docx" files and fill in certain fields, i.e. the typical "Mail Merge" functionality. We then use the "xwpf" converter to create a PDF from the result. So far we have been using the square bracket FreeMarker syntax, e.g. [#if ...] [/#if], and this worked. Since we use the angle bracket FreeMarker syntax in the rest of our application, we wanted to switch XDocReport over as well. There is a configuration setting to do so, so we set

```
final IXDocReport report = XDocReportRegistry.getRegistry().loadReport(stream, TemplateEngineKind.Freemarker, false);
final Configuration fmConfig = new Configuration(Configuration.VERSION_2_3_28);
fmConfig.setTagSyntax(Configuration.ANGLE_BRACKET_TAG_SYNTAX);
((FreemarkerTemplateEngine) report.getTemplateEngine()).setFreemarkerConfiguration(fmConfig);
```

Of course we changed the syntax of the template .docx file as well, e.g. <#if applicant_houseNumber?hasContent> ${applicant_houseNumber}</#if>

Afterwards, we get a SAXParseException before XDocReport even reaches our template model. As far as I can tell from the stack trace (see below) and the parser state when the exception is thrown, the Xerces SAX parser tries to evaluate the merge field content <#if ...> and chokes on it because "<#" cannot start a valid XML tag. That is true, but the document parser should not even try to parse the content of a merge field as XML imho.
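To show the parser's complaint in isolation, here is a minimal sketch using only the JDK's XML parser. The element name `t`, the variable `x`, and the class and method names are made up for illustration. It says nothing about how XDocReport itself ends up with such content. It only checks that raw "<#" inside element content is rejected, while the escaped form and the square bracket form are accepted.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of XDocReport or POI.
class AngleBracketXmlDemo {

    /** Returns true if the given string parses as well-formed XML. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            // Thrown (as SAXParseException) when the input is not well-formed.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Raw angle bracket directive inside element content.
        String raw = "<t><#if x??>${x}</#if></t>";
        // Same directive with the angle brackets escaped as XML entities.
        String escaped = "<t>&lt;#if x??&gt;${x}&lt;/#if&gt;</t>";
        // Square bracket directive, which contains no XML markup characters.
        String square = "<t>[#if x??]${x}[/#if]</t>";

        System.out.println("raw angle brackets: " + isWellFormed(raw));
        System.out.println("escaped:            " + isWellFormed(escaped));
        System.out.println("square brackets:    " + isWellFormed(square));
    }
}
```

Only the first string is rejected, which would be consistent with the square bracket template working and the angle bracket one failing whenever the directive text reaches an XML parser unescaped.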

Unfortunately there is little to find regarding the Configuration.ANGLE_BRACKET_TAG_SYNTAX flag on the web, so I wonder if I overlooked something or whether this is a bug.

Do I have to change something else to avoid the XML parser analyzing the merge field?
Or is this flag known not to work when using this parser? Or known not to work in conjunction with converters, maybe?
Are there any working examples for using this syntax?

The stack trace (abbreviated to relevant classes) is as follows:

fr.opensagres.xdocreport.converter.XDocConverterException: java.io.IOException: Unable to parse xml bean
    at fr.opensagres.xdocreport.converter.docx.poi.itext.XWPF2PDFViaITextConverter.convert(XWPF2PDFViaITextConverter.java:72) ~[fr.opensagres.xdocreport.converter.docx.xwpf-2.0.1.jar:2.0.1]
    at fr.opensagres.xdocreport.document.AbstractXDocReport.convert(AbstractXDocReport.java:713) ~[fr.opensagres.xdocreport.document-2.0.1.jar:2.0.1]
    ... 96 more
Caused by: java.io.IOException: Unable to parse xml bean
    at org.apache.poi.POIXMLTypeLoader.parse(POIXMLTypeLoader.java:166) ~[poi-ooxml-3.17.jar:3.17]
    at org.openxmlformats.schemas.wordprocessingml.x2006.main.DocumentDocument$Factory.parse(Unknown Source) ~[ooxml-schemas-1.3.jar:?]
    at org.apache.poi.xwpf.usermodel.XWPFDocument.onDocumentRead(XWPFDocument.java:152) ~[poi-ooxml-3.17.jar:3.17]
    at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:169) ~[poi-ooxml-3.17.jar:3.17]
    at org.apache.poi.xwpf.usermodel.XWPFDocument.<init>(XWPFDocument.java:119) ~[poi-ooxml-3.17.jar:3.17]
    at fr.opensagres.xdocreport.converter.docx.poi.itext.XWPF2PDFViaITextConverter.convert(XWPF2PDFViaITextConverter.java:66) ~[fr.opensagres.xdocreport.converter.docx.xwpf-2.0.1.jar:2.0.1]
    at fr.opensagres.xdocreport.document.AbstractXDocReport.convert(AbstractXDocReport.java:713) ~[fr.opensagres.xdocreport.document-2.0.1.jar:2.0.1]
    ... 96 more
Caused by: org.xml.sax.SAXParseException: The content of elements must consist of well-formed character data or markup.
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:203) ~[?:1.8.0_212]
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177) ~[?:1.8.0_212]
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:400) ~[?:1.8.0_212]
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:327) ~[?:1.8.0_212]
    at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1472) ~[?:1.8.0_212]
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.startOfMarkup(XMLDocumentFragmentScannerImpl.java:2635) ~[?:1.8.0_212]
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2732) ~[?:1.8.0_212]
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602) ~[?:1.8.0_212]
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112) ~[?:1.8.0_212]
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:505) ~[?:1.8.0_212]
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:842) ~[?:1.8.0_212]
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:771) ~[?:1.8.0_212]
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141) ~[?:1.8.0_212]
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:243) ~[?:1.8.0_212]
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:339) ~[?:1.8.0_212]
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121) ~[?:1.8.0_212]
    at org.apache.poi.util.DocumentHelper.readDocument(DocumentHelper.java:140) ~[poi-ooxml-3.17.jar:3.17]
    at org.apache.poi.POIXMLTypeLoader.parse(POIXMLTypeLoader.java:163) ~[poi-ooxml-3.17.jar:3.17]
    at org.openxmlformats.schemas.wordprocessingml.x2006.main.DocumentDocument$Factory.parse(Unknown Source) ~[ooxml-schemas-1.3.jar:?]
    at org.apache.poi.xwpf.usermodel.XWPFDocument.onDocumentRead(XWPFDocument.java:152) ~[poi-ooxml-3.17.jar:3.17]
    at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:169) ~[poi-ooxml-3.17.jar:3.17]
    at org.apache.poi.xwpf.usermodel.XWPFDocument.<init>(XWPFDocument.java:119) ~[poi-ooxml-3.17.jar:3.17]
    at fr.opensagres.xdocreport.converter.docx.poi.itext.XWPF2PDFViaITextConverter.convert(XWPF2PDFViaITextConverter.java:66) ~[fr.opensagres.xdocreport.converter.docx.xwpf-2.0.1.jar:2.0.1]

matthiasbasler commented 4 years ago

I changed the FreeMarker syntax in the template back to square brackets and to my surprise the PDF document was created correctly again. This means that the Configuration.ANGLE_BRACKET_TAG_SYNTAX flag has no effect. Further investigation shows that fr.opensagres.xdocreport.template.freemarker.FreemarkerTemplateEngine.setFreemarkerConfiguration(Configuration) overwrites my tag syntax flag with Configuration.SQUARE_BRACKET_TAG_SYNTAX, so it is no surprise that it isn't working.
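The effect can be modelled with a small sketch. The classes `Config` and `Engine` below are hypothetical stand-ins, not the real FreeMarker or XDocReport classes. The only assumption taken from my observation is that the engine's setter forces square bracket syntax on the object it is given and then keeps a reference to that same object.

```java
// Hypothetical model of the observed behaviour; not the library's actual code.
class TagSyntaxOverwriteDemo {

    enum TagSyntax { ANGLE_BRACKET, SQUARE_BRACKET }

    /** Stand-in for a template configuration object. */
    static final class Config {
        private TagSyntax tagSyntax = TagSyntax.ANGLE_BRACKET;
        void setTagSyntax(TagSyntax tagSyntax) { this.tagSyntax = tagSyntax; }
        TagSyntax getTagSyntax() { return tagSyntax; }
    }

    /** Stand-in for a template engine whose setter mutates the passed configuration. */
    static final class Engine {
        private Config config;
        void setConfiguration(Config config) {
            // Assumed behaviour: the caller's tag syntax is silently replaced.
            config.setTagSyntax(TagSyntax.SQUARE_BRACKET);
            this.config = config;
        }
        Config getConfiguration() { return config; }
    }

    /** Sets the flag first, then hands the configuration to the engine. */
    static TagSyntax configureBeforeSetter() {
        Config config = new Config();
        config.setTagSyntax(TagSyntax.ANGLE_BRACKET);
        Engine engine = new Engine();
        engine.setConfiguration(config);
        return engine.getConfiguration().getTagSyntax();
    }

    /** Hands the configuration to the engine first, then sets the flag. */
    static TagSyntax configureAfterSetter() {
        Config config = new Config();
        Engine engine = new Engine();
        engine.setConfiguration(config);
        config.setTagSyntax(TagSyntax.ANGLE_BRACKET);
        return engine.getConfiguration().getTagSyntax();
    }

    public static void main(String[] args) {
        System.out.println("flag set before the setter: " + configureBeforeSetter());
        System.out.println("flag set after the setter:  " + configureAfterSetter());
    }
}
```

In this model the flag survives only when it is set after the engine's setter has run, because the engine and the caller share one mutable object. Whether the real engine copies or shares the configuration is something I have not verified beyond the behaviour described here.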

Can anyone please explain to me why there is a FM syntax configuration flag if the API overwrites it to whatever it considers right?

I have the following suggestions:

- Please document the above fact in your official documentation. I wasted 3 hours trying to figure out why the API would not cope with my document until I found out that it silently overwrites the setting. This is counterintuitive and must be well documented imho.
- Please clearly document what happens if the ANGLE_BRACKET_TAG_SYNTAX setting is forced nonetheless by reversing the order of the statements like this:
```
final Configuration fmConfig = new Configuration(Configuration.VERSION_2_3_28);
((FreemarkerTemplateEngine) report.getTemplateEngine()).setFreemarkerConfiguration(fmConfig);
// Set the flag afterwards, so it really gets respected
fmConfig.setTagSyntax(Configuration.ANGLE_BRACKET_TAG_SYNTAX);
```
In this case I get a strange error about FreeMarker failing to parse the expression "${___info.imageId}". I have no idea where this comes from, but certainly not from the content of my template.

matthiasbasler commented 4 years ago

Sorry, accidentally closed -> reopened.

opensagres / xdocreport

SAXParseException reading template document with Freemarker angle brackets #398