yawlfoundation / yawl

Yet Another Workflow Language
http://www.yawlfoundation.org
GNU Lesser General Public License v3.0

Invalid XML if WorkItemRecord class extended attributes contain quotes #612

Closed mlawry closed 6 years ago

mlawry commented 6 years ago

I'm getting an exception in WorkletService.handleCheckWorkItemConstraintEvent() because the method it calls ends up calling WorkItemRecord.toXML(). The exception stacktrace is:

org.jdom2.input.JDOMParseException: Error on line 1: Element type "workItemRecord" must be followed by either attribute specifications, ">" or "/>".
at org.jdom2.input.sax.SAXBuilderEngine.build(SAXBuilderEngine.java:232) ~[jdom2-2.0.6.jar:?]
at org.jdom2.input.sax.SAXBuilderEngine.build(SAXBuilderEngine.java:303) ~[jdom2-2.0.6.jar:?]
at org.jdom2.input.SAXBuilder.build(SAXBuilder.java:1196) ~[jdom2-2.0.6.jar:2.0.6]
at org.yawlfoundation.yawl.util.JDOMUtil.stringToDocument(JDOMUtil.java:100) [yawl-lib-4.1.jar:?]
at org.yawlfoundation.yawl.util.JDOMUtil.stringToElement(JDOMUtil.java:115) [yawl-lib-4.1.jar:?]
at org.yawlfoundation.yawl.worklet.exception.ExceptionService.augmentItemData(ExceptionService.java:1241) [yawl-lib-4.1.jar:?]
at org.yawlfoundation.yawl.worklet.exception.ExceptionService.handleCheckWorkItemConstraintEvent(ExceptionService.java:130) [yawl-lib-4.1.jar:?]
at org.yawlfoundation.yawl.worklet.WorkletService.handleCheckWorkItemConstraintEvent(WorkletService.java:333) [yawl-lib-4.1.jar:?]
at org.yawlfoundation.yawl.engine.interfce.interfaceX.InterfaceX_ServiceSideServer.processPostQuery(InterfaceX_ServiceSideServer.java:139) [yawl-lib-4.1.jar:?]
at org.yawlfoundation.yawl.engine.interfce.interfaceX.InterfaceX_ServiceSideServer.doPost(InterfaceX_ServiceSideServer.java:100) [yawl-lib-4.1.jar:?]
at javax.servlet.http.HttpServlet.service(HttpServlet.java:648) [tomcat8-servlet-api-8.0.32.jar:?]
at javax.servlet.http.HttpServlet.service(HttpServlet.java:729) [tomcat8-servlet-api-8.0.32.jar:?]
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:292) [tomcat8-catalina-8.0.32.jar:8.0.32]
...

The problem seems to be that my WorkItemRecord contains extended attributes with double quotes in the value. WorkItemRecord.toXML() would output something like this:

<workItemRecord instructions="{"accountId":null,"isSignOff":null}">
    <id>103:fb5ef46e-e3a7-48a7-a444-aa606191ecbc</id>
    <specversion>1.0</specversion>
    <specuri>Test</specuri>
    <caseid>103</caseid>
    ...

The instructions attribute stores JSON data, so the resulting XML is invalid because the double quotes are not encoded. I had a look at the WorkItemRecord class and saw that it stores extended attributes as both a String (variable _extendedAttributes) and a Map<String, String> (variable _attributeTable). The toXML() method uses the String variable _extendedAttributes, which gets its value from the _attributeTable Map. The code that converts between these two variables doesn't encode or decode special characters, which causes the invalid XML output.
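To make the failure mode concrete, here is a minimal sketch of a Map-to-String conversion that escapes each value before writing it into the attribute string. It is not the YAWL code: the class and the helpers escapeAttribute, attributesToString and roundTrip are hypothetical names, and the check uses the JDK's built-in DOM parser rather than JDOM so that the example has no dependencies.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical illustration only; none of these names exist in YAWL.
final class AttributeEscapeSketch {

    // Escapes characters that would break a double-quoted XML attribute value.
    // Line breaks and tabs are written as numeric character references because
    // a parser normalises literal whitespace in attribute values to spaces.
    // '>' does not strictly need escaping here; it is escaped for symmetry.
    static String escapeAttribute(String value) {
        StringBuilder sb = new StringBuilder(value.length() + 16);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\n': sb.append("&#xA;");  break;
                case '\r': sb.append("&#xD;");  break;
                case '\t': sb.append("&#x9;");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Assumed stand-in for the conversion from the attribute Map to the
    // String that is spliced into the opening <workItemRecord ...> tag.
    static String attributesToString(Map<String, String> attributes) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            sb.append(' ').append(e.getKey())
              .append("=\"").append(escapeAttribute(e.getValue())).append('"');
        }
        return sb.toString();
    }

    // Builds a one-element document, parses it, and returns the attribute
    // value as the parser sees it.
    static String roundTrip(String name, String value) {
        String xml = "<workItemRecord"
                + attributesToString(Collections.singletonMap(name, value)) + "/>";
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return root.getAttribute(name);
        } catch (Exception e) {
            throw new IllegalStateException("Generated XML did not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        String json = "{\"accountId\":null,\"isSignOff\":null}";
        System.out.println(attributesToString(
                Collections.singletonMap("instructions", json)));
        System.out.println(json.equals(roundTrip("instructions", json)));
    }
}
```

With the escaping in place the element parses, and the parser hands back the original JSON string, so no separate decoding step is needed on the reading side as long as the attribute is read through an XML parser.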

I know there are the JDOMUtil.encodeEscapes(String) and JDOMUtil.decodeEscapes(String) methods, but I'm not sure whether using them when converting between the _attributeTable and _extendedAttributes variables would reintroduce something like Issue #609 if extended attributes contain newlines. If we use the new JDOMUtil.encodeAttributeEscapes(String) method, then we also need an equivalent method to decode the numeric entities, such as &#xA;, that the encoding process generates.

I had a look around and the simplest method that can do the decoding is StringEscapeUtils.unescapeXml(String) from commons-lang3-3.6.jar (which is already available in the classpath). That method ends up using a NumericEntityUnescaper which will decode any numerically encoded entity (in addition to the special named XML entities).
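For illustration, the kind of decoding described here can be sketched with the JDK alone. This is not the commons-lang implementation and not what YAWL does. The class EntityDecodeSketch and its decode method are hypothetical, and the sketch handles only the five named XML entities plus decimal and hexadecimal character references.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the commons-lang or YAWL implementation.
final class EntityDecodeSketch {

    // Matches the five predefined XML entities and numeric character
    // references in hexadecimal (&#xA;) or decimal (&#10;) form.
    private static final Pattern ENTITY =
            Pattern.compile("&(#x[0-9A-Fa-f]+|#[0-9]+|amp|lt|gt|quot|apos);");

    // Decodes every entity in a single left-to-right pass, so text produced
    // by a replacement is never decoded a second time.
    static String decode(String s) {
        Matcher m = ENTITY.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String body = m.group(1);
            String replacement;
            if (body.startsWith("#x")) {
                int codePoint = Integer.parseInt(body.substring(2), 16);
                replacement = new String(Character.toChars(codePoint));
            } else if (body.startsWith("#")) {
                int codePoint = Integer.parseInt(body.substring(1), 10);
                replacement = new String(Character.toChars(codePoint));
            } else {
                switch (body) {
                    case "amp":  replacement = "&";  break;
                    case "lt":   replacement = "<";  break;
                    case "gt":   replacement = ">";  break;
                    case "quot": replacement = "\""; break;
                    default:     replacement = "'";  break; // "apos"
                }
            }
            // quoteReplacement keeps '$' and '\' in the replacement literal.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("{&quot;accountId&quot;:null}"));
        System.out.println(decode("line one&#xA;line two"));
    }
}
```

The single pass matters: an input such as `&amp;#xA;` should decode to the literal text `&#xA;`, not to a newline, which chained replaceAll calls can get wrong. Out-of-range code points make this sketch throw, so a production version would need to decide how to handle them.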

Confusingly, the StringEscapeUtils class in commons-lang 3.6 is deprecated in favour of the same class in commons-text, which is actually adapted from commons-lang 3.5. So maybe just ignore the deprecation warning unless commons-text is available.

yawlfoundation commented 6 years ago

I've done a refactor of WorkItemRecord to resolve this issue. I've handled the unescaping with JDOM, to save using a deprecated method.

mlawry commented 6 years ago

Thanks!