john-liu / jaql

Automatically exported from code.google.com/p/jaql

Bug of XmlToJsonFn #54

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
I find that XmlToJsonFn has the following three problems. Is my understanding
correct? I have produced a fix for all three. If my understanding is
correct, I will commit the fix.

1. There is always a null in the resulting JSON array.
xmlToJson("<?xml version=\"1.0\" encoding=\"UTF-8\"?><a><z>China</z></a>");

{
  "a": [
    null,
    {
      "z": [
        null,
        "China"
      ]
    }
  ]
}

2. A node can't have two child nodes with the same name.
xmlToJson("<?xml version=\"1.0\" encoding=\"UTF-8\"?><a><z>1</z><z>2</z></a>");

java.lang.RuntimeException: duplicate field name: z
java.lang.RuntimeException: duplicate field name: z
        at com.ibm.jaql.json.type.BufferedJsonRecord.addOrSet(BufferedJsonRecord.java:218)
        at com.ibm.jaql.json.type.BufferedJsonRecord.add(BufferedJsonRecord.java:201)
        at com.ibm.jaql.lang.expr.xml.XmlToJsonHandler2.endElement(XmlToJsonFn.java:331)
        at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:601)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScannerImpl.java:1774)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2930)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
        at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510)
        at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:807)
        at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
        at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:107)
        at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
        at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
        at com.ibm.jaql.lang.expr.xml.XmlToJsonFn.eval(XmlToJsonFn.java:84)
        at com.ibm.jaql.lang.Jaql.run(Jaql.java:418)
        at com.ibm.jaql.lang.Jaql.run(Jaql.java:71)
        at com.ibm.jaql.util.shell.AbstractJaqlShell.runInteractively(AbstractJaqlShell.java:50)
        at com.ibm.jaql.util.shell.AbstractJaqlShell.main(AbstractJaqlShell.java:85)
        at JaqlShell.main(JaqlShell.java:271)

3. Attributes are nested.
xmlToJson("<?xml version=\"1.0\" encoding=\"UTF-8\"?><pre1:book
xmlns:pre1=\"http://www.pre1.com\"
xmlns:pre2=\"http://www.pre2.com\"><pre1:chapter pre1:name=\"Intro\"
pre2:length=\"\">Programming</pre1:chapter></pre1:book>");

{
  "http://www.pre1.com": {
    "book": [
      null,
      {
        "http://www.pre1.com": {
          "chapter": [
            null,
            {
              "http://www.pre1.com": {
                "@name": "Intro",
                "http://www.pre2.com": {
                  "@length": ""
                }
              },
              "text()": "Programming"
            }
          ]
        }
      }
    ]
  }
}

Original issue reported on code.google.com by yaojingguo@gmail.com on 25 Sep 2009 at 8:48

GoogleCodeExporter commented 9 years ago
The first two problems are caused by the logic being inverted.

Line 329 should be negated:
      if( ! parent.containsKey(jLocalName) )

The third bug is caused by adding attributes to the wrong record in
startElement(); it should use the originally created record on each loop
iteration.
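
To illustrate the negated check, here is a minimal, self-contained Java sketch. It is not Jaql's actual code: the class ChildGrouping and the method addChild are hypothetical, plain java.util maps stand in for BufferedJsonRecord, and storing a single child as a scalar (rather than a one-element array) is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not taken from XmlToJsonFn.java.
public class ChildGrouping {

    /** Adds a child under `name`; repeated names are collected into a list. */
    @SuppressWarnings("unchecked")
    public static void addChild(Map<String, Object> parent, String name, Object value) {
        if (!parent.containsKey(name)) {
            // First occurrence: store the value directly, with no null placeholder.
            parent.put(name, value);
        } else {
            Object existing = parent.get(name);
            if (existing instanceof List) {
                // Third or later occurrence: append to the existing list.
                ((List<Object>) existing).add(value);
            } else {
                // Second occurrence: promote the scalar to a list.
                List<Object> values = new ArrayList<>();
                values.add(existing);
                values.add(value);
                parent.put(name, values);
            }
        }
    }

    public static void main(String[] args) {
        // Mirrors <a><z>1</z><z>2</z></a> from problem 2.
        Map<String, Object> a = new LinkedHashMap<>();
        addChild(a, "z", "1");
        addChild(a, "z", "2");
        System.out.println(a); // prints {z=[1, 2]}
    }
}
```

With the check the right way round, a repeated child name is merged instead of raising "duplicate field name", and no null element is created for the first child.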

While you're fixing this code, you could also eliminate insignificant text()
fields (those with only whitespace).  The example below produces one.  It is
the same as your example above, but shows how to avoid quoting issues using
"here doc" syntax, as in Perl.  You could also use single quotes in Jaql and
double quotes in XML to avoid escaping the quotes (but you'd still have to
escape backslashes).

xmlToJson(<<ENDDOC
<?xml version="1.0" encoding="UTF-8"?>
<pre1:book xmlns:pre1="http://www.pre1.com" xmlns:pre2="http://www.pre2.com">
  <pre1:chapter pre1:name="Intro" pre2:length="">Programming</pre1:chapter>
</pre1:book>
ENDDOC );
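
The suggestion to drop whitespace-only text() fields could be sketched as follows. This is a hedged illustration, not the fix committed to Jaql: the class TextFieldFilter and its methods are hypothetical names, and a java.util map stands in for the JSON record that the handler builds.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not taken from XmlToJsonFn.java.
public class TextFieldFilter {

    /** Returns true if the text contains at least one non-whitespace character. */
    public static boolean isSignificant(CharSequence text) {
        for (int i = 0; i < text.length(); i++) {
            if (!Character.isWhitespace(text.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    /** Adds a "text()" field only when the collected character data is significant. */
    public static void addText(Map<String, Object> record, CharSequence text) {
        if (isSignificant(text)) {
            record.put("text()", text.toString());
        }
    }

    public static void main(String[] args) {
        Map<String, Object> book = new LinkedHashMap<>();
        addText(book, "\n  ");          // indentation between elements: skipped
        Map<String, Object> chapter = new LinkedHashMap<>();
        addText(chapter, "Programming"); // real content: kept
        System.out.println(book.isEmpty());        // prints true
        System.out.println(chapter.get("text()")); // prints Programming
    }
}
```

In the here-doc example, the newlines and indentation inside pre1:book are the kind of character data this check would skip.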

Original comment by dr.be...@gmail.com on 7 Oct 2009 at 9:48

GoogleCodeExporter commented 9 years ago
Fixed in Revision 382. Thanks for the suggestion of "here doc" syntax for
embedding XML when writing documents.

Original comment by yaojingguo@gmail.com on 9 Oct 2009 at 1:21