Leonidas-from-XIV / node-xml2js

XML to JavaScript object converter.
MIT License
4.91k stars 606 forks

Failing to parse CDATA tags. #439

Closed cmseaton42 closed 6 years ago

cmseaton42 commented 6 years ago

The Issue

The XML below is being converted to the object shown after it.

<Member Name="Push" DataType="BIT" Dimension="0" Radix="Decimal" Hidden="false" Target="ZZZZZZZZZZPB_OneButt0" BitNumber="0" ExternalAccess="Read/Write">
    <Description>
        <![CDATA[*]]>
    </Description>
</Member>

Converted object snippet:

{ '$':
    { Name: 'Push',
        DataType: 'BIT',
        Dimension: '0',
        Radix: 'Decimal',
        Hidden: 'false',
        Target: 'ZZZZZZZZZZPB_OneButt0',
        BitNumber: '0',
        ExternalAccess: 'Read/Write' },
    Description: [ '\r\n*\r\n' ] }, // Is this correct?

I am attempting to write this back to another file, and the CDATA section is being written as follows:

<Member Name="Push" DataType="BIT" Dimension="0" Radix="Decimal" Hidden="false" Target="ZZZZZZZZZZPB_OneButt0" BitNumber="0" ExternalAccess="Read/Write">
<Description>&#xD;
*&#xD;
</Description>
</Member>

Why is this occurring? Is there an error in my script (see below)?

const { Parser, Builder } = require("xml2js");
const { readFileSync, writeFileSync } = require("fs");
const util = require("util");

const parser = new Parser();
const builder = new Builder({
    cdata: true
});
const file = readFileSync("./PB_OneButton.xml");
let parsed = null;

parser.parseString(file, (err, data) => {
    if(err) throw err;

    parsed = data;
});

console.log(util.inspect(parsed, false, null));
writeFileSync("./test.xml", builder.buildObject(parsed));

jcsahnwaldt commented 6 years ago

@cmseaton42 The parser and builder most likely work correctly. Your XML contains line breaks before and after the CDATA section which get encoded in different ways in JavaScript and XML (&#xD; is the XML encoding for \r). Try this instead:

<Description><![CDATA[*]]></Description>

If you can't change the input XML, try the trim (and maybe normalize) options of the parser: https://github.com/Leonidas-from-XIV/node-xml2js/blob/master/README.md#options
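
To make the explanation above concrete, here is a minimal sketch in plain JavaScript that does not call xml2js at all. Both `escapeCarriageReturns` and `trimStrings` are hypothetical helpers written for illustration. The first is only a simplified stand-in for how a builder might encode `\r`, and the second approximates what trimming text values would achieve by post-processing an already parsed object. Note that, unlike a parser-level option, this helper also trims attribute values.

```javascript
// Hypothetical helper: models only the carriage-return case of XML text
// escaping. This is an illustration, not the builder's actual code.
function escapeCarriageReturns(text) {
    return text.replace(/\r/g, "&#xD;");
}

// Hypothetical helper: recursively trims every string in an object shaped
// like the parser output shown in the issue (attributes included).
function trimStrings(node) {
    if (typeof node === "string") return node.trim();
    if (Array.isArray(node)) return node.map(trimStrings);
    if (node !== null && typeof node === "object") {
        const out = {};
        for (const [key, value] of Object.entries(node)) {
            out[key] = trimStrings(value);
        }
        return out;
    }
    return node;
}

// Parsed snippet from the issue, reduced to the relevant fields.
const member = {
    $: { Name: "Push", DataType: "BIT" },
    Description: ["\r\n*\r\n"]
};

// Before trimming: the \r characters surface as &#xD; when escaped.
const escaped = escapeCarriageReturns(member.Description[0]);
console.log(JSON.stringify(escaped)); // "&#xD;\n*&#xD;\n"

// After trimming: only the CDATA payload remains.
const cleaned = trimStrings(member);
console.log(JSON.stringify(cleaned.Description)); // ["*"]
```

If the input XML can be parsed with trimming enabled in the first place, a post-processing step like this should not be needed.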

jcsahnwaldt commented 6 years ago

I think this issue can be closed.

cmseaton42 commented 6 years ago

I ended up using a different module to handle this. The format that I am trying to parse is a superset of XML used by Rockwell Automation (".L5X").