Leonidas-from-XIV / node-xml2js

XML to JavaScript object converter.
MIT License

Question about parsing XML attribute names #640

Open jasonkhanlar opened 2 years ago

jasonkhanlar commented 2 years ago

I have been using Node.js exec() with jq/yq/xq to convert between XML and JSON. When I compare the XML-to-JSON output of xml2js with that of xq, the returned data is different.

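// Assumes exec is util.promisify(child_process.exec), which resolves to { stdout, stderr } and rejects on error.
const { stdout, stderr } = await exec(`cat ${file}|xq`, {maxBuffer: 1024 * 1024 * 1024});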

The value of stdout appears as:

{
  mediawiki: {
    '@xmlns': 'http://www.mediawiki.org/xml/export-0.10/',
    '@xmlns:xsi': 'http://www.w3.org/2001/XMLSchema-instance',
    '@xsi:schemaLocation': 'http://www.mediawiki.org/xml/export-0.10/ http://www.mediawiki.org/xml/export-0.10.xsd',
    '@version': '0.10',
    '@xml:lang': 'en',
    siteinfo: {
      sitename: 'Wikipedia',
      dbname: 'enwiki',
      base: 'https://en.wikipedia.org/wiki/Main_Page',
      generator: 'MediaWiki 1.38.0-wmf.24',
      case: 'first-letter',
      namespaces: [Object]
    },
    page: [
      [Object], [Object], [Object], [Object], [Object], [Object]
    ]
  }
}

but when using:

// fs.promises.readFile takes no callback; a read error rejects the awaited promise.
xml2js.parseString(await fs.promises.readFile(file, 'utf8'), function (err, result) { if (err) throw err; console.dir(result); });

the output appears as:

{
  mediawiki: {
    '$': {
      xmlns: 'http://www.mediawiki.org/xml/export-0.10/',
      'xmlns:xsi': 'http://www.w3.org/2001/XMLSchema-instance',
      'xsi:schemaLocation': 'http://www.mediawiki.org/xml/export-0.10/ http://www.mediawiki.org/xml/export-0.10.xsd',
      version: '0.10',
      'xml:lang': 'en'
    },
    siteinfo: [ [Object] ],
    page: [
      [Object], [Object], [Object], [Object], [Object], [Object]
    ]
  }
}

Is there a reason for the difference, or a way to configure the output to match?
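
To make the difference concrete: the two outputs above differ in where attributes live (grouped under '$' versus prefixed with '@') and in whether single children are wrapped in arrays. The following is only a rough sketch of a post-processing step that reshapes a default xml2js result into the xq-style layout. `toXqShape` is a hypothetical helper, not part of xml2js; the sample data is abbreviated and partly made up for illustration; and the '_' to '#text' mapping assumes xml2js's default text key and xq's usual text key.

```javascript
// Hypothetical helper (not part of xml2js): reshape a default xml2js result
// into an xq-style layout.
function toXqShape(node) {
  // Strings and other primitives pass through unchanged.
  if (node === null || typeof node !== 'object') return node;
  // Arrays: convert each item, then unwrap single-element arrays.
  if (Array.isArray(node)) {
    const items = node.map(toXqShape);
    return items.length === 1 ? items[0] : items;
  }
  const out = {};
  for (const [key, value] of Object.entries(node)) {
    if (key === '$') {
      // xml2js groups attributes under '$'; prefix each name with '@' instead.
      for (const [attr, attrValue] of Object.entries(value)) {
        out['@' + attr] = attrValue;
      }
    } else if (key === '_') {
      // Assumption: text stored under '_' corresponds to '#text' in xq output.
      out['#text'] = value;
    } else {
      out[key] = toXqShape(value);
    }
  }
  return out;
}

// Illustrative input in the default xml2js shape (child values are made up).
const sample = {
  mediawiki: {
    $: { version: '0.10', 'xml:lang': 'en' },
    siteinfo: [{ sitename: ['Wikipedia'], dbname: ['enwiki'] }],
    page: [{ title: ['A'] }, { title: ['B'] }],
  },
};
const converted = toXqShape(sample);
console.dir(converted, { depth: null });
```

This does not cover every case (for example, empty elements or a repeated element that happens to occur only once), so it is a starting point rather than a guaranteed match.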


Edited to add:

let parser = new xml2js.Parser({
    attrNameProcessors: [ function (name) { console.log(name); return name } ]
});

With this parser, the names of the attributes still do not have the '@' symbol at the beginning. I couldn't find any option that keeps the data exactly identical to the source XML data, without modification. Is there something I missed?
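
A possible direction, sketched under assumptions rather than confirmed: the processor above returns each name unchanged, so no '@' can appear unless the processor adds it. Combining `attrNameProcessors` with the xml2js options `mergeAttrs`, `explicitArray`, and `charkey` might bring the result closer to the xq layout. The variable name `xqLikeOptions` is made up for illustration, and whether the result matches xq in every case is untested here.

```javascript
// A sketch, not a confirmed recipe: xml2js parser options that may bring the
// parsed result closer to xq's layout.
const xqLikeOptions = {
  explicitArray: false, // only repeated children become arrays
  mergeAttrs: true,     // put attributes on the element itself, not under '$'
  attrNameProcessors: [(name) => '@' + name], // add the '@' prefix to each name
  charkey: '#text',     // key used for text that sits next to attributes
};

// Usage (requires the xml2js package, inside an async function):
//   const xml2js = require('xml2js');
//   const parser = new xml2js.Parser(xqLikeOptions);
//   const result = await parser.parseStringPromise(xml);

// The processor can be checked on its own, without xml2js:
console.log(xqLikeOptions.attrNameProcessors[0]('xml:lang'));
```

The final line only demonstrates the name processor in isolation; the parser usage is left in comments because it depends on xml2js being installed.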