Leonidas-from-XIV / node-xml2js

XML to JavaScript object converter.
MIT License

Bug: non-strict parsing mode converts tag-names to UPPERCASE. #501

Closed Domvel closed 5 years ago

Domvel commented 5 years ago

Bug-Report

I am wondering why the tag names of a parsed XML document are converted to UPPERCASE.

Input:

<PascalCase>Foo</PascalCase>

Output:

{ "PASCALCASE": ["Foo"] }

And by the way, why is the value wrapped in an array? OK, I think it is because a tag could contain other tags as children. But why uppercase?

xml2js 0.4.19 (latest), @types/xml2js 0.4.3 (latest)

Leonidas-from-XIV commented 5 years ago

I don't think it should uppercase. Can you post a minimal example on how this happens?

The result is in an array because you might potentially have multiple things in your tag, like <PascalCase><Haskell/><Idris/></PascalCase>. Always using an array keeps the output shape consistent, whether there is one child or many.
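For readers who find the single-element arrays inconvenient, here is a minimal sketch of collapsing them after parsing. `unwrapSingletons` is a hypothetical helper written for this illustration and is not part of xml2js; the input shape is the one reported above. xml2js also documents an `explicitArray` option that controls array wrapping at parse time, which is probably the better choice when it fits.

```javascript
// Hypothetical helper (not part of xml2js): collapse arrays that hold
// exactly one element, recursing through nested objects and arrays.
function unwrapSingletons(node) {
  if (Array.isArray(node)) {
    const items = node.map(unwrapSingletons);
    return items.length === 1 ? items[0] : items;
  }
  if (node !== null && typeof node === "object") {
    const out = {};
    for (const [key, value] of Object.entries(node)) {
      out[key] = unwrapSingletons(value);
    }
    return out;
  }
  return node; // strings and other primitives pass through unchanged
}

// Shape taken from the output reported in this thread.
const parsed = { PascalCase: ["Foo"] };
console.log(unwrapSingletons(parsed)); // { PascalCase: 'Foo' }
```

The trade-off is the one described above: once single-element arrays are collapsed, the shape of a key depends on how many children the input happened to contain.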

Domvel commented 5 years ago

Thanks for the fast response. See this example:

const xmlString = `
<?xml version="1.0" encoding="UTF-8"?>
<foo:BLA xmlns:foo="about:blank" Version="1.2.3.4">
  <foo:EntryItem>
    <foo:Id>1</foo:Id>
    <foo:Name>One</foo:Name>
  </foo:EntryItem>
</foo:BLA>
`;

const parser = new xml2js.Parser({ strict: false, trim: true });
parser.parseString(xmlString, (err, result) => {
  this.xml = result;
});

Result:

{
  "FOO:BLA": {
    "$": {
      "XMLNS:FOO": "about:blank",
      "VERSION": "1.2.3.4"
    },
    "FOO:ENTRYITEM": [
      {
        "FOO:ID": ["1"],
        "FOO:NAME": ["One"]
      }
    ]
  }
}

Domvel commented 5 years ago

I see that it works if I remove strict: false (that is, with the default strict: true). But why? By the way, I disabled strict mode because I have a "dirty" XML. Anyway, why does non-strict mode uppercase the tag names?

Leonidas-from-XIV commented 5 years ago

Can you make a minimal example which does not involve dependencies outside of xml2js, like Angular?

Domvel commented 5 years ago

Blank JavaScript Example

Note: The dependencies timers, string_decoder and events are required to work with xml2js (node).

Leonidas-from-XIV commented 5 years ago

That's interesting. Looks like the underlying parser, sax-js, is uppercasing tag names for some reason…

Domvel commented 5 years ago

So this is an issue in the sax-js repository, not in xml2js?

Leonidas-from-XIV commented 5 years ago

I assume so, since we never actually do anything with the strict option except for passing it to sax-js:

https://github.com/Leonidas-from-XIV/node-xml2js/blob/0f0a2980b3db5366b3d50e4d0da422cb7af100c9/src/parser.coffee#L70-L74

Domvel commented 5 years ago

OK, your project xml2js uses sax version 0.6.0, but the latest is 1.2.4. Maybe an update will fix this? But wait, take a look at this issue on the sax repo.

In non-strict mode (like HTML), tags are not case-sensitive, yes. They are converted to uppercase by default.

I do not think this is going to change (that issue is from late 2011 😮). In my opinion, tag-name handling and strict-mode interpretation should be split into an HTML mode and an XML mode. Anyway, I found another solution for myself: it works with the default strict: true. Still, I do not think the behavior is right. After all, "it's considered good practice to keep HTML markup lowercase." Why not just keep the original case? 🤔

Thanks for your support. 🙂
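For anyone who has to stay in non-strict mode because the input is not well-formed, one possible workaround is to rename the uppercased keys after parsing. The sketch below assumes you know the expected tag and attribute names in advance; `restoreCase` is a hypothetical helper written for this illustration and is not part of xml2js or sax-js. The input object is the result posted earlier in this thread.

```javascript
// Hypothetical helper (not part of xml2js or sax-js): rename uppercased keys
// back to a known spelling. Keys that are not in the list stay as they are.
function restoreCase(node, knownNames) {
  const lookup = new Map(knownNames.map((name) => [name.toUpperCase(), name]));
  const visit = (value) => {
    if (Array.isArray(value)) return value.map(visit);
    if (value !== null && typeof value === "object") {
      const out = {};
      for (const [key, child] of Object.entries(value)) {
        const restored = lookup.has(key) ? lookup.get(key) : key;
        out[restored] = visit(child);
      }
      return out;
    }
    return value; // text content is left untouched
  };
  return visit(node);
}

// Result of the non-strict parse reported above.
const uppercased = {
  "FOO:BLA": {
    "$": { "XMLNS:FOO": "about:blank", "VERSION": "1.2.3.4" },
    "FOO:ENTRYITEM": [{ "FOO:ID": ["1"], "FOO:NAME": ["One"] }]
  }
};

const names = ["foo:BLA", "xmlns:foo", "Version", "foo:EntryItem", "foo:Id", "foo:Name"];
console.log(JSON.stringify(restoreCase(uppercased, names), null, 2));
```

This only works when no two expected names differ by case alone, because the original case is already lost by the time the result is available. If the input can be repaired instead, parsing with the default strict: true avoids the problem entirely.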