Open attermann opened 6 years ago
Anybody have any input here? I can't imagine I'm the only one needing to correctly encode an XML array.
For the following sample:
const xml2js = require('xml2js');
const headers = {
headers: [
{ header: { '$': { 'method': 'INVITE', name: 'X-Mode', value: 'standard' } } },
{ header: { '$': { 'method': 'INVITE', name: 'X-Prop', value: 'default' } } }
]
}
console.log(new xml2js.Builder().buildObject(headers));
I am seeing the following output:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<headers>
<header method="INVITE" name="X-Mode" value="standard"/>
<header method="INVITE" name="X-Prop" value="default"/>
</headers>
Is there any context missing from the question?
The culprit is here. A new parent node is being created each time the loop is iterated. The root-level array handling doesn't have this problem...
I can confirm that I have just run into the same issue 🤒
Try:
headers: { header: [ {'$': { 'method': 'INVITE', name: ‘X-Mode', value: 'standard' }}, {'$': { 'method': 'INVITE', name: ‘X-Prop', value: ‘default' }} ] }
Confirmed findings of @ccamarat that the issue exists only when array is not at root-level:
const xml2js = require('xml2js');
const headers = {
account: {
headers: [
{ header: { '$': { 'method': 'INVITE', name: 'X-Mode', value: 'standard' } } },
{ header: { '$': { 'method': 'INVITE', name: 'X-Prop', value: 'default' } } }
]
}
}
console.log(new xml2js.Builder().buildObject(headers));
Output:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<account>
<headers>
<header method="INVITE" name="X-Mode" value="standard"/>
</headers>
<headers>
<header method="INVITE" name="X-Prop" value="default"/>
</headers>
</account>
Also confirmed that the fix suggested by @dbrooks75 is a suitable workaround:
const xml2js = require('xml2js');
const headers = {
account: {
headers: {
header: [
{ '$': { 'method': 'INVITE', name: 'X-Mode', value: 'standard' } },
{ '$': { 'method': 'INVITE', name: 'X-Prop', value: 'default' } }
]
}
}
}
console.log(new xml2js.Builder().buildObject(headers));
Output:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<account>
<headers>
<header method="INVITE" name="X-Mode" value="standard"/>
<header method="INVITE" name="X-Prop" value="default"/>
</headers>
</account>
Thanks a lot @dbrooks75!
@attermann Can this issue be closed?
While the work around does work, the fact that root-level arrays are interpreted differently than nested arrays seems like a problem. One expects the same input to always produce the same output, regardless of its nesting level.
@ccamarat OK, I see. What you wrote makes sense, but the problem is that xml2js cannot always produce the same output for the same input regardless of nesting level, because there cannot be multiple XML root elements.
More generally: Currently, buildObject
accepts some JS objects that are 'unusual' (meaning parseString
would not produce them) and turns them into more or less useful, but often broken or surprising XML.
I think buildObject
should simply reject most structures that parseString
would not produce. A meaningful error message is better than a broken XML result, and the worst is XML that is well-formed but has an unexpected structure, because it makes you think your code is almost correct, when in fact it's quite wrong...
Example:
buildObject({y: {x: ['a', 'b']}})
currently produces the following XML:
<y>
<x>a</x>
<x>b</x>
</y>
That's fine.
But if we want the same output for the same input regardless of its nesting level, then buildObject({x: ['a', 'b']})
should produce the following XML:
<x>a</x>
<x>b</x>
But that's not a well-formed XML document... :(
I think buildObject({x: ['a', 'b']})
should throw an exception.
I think there are two or three deeper questions here:
parseString
and buildObject
be 'symmetrical'?XML has elements, attributes and text content. JS has objects, arrays and primitive types. xml2js generally represents each XML element as a JS object (and vice versa).
#name
property.)buildObject
behavior is messy in this respect...parseString
uses arrays for the case when there are multiple XML elements with the same name.buildObject
behavior is a bit messy in this respect.parseString
and buildObject
be 'symmetrical'?In general, buildObject
is designed to handle JS objects whose structure is equal (or at least very similar) to the structure of the objects produced parseString
. But JS objects can have many shapes that parseString
would never produce. What should buildObject
do with such objects?
Examples:
buildObject(1)
buildObject
behavior: root
, but also configurablebuildObject([123])
, buildObject(['123'])
, buildObject([123, 456])
buildObject
behavior: produces garbage.buildObject({x: [123]})
buildObject({x: ['foo']})
buildObject
behavior: produces garbage.[[]]
, [[[]]]
, etc.
buildObject
behavior: produces garbage.parseString
be configurable to produce all kinds of desired JS object structures, e.g. top-level arrays, nested arrays, etc.?buildObject
try to produce proper XML for every possible JS object?
It looks like buildObject
has grown organically - whenever there was a feature request, it was added. That's fine, but it has led to behavior that is not well-defined for many corner cases.
In the long run, it would probably be better if buildObject
was a bit stricter: produce well-formed XML for well-formed JS objects, but reject (by throwing an error) JS objects that are not well-formed.
In a nutshell: buildObject
should be flexible, but not so flexible that no one can predict what it will do with a certain JS object.
Don't get me wrong, I love xml2js. It's super easy to use, yet super flexible. APIs that combine both of these properties pretty are pretty rare. Good job!
buildObject({x: ['a', 'b']})
seems fine to me if you set explicitRoot
and rootName
.
@jcsahnwaldt - great analysis. Agreed that my statement was overly general and you picked it apart well.
My impression is that the library is "good enough" for the JavaScript community as a whole, and XML usage in web applications send to be decline declining so devoting energy to making it "perfect" may not be the best use of the maintainers' time.
You made the comment "I think buildObject should simply reject most structures that parseString would not produce. A meaningful error message is better than a broken XML result, and the worst is XML that is well-formed but has an unexpected structure, because it makes you think your code is almost correct, when in fact it's quite wrong..."
Perhaps this is all that's really needed. It would satisfy me, in any case.
@jcsahnwaldt I really appreciate your involvement, looking at issues and answering and closing them. Thanks!
So let me weight in with some history, some early mistakes and some future directions (hopefully).
As such, I am not the original author of xml2js
, I took it over from @maqr as he didn't maintain it and I needed it for some project (which eventually never materialized, but so is the nature of things). I merged in some existing pull requests, wrote tests and created documentation. I also rewrote the library to use CoffeeScript (that was back then when CoffeeScript was popular, nowadays I wouldn't use it anymore, but these were different times). When I took over, the library had already some users and a number of options like trim
etc, that affected the output.
Over time, a number of features were contributed, among these the XML builder (the original name xml2js
only signifies a one way conversion). This is a very popular feature, but I almost never hear back from contributors about maintaining their added feature, so the builder is in a bit of a nowhere-land where it is sort of easy to produce JS objects with the parser that the builder can't serialize back.
I have added a number of options to make the JS object output a little more predictable (like explicitArray
to avoid breaking code if suddenly one element ends up having more than one node and the code suddenly starts failing), but have retained backwards compatibility options with older releases so you can configure it to output exactly the same format it did 7 years ago), while also making the API a bit nicer. But the existence of these options creates issues all the time and the builder jumps through all kinds of hoops to generate some XML from JS objects that don't make sense, so the resulting XML sometimes tends to also not make any sense.
That should not be the case. What I much rather have is to have a fixed, documented semantics (preferably with Flow/TypeScript definitions) on how xml2js
will parse XML (also, XML, not "something that contains < and > and sort of looks like XML if you squint) into a JS object, reliably error if the XML is not well-formed and probably be callback-less (I also plan to stop using sax.js, which uses callbacks but is not actually async, a property that xml2js
unfortunately inherits, but depending on the sync behaviour in xml2js
feels a bit wrong).
So yes, I think the parser and builder should be round-tripable (at least in the semantic sense of XML, I don't necessarily care about reproducing the input XML byte for byte), but for this the parser must lose some options. Fortunately, these options are generally not a good idea either and if people want to do that, they can easily do that as a post-processing step after the parser has finished.
I have planned to rework the code (since, if you look inside it, it is a mess of deeply nested if statements and mysteriously mutated state) in a more typed and rigid way so it is actually possible to maintain the code, but since I don't actually use xml2js
what I am mostly left with is merging contributions I think I can continue to maintain and make sense. I think I said way too often yes to features that were not a good idea 😉
Going back to the issue at hand, I think it would make sense if the builder just rejected JS objects that it can't convert to XML instead of generating a hot mess of nonsense that maybe some other poor soul will have to parse.
I noticed that js2xml (not xml2js) deals with arrays by placing the array items between \<item>\</item> tags
Would that be something that makes sense to you?
@Mesieu After looking at js2xml it left me a little bit puzzled: Why does it insert item
? How do I customize it? What if I want to customize it per-array?
To me the behaviour seems (specified but) arbitrary. I'd rather have a reasonably close correspondence between what xml2js parses to the JS object, which also makes serializing the JS object back to XML more or less unsurprising and does not require adding processing instructions (like "make this array items expand to <item/>
tags").
@Leonidas-from-XIV I'm convinced.
I have a need to encode a js array as xml. For an array like the following:
I am getting this XML:
Instead of the expected:
I suspect I may just be missing something but can’t figure out what it is after reading through the README. Hoping somebody can advise how to make this work.
Thanks!