martinblech / xmltodict

Python module that makes working with XML feel like you are working with JSON
MIT License

More friendly "@" and "#text", check for existing children in postprocess()? #326

Open ferventgeek opened 1 year ago

ferventgeek commented 1 year ago

My project is a wrapper on a clunky XML interface. Not SOAP, but you can see it from there. I'm trying to make it easy for users to access data using dot notation, but "@" is causing issues for them. foo.bar['@baz'] does work, but foo.bar.baz would be cleaner. Better, it would hide the XML source semantics from "modern" devs who decry such shenanigans.

The code below can convert attributes and text fields to something more normal, but I'd like to add a check to see whether a child property already exists with the intended key, and keep the "@" prefix only when it's a dupe. Am I missing an obvious setting?

<foo>
   <bar baz="Hello">world!</bar>
</foo> 

Would produce

"foo": {
   "bar": {
      "baz": "Hello",
      "value": "world!"
   }
}

And in the case of "duped" attribute/children:

<foo>
   <bar baz="Hello">
      <baz>What</baz>
      <value>123</value>
      world!
   </bar>
</foo> 

Would produce

"foo": {
   "bar": {
      "@baz": "Hello",
      "baz": "What",
      "value": 123,
      "#text": "world!"
   }
}

I realize this is more complex than it sounds: order of parsing, for example. I'm currently doing it by iterating over the map and creating a new one after the fact, but if it's easy to skip that step, all the better.
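For reference, here is a minimal sketch of that after-the-fact pass. It assumes the default "@" and "#text" markers and takes plain dicts shaped like xmltodict's output (hand-written here, so the snippet runs without xmltodict installed). The name friendly_keys and its parameters are hypothetical and not part of xmltodict. Child element names always win; an attribute or text key is renamed only when the friendlier name is still free. Values are left as strings, so no numeric conversion is attempted.

```python
def friendly_keys(node, attr_prefix="@", cdata_key="#text", text_name="value"):
    """Return a copy of `node` with prefixes dropped where no key collides."""
    if isinstance(node, list):
        return [friendly_keys(i, attr_prefix, cdata_key, text_name) for i in node]
    if not isinstance(node, dict):
        return node

    # Names already claimed by child elements; these always keep their key.
    taken = {k for k in node if k != cdata_key and not k.startswith(attr_prefix)}
    result = {}
    for key, value in node.items():
        if key == cdata_key:
            new_key = text_name
        elif key.startswith(attr_prefix):
            new_key = key[len(attr_prefix):]
        else:
            # Ordinary child element: recurse and keep the key unchanged.
            result[key] = friendly_keys(value, attr_prefix, cdata_key, text_name)
            continue
        if new_key in taken:
            new_key = key  # collision: fall back to the original prefixed key
        taken.add(new_key)
        result[new_key] = value
    return result


# Hand-written stand-ins for the parsed versions of the two XML samples above.
simple = {"foo": {"bar": {"@baz": "Hello", "#text": "world!"}}}
duped = {"foo": {"bar": {"@baz": "Hello", "baz": "What",
                         "value": "123", "#text": "world!"}}}

print(friendly_keys(simple))
print(friendly_keys(duped))
```

Because the whole dict is available before any key is renamed, this sidesteps the parse-order problem, at the cost of the extra copy.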

Test function

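def postprocessor(path, key, value):
    # Strip the "@" prefix that marks attribute keys.
    if key.startswith("@"):
        return key[1:], value
    # Rename the text-node key to something friendlier.
    if key == "#text":
        return "value", value
    return key, value

KeithProctor commented 1 year ago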

Yes, how do you keep the "@" and "#" characters from being applied during the conversion?