sanand0 / xmljson

xmljson converts XML into Python dictionary structures (trees, like in JSON) and vice-versa.
MIT License
121 stars, 33 forks

tail nodes not covered #34

Closed ariejdl closed 5 years ago

ariejdl commented 5 years ago

When parsing a node, the 'tail' text is lost, e.g. 'text 2' in <a><span>text</span>text 2</a>. It would be nice if the library used the .tail attribute of lxml nodes. By the way, because this feature is missing, I moved to the xmltodict library.
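
To illustrate what "tail" means here, a minimal sketch follows. It uses the standard library's xml.etree.ElementTree rather than lxml, on the assumption that its .text and .tail attributes behave the same way for this input; it does not call xmljson at all.

```python
import xml.etree.ElementTree as ET

# The example from the report: 'text 2' follows the closing </span> tag.
root = ET.fromstring('<a><span>text</span>text 2</a>')
span = root[0]

# .text is the text inside the element, before its first child.
span_text = span.text    # 'text'

# .tail is the text after the element's end tag, up to the next sibling
# or the parent's end tag. This is the part a converter can silently drop.
span_tail = span.tail    # 'text 2'

# <a> has no text before its first child, so its .text is None.
root_text = root.text

print(span_text, span_tail, root_text)
```

The tail belongs to the child element (span), not to the parent (a), so a converter that only reads .text on each node never sees it.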

sanand0 commented 5 years ago

@ariejdl -- you're right. Tail nodes are ignored.

This library focuses on converting XML to JSON (and vice versa) in line with standard conventions. I definitely see how preserving the tail (or head) would help. But there may be many ways of doing it, and I'd rather latch on to some standard people have defined.
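
As a rough illustration of one of those many ways, here is a sketch of a converter that keeps tail text. It is not xmljson's API or any published convention: the to_dict helper and the '$', '$tail' and 'children' key names are made up for this example, and it uses the standard library's xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

def to_dict(elem):
    """Convert an element to a dict, keeping both text and tail.

    Hypothetical layout: '$' holds the element's own text, 'children' is an
    ordered list, and '$tail' sits next to each child that has tail text.
    """
    node = {}
    if elem.text and elem.text.strip():
        node['$'] = elem.text
    children = []
    for child in elem:
        entry = {child.tag: to_dict(child)}
        # Store the tail alongside the child it follows, so order is kept.
        if child.tail and child.tail.strip():
            entry['$tail'] = child.tail
        children.append(entry)
    if children:
        node['children'] = children
    return node

result = to_dict(ET.fromstring('<a><span>text</span>text 2</a>'))
print(result)
```

Other layouts are just as plausible, for example a flat list that interleaves strings and child objects, which is why an agreed standard would be preferable to an ad hoc choice.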

Since you're exploring this space, have you seen others convert XML (or HTML) data of this kind into a JSON-like structure preserving the head or tail? How do they do it, please?

sanand0 commented 5 years ago

I just realized that this is a duplicate of #14 -- closing this thread. We can continue the conversation on #14 @ariejdl