Separate tree building/tree iteration/xpath queries from document encoders/decoders

GoogleCodeExporter commented 9 years ago

Pugixml's support for iterators is nice and gives a powerful API for building 
tree structures. Plus, XPath queries are so elegant for querying trees that 
I don't want to iterate over trees (of any kind) by hand anymore. I'm happy 
with that: the best of both worlds in a single lib :)

But for a few reasons I think it would be really nice to also import/export 
custom document trees in other encodings like JSON, YAML, msgpack and so 
on. Maybe you could expose the current XML document writers/loaders through an 
optional callback API that defaults to XML? That way it would not break 
compatibility at all :)

Well, those are my thoughts right now (does my request make sense?)
Keep up the good work dude :)

Original issue reported on code.google.com by rlyeh.no...@gmail.com on 17 Jul 2012 at 10:12

GoogleCodeExporter commented 9 years ago

Okay, let me clarify this to make sure I got it right.

What you'd like to see is some way of running XPath queries on hierarchical 
data that is sort of like XML (i.e. nodes with names and attributes plus text 
nodes), but does not come in an XML form.

There are two ways to go about it - either try to make the current XPath code 
generic with respect to tree structure (i.e. be able to have a custom JSON 
document model, and parametrize the XPath implementation with a JSON -> XML 
mapping), or load the (JSON) document data into an xml_document object and use 
it with XPath.

The first way is somewhat complicated - it requires restructuring the entire 
XPath implementation. I'd rather not go there...

The second way is possible right now - just use tree modification (i.e. 
append_child, set_value, etc.) to convert a JSON document to an XML document. 
The only difference from document.load_file is performance - you won't be able 
to build the tree externally like this at the same speed pugixml does it 
internally when parsing XML. If performance is a critical factor, I might be 
able to provide some lower-level API for creating a document from scratch using 
callbacks (i.e. supposing that the parser provides SAX-like callbacks, allow 
pugixml to build a tree by giving the parser suitable callback implementations).

Original comment by arseny.k...@gmail.com on 2 Aug 2012 at 8:53

GoogleCodeExporter commented 9 years ago

Yep, that's the idea.

I find it useful to decouple doc encoders/decoders from iteration/xpath, then
provide a few new callbacks to be able to load and save other
simpler-than-XML encodings like json.

I know I can convert from json to xml, then do my xpath queries and
tree handling, then save to a new xml, and then convert back to json. I
was just adding my two pennies, since xpath is not very widespread tbh and
your library handles it *FAST* like no other. JSON and web servers like
NodeJS are more than popular today and we have been using JSON in C++ quite
a lot lately, so I thought it would be nice and useful to have json and
other encodings in your library too. I cannot be the only one who thinks
this way : )

Keep up the good work!
- rlyeh

Original comment by rlyeh.no...@gmail.com on 3 Aug 2012 at 10:31

GoogleCodeExporter commented 9 years ago

Pretty sure this will never happen so I'll just close this...

1. Abstracting XPath (which is the main point of this issue, I guess) from the 
XML DOM is hard from both an implementation and an interface perspective 
(XPath depends heavily on XML specifics: the presence of attributes, 
position() behavior, namespaces, etc.)

2. JSON is actually sufficiently different from XML that using pugixml for JSON 
data does not necessarily make sense - you can settle on a common subset but 
that's not general. It is possible to apply approaches pugixml takes wrt 
modeling DOM and optimizing parsing to JSON, but that would really mean writing 
pugijson that shares ideas and maybe starts with pugixml codebase, but is 
otherwise a separate project.

Original comment by arseny.k...@gmail.com on 25 Aug 2014 at 5:46

Changed state: WontFix
