michaelrsweet / mxml

Tiny XML library.
https://www.msweet.org/mxml
Apache License 2.0

Add context, preprocess callback, and whitespace callbacks #66

Closed michaelrsweet closed 7 years ago

michaelrsweet commented 16 years ago

Version: -feature Original reporter: Michael Sweet

For the C++ interface for MiniXML I am working on, I need to be able to set the user_data value for a node from within the load callback. When calling mxmlLoadFile, for example, this is not possible. A workaround is to pass in a top node (say, by calling mxmlNewXML and setting its user_data field), but then the top-level encoding from the file is repeated.

This would be easiest to do if a user data argument could be added to the load callback. Thus, the function would be:

cb(node,context)

I could then use the context to set the user_data field in the nodes while I am parsing them. Could this be added?

michaelrsweet commented 16 years ago

Original reporter: Michael Sweet

(Making this STR public - this is NOT security related, and this is a public, open source project...)

This isn't something we can change for 2.x, as doing so will seriously break binary compatibility. When we start planning for Mini-XML 3.0, I'll make sure this gets added to the callback.

That said, the type callback is only called for element nodes, so it won't help for setting the user_data pointer for data nodes. If you just need to associate the loaded nodes with a user_data pointer after the fact, wrap mxmlLoad* with a method that walks the node tree and sets all of the user_data pointers.
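The "walk the tree after loading" wrapper suggested above could look roughly like the following. This is a minimal sketch, not Mini-XML code: node_t, walk_next, and set_all_user_data are hypothetical stand-ins so the example compiles on its own. In Mini-XML itself, mxmlWalkNext and (in later 2.x releases) mxmlSetUserData would presumably fill these roles.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for mxml_node_t; only the fields this sketch needs. */
typedef struct node_s {
    struct node_s *parent;     /* parent node, NULL at the root */
    struct node_s *child;      /* first child */
    struct node_s *next;       /* next sibling */
    void          *user_data;  /* application pointer */
} node_t;

/* Pre-order successor of `node` inside the subtree rooted at `top`,
 * or NULL once the whole subtree has been visited. */
node_t *walk_next(node_t *node, node_t *top)
{
    if (node->child)
        return node->child;

    while (node && node != top) {
        if (node->next)
            return node->next;
        node = node->parent;
    }
    return NULL;
}

/* Set user_data on `top` and every node below it; returns the node count. */
size_t set_all_user_data(node_t *top, void *data)
{
    size_t count = 0;

    assert(top != NULL);
    for (node_t *n = top; n != NULL; n = walk_next(n, top)) {
        n->user_data = data;
        count++;
    }
    return count;
}
```

A wrapper would call the loader first and then run set_all_user_data on the returned tree, at the cost of a second pass over every node.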

michaelrsweet commented 16 years ago

Original reporter:

Understood. The context would be used for a virtual callback within the top-level C++ object wrapping a MiniXML node so you can overload the function in the object rather than supplying a C-function callback.

michaelrsweet commented 16 years ago

Original reporter:

After working with MiniXML for quite a bit, it turns out that what is really needed is a callback for all nodes with a context. The issue is that non-element nodes cannot easily be preprocessed. My only alternative for setting the user_data field right now is a brute-force traversal of all nodes to set the pointers.

The same issue occurs with mxmlSaveFile. A callback that gets all nodes (not just element nodes) is needed to properly format output. If you think this is doable, should I make a separate request for that enhancement?

michaelrsweet commented 16 years ago

Original reporter: Michael Sweet

I don't think I want to add a save callback for all nodes. The whole purpose of Mini-XML is to be small and fast.

You might be able to use the custom callback support for both loading and saving data.

Longer term, just passing a user_data pointer to the mxmlLoad and mxmlSave functions should accomplish what you want.

michaelrsweet commented 16 years ago

Original reporter:

Here is what I needed to add for extra save callback states to be able to duplicate what Internet Explorer displays when you show an XML file in it (IE will format the file regardless of what it really looks like in the .xml file):

Add MXML_WS_ATTRIBUTE_WRAP: allows an indent to be added when attributes wrap.
Add MXML_WS_NODE_WRAP: allows an indent to be added when nodes wrap.
Add MXML_WS_XML_DECLARATION: callback for what to do after the XML declaration (your default method of adding a newline actually makes this moot, but adding this provides completeness).
Add MXML_WS_BEFORE_VALUE: allows indenting for a node with mixed values and elements.
Add MXML_WS_AFTER_VALUE: allows a space, newline, or nothing to be added, depending on the value and what other node types are siblings.
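To make the proposal concrete, here is a rough sketch of a whitespace callback that handles the extra states. The first four constants mirror the ones Mini-XML 2.x already defines; the numeric values given to the five proposed states are assumptions, as is the callback signature: the real save callback receives the node, while this sketch takes a nesting depth so that it compiles without the library.

```c
#include <assert.h>
#include <stddef.h>

enum {
    /* Existing Mini-XML 2.x positions. */
    MXML_WS_BEFORE_OPEN = 0,
    MXML_WS_AFTER_OPEN,
    MXML_WS_BEFORE_CLOSE,
    MXML_WS_AFTER_CLOSE,
    /* Proposed positions; the values here are hypothetical. */
    MXML_WS_ATTRIBUTE_WRAP,
    MXML_WS_NODE_WRAP,
    MXML_WS_XML_DECLARATION,
    MXML_WS_BEFORE_VALUE,
    MXML_WS_AFTER_VALUE
};

#define MAX_DEPTH 8
static const char tabs[MAX_DEPTH + 1] = "\t\t\t\t\t\t\t\t";

/* Return a string of `depth` tabs, capped at MAX_DEPTH. */
const char *indent_for(int depth)
{
    if (depth > MAX_DEPTH)
        depth = MAX_DEPTH;
    return tabs + (MAX_DEPTH - depth);
}

/* Hypothetical callback: returns the whitespace to emit at position
 * `where` for a node nested `depth` levels deep, or NULL for none. */
const char *whitespace_cb(int depth, int where)
{
    assert(depth >= 0);

    switch (where) {
    case MXML_WS_BEFORE_OPEN:
    case MXML_WS_BEFORE_VALUE:
        return indent_for(depth);        /* indent to the node's level */
    case MXML_WS_ATTRIBUTE_WRAP:
    case MXML_WS_NODE_WRAP:
        return indent_for(depth + 1);    /* continuation lines go one deeper */
    case MXML_WS_AFTER_CLOSE:
    case MXML_WS_AFTER_VALUE:
    case MXML_WS_XML_DECLARATION:
        return "\n";
    default:
        return NULL;                     /* no whitespace at other positions */
    }
}
```

The formatting policy shown (tabs, one extra level for wrapped lines) is only one possible choice; the point is that each proposed state gives the callback a separate place to decide.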

I understand what you are saying about MiniXML being small and fast, but with anything but the simplest XML files, the human readability for files created with mxmlSaveFile or mxmlSaveStr is not very good. Adding these additional callback states allows complete flexibility to make the output from those functions (near) perfectly reproduce the XML that was loaded.

michaelrsweet commented 16 years ago

Original reporter:

And I agree in principle with the user_data context for mxmlLoad and mxmlSave. That works perfectly, except that if you do not add a callback for the load functions that will process all nodes, then you cannot easily set the user_data pointer for values (for elements, this can be done in the callback for setting the data type).

Having both a context pointer and adding a callback to touch each node is really the only way to have a C++ wrapper for any node type that has a back pointer to the main document (at least, from what I know from your code).

Here is an example of what I mean:

    mxml_node_t *                           /* O - First node or NULL if the file could not be read. */
    mxmlLoadFile(mxml_node_t *top,          /* I - Top node */
                 FILE *fp,                  /* I - File to read from */
                 mxml_load_cb_t cb,         /* I - Callback function or MXML_NO_CALLBACK */
                 mxml_process_node_cb_t process_node_cb,
                                            /* I - Process node callback function or MXML_NO_CALLBACK */
                 void *process_node_context /* I - Process node context pointer */
                 )

The process_node_cb is called for all elements and values, and can use process_node_context to set each node's user_data field. The only other alternative for setting user_data on value nodes is to re-walk the entire document and set it in a second pass. That seems really inefficient when the extra callback solves the problem so smoothly.
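The intended flow might be sketched as follows. Everything here is hypothetical: node_t stands in for mxml_node_t, and loader_t and loader_emit stand in for whatever internal step the loader would run each time it finishes creating a node. None of it is existing Mini-XML API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the Mini-XML node types. */
typedef enum { NODE_ELEMENT, NODE_TEXT } node_type_t;

typedef struct node_s {
    node_type_t type;
    void       *user_data;
} node_t;

/* Proposed callback shape: invoked for every node, with a caller context. */
typedef void (*process_node_cb_t)(node_t *node, void *context);

/* Callback and context as they would be handed to the load function. */
typedef struct {
    process_node_cb_t process_node_cb;       /* NULL means "no callback" */
    void             *process_node_context;
} loader_t;

/* What the loader would do after creating any node, element or value. */
void loader_emit(const loader_t *loader, node_t *node)
{
    assert(loader != NULL && node != NULL);

    if (loader->process_node_cb)
        loader->process_node_cb(node, loader->process_node_context);
}

/* Example callback: store a back pointer to the owning document. */
void attach_document(node_t *node, void *context)
{
    node->user_data = context;
}
```

With this shape, a C++ wrapper could pass its document object as the context and have every node, including text values, tagged during the single load pass.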

michaelrsweet commented 13 years ago

Original reporter: Michael Sweet

Moving this to "Future"; the scope of this really calls for something like libxml2, which offers most of this functionality already. Not sure we'll ever go here...

michaelrsweet commented 7 years ago

Nope, not doing this. Although the future callback-driven API will provide some of this, I don't want Mini-XML to be a replacement for libxml...