KWARC / rust-libxml

Rust wrapper for libxml2
https://crates.io/crates/libxml
MIT License
76 stars 38 forks source link

Missing Node::to_string method #126

Closed anwaralameddin closed 8 months ago

anwaralameddin commented 8 months ago

The Document structure provides the method node_to_string that returns an XML representation of a node and its children. A similar method does not seem to be implemented directly on the Node structure. If I understand it correctly, it seems that one can get xmlDocPtr from _xmlNode. Can this be passed to xmlNodeDump used in node_to_string? If so, the Document argument is not needed here, and one should be able to implement Node::to_string.

Currently, I am passing the node and document around merely to be able to call node_to_string when handling errors when, in principle, I only need the node.

If such an addition is not possible or intended, could you please clarify any pitfall one may have when introducing this method using the member doc of the _xmlNode structure? If such a method needs to return an Option depending on whether the node is linked, it remains useful.

dginev commented 8 months ago

You could do that, but it creates a guarantee that Node will always hold a reference to a Document in the future. Currently that is only an internal implementation detail which is potentially subject to change.

And if you look at the read-only version of Node: https://github.com/KWARC/rust-libxml/blob/11f1d4ad2be7265fb59843f1ec4f1e656109e5e4/src/readonly/tree.rs#L16

You'll see that it doesn't carry any connection to its parent document. Neither does the original xmlNodePtr, if I remember correctly. But indeed Node currently keeps a ref to its owner document for bookkeeping.

The one possible gotcha is calling node.to_string() on a Node which was unlinked and then had its owning Document dropped. I think that has to be guarded in a new Node::to_string, i.e. one has to check if DocumentWeak has a usable pointer - and if not - report an error. So this kind of method may need to return Result<String> rather than String.

But if all of that sounds reasonable, I am open to adding it if it makes things more convenient.

anwaralameddin commented 8 months ago

Thank you for your explanation. I am considering the member doc of the libxml2 _xmlNode struct rather than the reference to a Document in Node, more in line with the below:

let node_ptr = self.node_ptr();
let doc_ptr = if node_ptr.is_null() {
  ptr::null_mut()
} else {
  (*node_ptr).doc
};
// dump the node
xmlNodeDump(
  buf, doc_ptr, node_ptr, 1, // level of indentation
  0, /* disable formatting */
);

Unless I am missing something, I think this way, it could be implemented for both Node and RoNode.

anwaralameddin commented 8 months ago

Thank you for pointing out possible issues. Indeed, when the owning document is dropped, tests for such an implementation fail. When it returns a string, it's either the empty string or a representation of an empty element.