shepmaster / sxd-document

An XML library in Rust
MIT License

The root is not the root element? #40

Open therealprof opened 7 years ago

therealprof commented 7 years ago

While trying to work with a parsed document, I found something strange and rather counter-intuitive: the root does not seem to be the root element of the parsed document, so when I iterate over the children of the root I actually see all top-level nodes of the document, including the root element itself.

let package = parser::load_xml(file);
let doc = package.as_document();
println!("{:?}", doc.root()); // Yields "Root"

for elem in doc.root().children() {
    println!("Got {:?}", elem);  // Prints 2 elements: A comment and the root element
}

shepmaster commented 7 years ago

Hmm, yes, it appears I've accidentally conflated some bits of terminology. You are correct that the result of Document::root is a node that corresponds to section 4.8 of the XML spec:

The document entity serves as the root of the entity tree

(emphasis mine). This is unfortunate, as I've also already used "document" to mean something else: the entry point to the allocation functions and owner of the DOM...

I wonder if there's anything preventing combining Document and Root...

therealprof commented 7 years ago

Actually, a document is the whole shebang, including all metadata like the XML version and encoding, and may also contain comments, while the root is supposed to be the only regular element directly in the document.

I also don't quite get why there's a distinction between Root and Element in the code as they're exactly the same with the only exception being that a document must have exactly one Root. ChildOfRoot kind of seems superfluous as well (unless I'm missing something) because ChildOfRoot should not be any different than ChildOfElement.

I thought the easiest change would be to change:

    pub fn root(self) -> Root<'d> {
        self.wrap_root(self.connections.root())
    }

to return an Element instead, but I failed miserably trying to implement that because a lot of code depends on the "root" being a Root.
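
Short of changing what root returns, the root element can be picked out of the root's children. The following is a minimal standalone sketch of that idea, not the sxd-document API: Element, ChildOfRoot and Root here are simplified stand-ins that only borrow the names used in this thread, and root_element and sample_root are hypothetical helpers.

```rust
// Simplified stand-ins for the types discussed in this thread.
// These are assumptions for illustration, not the sxd-document API.
#[derive(Debug, Clone, PartialEq)]
struct Element {
    name: String,
}

#[derive(Debug, Clone, PartialEq)]
enum ChildOfRoot {
    Element(Element),
    Comment(String),
}

// The "root" in the sense of this issue: the document node,
// which owns every top-level node, not just the root element.
#[derive(Debug, Default)]
struct Root {
    children: Vec<ChildOfRoot>,
}

impl Root {
    /// Hypothetical helper: returns the first element child, if any.
    /// In a well-formed document that is the root element.
    fn root_element(&self) -> Option<&Element> {
        self.children.iter().find_map(|child| match child {
            ChildOfRoot::Element(e) => Some(e),
            _ => None,
        })
    }
}

/// Builds the document node for a comment followed by `<thing />`.
fn sample_root() -> Root {
    Root {
        children: vec![
            ChildOfRoot::Comment("note".to_string()),
            ChildOfRoot::Element(Element { name: "thing".to_string() }),
        ],
    }
}

fn main() {
    let root = sample_root();
    println!("{} top-level nodes", root.children.len());
    if let Some(e) = root.root_element() {
        println!("root element: {}", e.name);
    }
}
```

Returning an Option keeps the helper honest about a document node that holds only comments, which is also why it cannot simply replace a root accessor that always succeeds.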

shepmaster commented 7 years ago

"ChildOfRoot should not be any different than ChildOfElement."

You'll note that ChildOfElement allows a Text node, which ChildOfRoot does not. That is, this text is not valid XML:

<?xml version="1.0"?>
hello
<thing />
world

therealprof commented 7 years ago

True, but Root at the moment is not actually the root element but what the Document should be, so I was trying to change the root fn to actually return the real root element.
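
The earlier point about Text being allowed under an element but not at the top level can be sketched with two enums and a fallible conversion between them. This is a simplified model under stated assumptions: the variants carry plain strings, only the node kinds named in this thread are included, and the TryFrom impl is hypothetical rather than something the library is known to provide.

```rust
use std::convert::TryFrom;

// Hypothetical simplified model: payloads are plain strings and only
// the node kinds mentioned in this thread are represented.
#[derive(Debug, Clone, PartialEq)]
enum ChildOfElement {
    Element(String),
    Text(String),
    Comment(String),
}

// Same as above, minus Text: text is not allowed at the top level.
#[derive(Debug, Clone, PartialEq)]
enum ChildOfRoot {
    Element(String),
    Comment(String),
}

impl TryFrom<ChildOfElement> for ChildOfRoot {
    type Error = String;

    // Every element child except Text is also a valid top-level node.
    fn try_from(child: ChildOfElement) -> Result<Self, Self::Error> {
        match child {
            ChildOfElement::Element(name) => Ok(ChildOfRoot::Element(name)),
            ChildOfElement::Comment(c) => Ok(ChildOfRoot::Comment(c)),
            ChildOfElement::Text(t) => {
                Err(format!("text {:?} is not allowed at the top level", t))
            }
        }
    }
}

fn main() {
    let ok = ChildOfRoot::try_from(ChildOfElement::Element("thing".to_string()));
    let bad = ChildOfRoot::try_from(ChildOfElement::Text("hello".to_string()));
    println!("{:?}", ok);
    println!("{:?}", bad);
}
```

With separate enums, a top-level text node is unrepresentable rather than rejected at runtime, which is one reason to keep the root node's type distinct from Element.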