shepmaster / sxd-document

An XML library in Rust
MIT License
154 stars 36 forks source link

Rename Root::append_child to Root::set_child? #56

Open dvtomas opened 6 years ago

dvtomas commented 6 years ago
    {
        let doc = package.as_document();
        let hello = doc.create_element("hello");
        let world = doc.create_element("world");
        doc.root().append_child(hello);
        doc.root().append_child(world);

        let stdout = &mut io::stdout();
        format_document(&doc, stdout).ok().expect("unable to output XML");
        stdout.flush();
    }

expected output: <?xml version='1.0'?><hello/><world/> actual output as of v. 0.2.6. : <?xml version='1.0'?><world/>

Is this a bug in the code or in my expectations?

dvtomas commented 6 years ago

To answer myself: XML must have exactly one root element, so the behavior of the code is probably correct. Perhaps it would make sense to rename append_child to set_root to avoid confusion?

shepmaster commented 6 years ago

Yup, there can only be a single element at the top. However, there can be multiple other types of nodes, notably comment and processing instruction nodes.

The name of the method is indeed unfortunate. At the very least, the documentation should be updated to clarify that subsequent elements take the place of previous ones.

I'm a little loathe to actually change the name as it is correct for 2/3 types you can append...

dvtomas commented 6 years ago

Ahh, makes sense now, thanks for clarification. Yeah, I guess just explaining this in the function documentation would do the job.

dvtomas commented 6 years ago

On a related note, I noticed that there is no newline after the <?xml ...> declaration from a format_document output. I assume that it does not matter semantically, but is it possible to somehow add it just for human viewing pleasure, prehaps by doing some kind of append_child(newline node) or something? Sorry if this is a dumb question, I'm not an expert on XML specification...

shepmaster commented 6 years ago

it does not matter semantically

That's correct. It is better from a storage / transfer point of few — the less bytes the better.

somehow add it just for human viewing pleasure

You can always write the newline yourself to whatever output you are using. In your example, it's easy:

let stdout = &mut io::stdout();
format_document(&doc, stdout).ok().expect("unable to output XML");
println!();
stdout.flush();

There's been some discussion about having a "pretty" mode for format_document that would also indent nested children, which this could conceivably also do.

dvtomas commented 6 years ago

Aha, I didn't even realize the output is not pretty-printed. When I started toying with sxd-document, I parsed an already pretty-printed document first and then used the writer to write it again. When I saw the pretty printed output, I thought that it was a writer feature, I didn't realize it was just preserving the whitespace in the input (which probably is the case).

For my use-case, a pretty printed output would be very useful, but that belongs to another issue.

As far as I'm concerned about this issue, just stating the potentially confusing behavior in the documentation would be enough for me.

But, if you really wished to make some changes, I think type-wise it would be probably more correct to change

pub struct Root {
    children: Vec<ChildOfRoot>,
}

pub enum ChildOfRoot {
    Element(*mut Element),
    Comment(*mut Comment),
    ProcessingInstruction(*mut ProcessingInstruction),
}

into

pub struct Root {
    element: *mut Element
    children: Vec<ChildOfRoot>,
}

pub enum ChildOfRoot {
    Comment(*mut Comment),
    ProcessingInstruction(*mut ProcessingInstruction),
}

and modify the accessors/mutators accordingly, but the questions is, whether the benefits are worth the troubles (breaking backwards compatibility, changing API).

shepmaster commented 6 years ago

The problem with your proposed structure is that you are allowed to put comments/PIs before and after the element proper. All of these are valid:

<!-- one --><!-- two --><thing />
<!-- one --><thing /><!-- two -->
<thing /><!-- one --><!-- two -->
dvtomas commented 6 years ago

OK, then something like

pub struct Root {
   pre_children: Vec<ChildOfRoot>, 
   element: *mut Element
   post_children: Vec<ChildOfRoot>,
}

would (I hope) match the XML model 1:1. If it's worth the added complexity is hard to say, I don't have an opinion on that.