amorlzu / pugixml

Automatically exported from code.google.com/p/pugixml
0 stars 0 forks source link

reason for non-copyable #166

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
My first encounter with pugixml was not that great.

Having the  xml_document class non-copyable makes it difficult and non 
intuitive to interact with the data.

Creating some xml in a class function and returning it, by copy, a trivial 
task, seem to be really difficult. Should not be IMHO, and don't need to be. I 
can't find any examples either on this.

I bet 99% of users bang their head on this..

I'll continue to try to figure it out...

Original issue reported on code.google.com by tottek on 24 Jul 2012 at 9:30

GoogleCodeExporter commented 9 years ago
Copying xml document is somewhat slow (or quite slow for large documents), 
somewhat dangerous (copying produces an unrelated documents; if you copy a 
document and destroy the old one then all node/attribute handles become 
invalid), and generally something people usually do by accident.

A perfect example of that is the issue 17, which you added a comment to - in 
that case copy is clearly an error. Another example would be an 
std::vector<xml_document> - again, pushing new element has a chance to trigger 
a reallocation, which leads to invalidation of all node/attribute references 
for all elements previously contained in the vector.

Moving an xml document can be made more or less safe, with certain internal 
changes - however, it's C++11 only, and the changes will cause other problems 
(currently creating an empty xml document does not perform heap allocations, 
which is important for some applications; I believe supporting move without 
removing this will be impossible).

Finally, I don't think I remember many people complaining about this issue.

Back to your problem at hand. You have two options:

1. Return a heap-allocated xml_document. This requires an extra heap allocation 
and - preferably - a smart pointer of some sort (std::auto_ptr or std/boost 
shared_ptr or std/boost unique_ptr for C++11). I.e.:

std::auto_ptr<pugi::xml_document> createDocument(...)
{
    std::auto_ptr<pugi::xml_document> doc(new pugi::xml_document);
    ...
    return doc;
}

2. Fill a provided document object passed by reference, i.e.:

void createDocument(pugi::xml_document& doc, ...)
{
    doc.reset(); // only needed if you don't use loading functions and expect non-empty document to be passed
    ...
}

Note that both techniques are standard ways in C++98 for dealing with either 
non-copyable objects or copyable objects that are quite expensive to copy (i.e. 
std::map<std::string, std::vector<int> >).

Original comment by arseny.k...@gmail.com on 25 Jul 2012 at 7:13

GoogleCodeExporter commented 9 years ago
Thanks for your reply.
In my application, and most xml document I come across, the amount of data is 
non-significant. For such, not being able to make copies is a nuisance, imho. 
Dealing with large xml data objects, the client should be aware, and will, when 
trying, that a copy will be costly. Then they can choose to not do it.

From my point of view, an xml document is just a list of strings, and some 
utility functions to organize them into a form that conforms with a standard. 
I.e, as a client, I don't care (and should not) about the internals. 

That is why the map example you mentioned is perfectly copyable. Most objects 
would be of no worth if they could not be copied. If expensive, so what? It 
only make sense to make an object noncopyable when the object represent 
something that is not supposed to be copy-able, like a printer or an audio card 
in a computer. 

In issue 17, its a compile error because the class fails/prevents to provide 
the user with a copy ctor. 

Why not providing the user with a shallow copy function (clone), that only 
copies handles to internal memory?

I'll test to use pugi as a transient xml formatter, i.e. strings -> xml 
formatted string. 

-totte

Original comment by tottek on 25 Jul 2012 at 8:10

GoogleCodeExporter commented 9 years ago
I would argue that document is not just a container, the usage policy is more 
complicated. To give you another example - you might say that a string stream 
is just a wrapper for a string with convenient formatting functionality - 
however, stringstreams in STL are non-copyable.

An issue 17 is a perfect example why having copy ctor disabled helps. It's a 
compiler error - but the code that is written there is incorrect. If it 
compiled fine, user would face huge invisible performance hits.

Unfortunately, it's impossible to provide a fast shallow copying function 
because of a conflicting requirement that empty xml document does no 
allocations.

If your application is in dire need of copying xml documents, you're free to 
implement a copy ctor/assignment operator:

class xml_document_copyable: public pugi::xml_document
{
public:
    xml_document_copyable() {}
    xml_document_copyable(const xml_document_copyable& rhs) { reset(rhs); }
    xml_document_copyable& operator=(const xml_document_copyable& rhs) { if (this != &rhs) reset(rhs); }
};

Finally, I'd like to note that apart from your issue, there was only one issue 
where this feature is requested - so the value is greatly exaggerated.

Original comment by arseny.k...@gmail.com on 2 Aug 2012 at 8:44

GoogleCodeExporter commented 9 years ago
Issue 225 has been merged into this issue.

Original comment by arseny.k...@gmail.com on 8 Feb 2014 at 11:12