senjuhashirama / pugixml

Automatically exported from code.google.com/p/pugixml
0 stars 0 forks source link

Permit allocators to throw #226

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
First, handling out-of-memory conditions in code that builds DOMs by checking 
every operation is *really* awkward. It's even more awkward due to the fact 
that pugi itself does not always forward the error-condition to the caller. For 
example:

    PUGI__FN xml_node xml_node::append_child(const char_t* name_)
    {
        xml_node result = append_child(node_element);

        result.set_name(name_);

        return result;
    }

If set_name tries to allocate memory and fails, the only way I can detect this 
is to write something like

    n = a.append_child(str);
    assert(n && strcmp(n.name(), str) == 0);

Even if all such functions were correctly forwarding the error-condition, it 
would be still awkward to check them all in code like:

    xml_document xml;
    xml_node node = xml.append_child("XYZ");
    node.append_child("XYZ0").text() = value0;
    node.append_child("XYZ1").text() = value1;
    node.append_child("XYZ2").text() = value2;
    node.append_child("XYZ3").text() = value3;
    node.append_child("XYZ4").text() = value4;
    node.append_child("XYZ5").text() = value5;
    node.append_child("XYZ6").text() = value6;
    xml.save_file(filename);

Obviously, in some robust code, I prefer to detect out-of-memory and rollback 
rather than overwriting a file with a DOM with missing nodes.

Now, the easiest way to do this would be to override the allocator with one 
that throws bad_alloc on failure. However, currently the documentation says 
that "throwing an exception from allocation function results in undefined 
behavior". I took some effort inspecting the code, and it appears that it is 
not that hard to make it exceptio-safe, providing at least the basic guarantee. 
I attach a patch that, unless I missed something, makes it possible to override 
the allocator with a throwing one.

Original issue reported on code.google.com by ybungalo...@gmail.com on 27 Feb 2014 at 1:44

Attachments:

GoogleCodeExporter commented 9 years ago

Original comment by arseny.k...@gmail.com on 10 Aug 2014 at 9:51

GoogleCodeExporter commented 9 years ago
I applied the part of this patch that deals with DOM modification.

Things that still need to happen (for future reference when I get a bit more 
time):

1. Exceptions caused during parsing can lead to:
A. allocator state being discarded => pages leaked ("update allocator state" 
comment) - addressed by the patch
B. document buffer not tracked in the document => buffer leaked ("grab onto 
buffer" comment)

2. Exceptions caused during buffer allocation can lead to:
A. file being left open ("fclose") - addressed by the patch
B. original document buffer not deleted if encoding conversion is performed 
("delete original buffer" comment)

3. Exceptions inside parser or buffer allocation in append_buffer can lead to:
A. extra buffer element and/or the buffer data leaking ("add extra buffer to 
the list" comment)

I *think* that's it but I'll need to do another pass over the code to make sure.

Original comment by arseny.k...@gmail.com on 11 Aug 2014 at 12:22

GoogleCodeExporter commented 9 years ago
I attach an updated patch that fixes the rest of the places you mentioned. Note 
that it is untested.

Also it appears to be possible to simplify further. Specifically, the 
`own_buffer` can itself be a part of the `extra` list, which would cost one 
paged allocation. Don't even know why it was a part of `xml_document` before. 
Historical reasons?

Original comment by ybungalo...@gmail.com on 11 Aug 2014 at 9:03

Attachments:

GoogleCodeExporter commented 9 years ago
This issue was moved to GitHub: https://github.com/zeux/pugixml/issues/17

Original comment by arseny.k...@gmail.com on 26 Oct 2014 at 9:09