tfussell / xlnt

:bar_chart: Cross-platform user-friendly xlsx library for C++11+
Other
1.43k stars 402 forks source link

Reference counting issues with xlnt::format_impl #723

Open YurkoFlisk opened 7 months ago

YurkoFlisk commented 7 months ago

There seem to be some problems regarding reference counting implementation for xlnt::format_impl which manifest themselves, particularly when loading an XLSX document where some cells share the same style. E.g. the following crashes in Debug mode in VS for a simple sheet containing two same-style cells A1 and B1:

#include <xlnt/xlnt.hpp>

int main()
{
    using namespace xlnt;

    workbook wb(path("test.xlsx"));
    worksheet sheet(wb.sheet_by_index(0));
    sheet[{1u, 1u}].fill(fill::solid(color::green())); // 1
    sheet[{2u, 1u}].fill(fill::solid(color::green())); // 2 (Crashes here)
    return 0;
}

test.xlsx Debugging showed that the crash happens on statement 2 due to reading deleted memory at line 447 here (which in this case for me was caused by length exception during attempted std::string copying):

https://github.com/tfussell/xlnt/blob/297b331435d6dee09bf89c8a5ad974b01f18039b/source/detail/implementations/stylesheet.hpp#L445-L455

called from here:

https://github.com/tfussell/xlnt/blob/297b331435d6dee09bf89c8a5ad974b01f18039b/source/styles/format.cpp#L108-L112

which is called from here:

https://github.com/tfussell/xlnt/blob/297b331435d6dee09bf89c8a5ad974b01f18039b/source/cell/cell.cpp#L682-L686

This happens because at statement 1 the format_impl object shared between the two cells (which pattern refers to above) gets destroyed during garbage_collect here:

https://github.com/tfussell/xlnt/blob/297b331435d6dee09bf89c8a5ad974b01f18039b/source/detail/implementations/stylesheet.hpp#L379-L408

This happens because iter->references was initially 1 instead of 2, so it gets 0 before call to garbage_collect above. In fact, all format_impl's have reference count 1 after loading because it is only incremented during format creation during stylesheet parsing:

https://github.com/tfussell/xlnt/blob/297b331435d6dee09bf89c8a5ad974b01f18039b/source/detail/serialization/xlsx_consumer.cpp#L2943-L2971

... but not when cells using it are added:

https://github.com/tfussell/xlnt/blob/297b331435d6dee09bf89c8a5ad974b01f18039b/source/detail/serialization/xlsx_consumer.cpp#L848-L901

So, this explains how crash happens due to use-after-free at statement 2 . However, there is a use-after-free even earlier, while still executing statement 1, which doesn't cause crash probably because it only changes a reference counter inside deleted format_impl. It happens when that reference counter is decremented for the second time (first time it happens in find_or_create during new_format.fill(...) in cell::fill which results in garbage collection explained above), now inside cell::format:

https://github.com/tfussell/xlnt/blob/297b331435d6dee09bf89c8a5ad974b01f18039b/source/cell/cell.cpp#L756-L765

where has_format returns true because cell_impl's pointer to (now-deleted) format_impl was never cleared. Note that it also increments reference counter of the new format_impl for the second time (first time it also happens in find_or_create).

So, summing up, if format references are supposed to represent how many cells use a particular format, they do so incorrectly now. At least, during loading from file the increments for these references should happen when cells use a format, not when that format is loaded. Also, double reference count increase/decrease is semantically wrong, even if it sometimes just cancels out, and can cause uncaught use-after-free, so there should be consistency about where reference counting happens. It can probably be resolved using the existing system, but I think the least error-prone solution is to get rid of manual reference counting and rewrite the code to use standard C++ smart pointers. I.e. std::shared_ptr<xlnt::format_impl> instead of xlnt::optional<xlnt::format_impl*> as a pointer to format_impl, while format_impls list would then be std::list<std::weak_ptr<xlnt::format_impl>> instead of std::list<xlnt::format_impl>. Though there would be no need to make format_impls iterator-stable, so std::vector could be considered instead of list for the latter. Also note that std::shared_prt will take care of destroying xlnt::format_impl, so when the new garbage_collect will "collect" an expired std::weak_ptr, it will just destroy the control block, and as for deallocation, it depends on whether a single chunk for control block and managed object was allocated (using std::allocate_shared; then it as a whole is deallocated when control block is with the destruction of the last std::weak_ptr) or separate ones (using std::shared_ptr constructors; then memory allocated for managed object is deallcoated when reference count in control block drops to zero and the control block is destroyed/deallocated when the last std::weak_ptr destroys).