There seem to be some problems regarding reference counting implementation for xlnt::format_impl which manifest themselves, particularly when loading an XLSX document where some cells share the same style. E.g. the following crashes in Debug mode in VS for a simple sheet containing two same-style cells A1 and B1:
test.xlsx
Debugging showed that the crash happens on statement 2 due to reading deleted memory at line 447 here (which in this case for me was caused by length exception during attempted std::string copying):
This happens because at statement 1 the format_impl object shared between the two cells (which pattern refers to above) gets destroyed during garbage_collect here:
This happens because iter->references was initially 1 instead of 2, so it gets 0 before call to garbage_collect above. In fact, all format_impl's have reference count 1 after loading because it is only incremented during format creation during stylesheet parsing:
So, this explains how crash happens due to use-after-free at statement 2 . However, there is a use-after-free even earlier, while still executing statement 1, which doesn't cause crash probably because it only changes a reference counter inside deleted format_impl. It happens when that reference counter is decremented for the second time (first time it happens in find_or_create during new_format.fill(...) in cell::fill which results in garbage collection explained above), now inside cell::format:
where has_format returns true because cell_impl's pointer to (now-deleted) format_impl was never cleared. Note that it also increments reference counter of the new format_impl for the second time (first time it also happens in find_or_create).
So, summing up, if format references are supposed to represent how many cells use a particular format, they do so incorrectly now. At least, during loading from file the increments for these references should happen when cells use a format, not when that format is loaded. Also, double reference count increase/decrease is semantically wrong, even if it sometimes just cancels out, and can cause uncaught use-after-free, so there should be consistency about where reference counting happens. It can probably be resolved using the existing system, but I think the least error-prone solution is to get rid of manual reference counting and rewrite the code to use standard C++ smart pointers. I.e. std::shared_ptr<xlnt::format_impl> instead of xlnt::optional<xlnt::format_impl*> as a pointer to format_impl, while format_impls list would then be std::list<std::weak_ptr<xlnt::format_impl>> instead of std::list<xlnt::format_impl>. Though there would be no need to make format_impls iterator-stable, so std::vector could be considered instead of list for the latter. Also note that std::shared_prt will take care of destroying xlnt::format_impl, so when the new garbage_collect will "collect" an expired std::weak_ptr, it will just destroy the control block, and as for deallocation, it depends on whether a single chunk for control block and managed object was allocated (using std::allocate_shared; then it as a whole is deallocated when control block is with the destruction of the last std::weak_ptr) or separate ones (using std::shared_ptr constructors; then memory allocated for managed object is deallcoated when reference count in control block drops to zero and the control block is destroyed/deallocated when the last std::weak_ptr destroys).
There seem to be some problems regarding reference counting implementation for
xlnt::format_impl
which manifest themselves, particularly when loading an XLSX document where some cells share the same style. E.g. the following crashes in Debug mode in VS for a simple sheet containing two same-style cells A1 and B1:test.xlsx Debugging showed that the crash happens on statement
2
due to reading deleted memory at line447
here (which in this case for me was caused by length exception during attemptedstd::string
copying):https://github.com/tfussell/xlnt/blob/297b331435d6dee09bf89c8a5ad974b01f18039b/source/detail/implementations/stylesheet.hpp#L445-L455
called from here:
https://github.com/tfussell/xlnt/blob/297b331435d6dee09bf89c8a5ad974b01f18039b/source/styles/format.cpp#L108-L112
which is called from here:
https://github.com/tfussell/xlnt/blob/297b331435d6dee09bf89c8a5ad974b01f18039b/source/cell/cell.cpp#L682-L686
This happens because at statement
1
theformat_impl
object shared between the two cells (whichpattern
refers to above) gets destroyed duringgarbage_collect
here:https://github.com/tfussell/xlnt/blob/297b331435d6dee09bf89c8a5ad974b01f18039b/source/detail/implementations/stylesheet.hpp#L379-L408
This happens because
iter->references
was initially1
instead of2
, so it gets0
before call togarbage_collect
above. In fact, allformat_impl
's have reference count1
after loading because it is only incremented during format creation during stylesheet parsing:https://github.com/tfussell/xlnt/blob/297b331435d6dee09bf89c8a5ad974b01f18039b/source/detail/serialization/xlsx_consumer.cpp#L2943-L2971
... but not when cells using it are added:
https://github.com/tfussell/xlnt/blob/297b331435d6dee09bf89c8a5ad974b01f18039b/source/detail/serialization/xlsx_consumer.cpp#L848-L901
So, this explains how crash happens due to use-after-free at statement
2
. However, there is a use-after-free even earlier, while still executing statement1
, which doesn't cause crash probably because it only changes a reference counter inside deletedformat_impl
. It happens when that reference counter is decremented for the second time (first time it happens infind_or_create
duringnew_format.fill(...)
incell::fill
which results in garbage collection explained above), now insidecell::format
:https://github.com/tfussell/xlnt/blob/297b331435d6dee09bf89c8a5ad974b01f18039b/source/cell/cell.cpp#L756-L765
where
has_format
returnstrue
becausecell_impl
's pointer to (now-deleted)format_impl
was never cleared. Note that it also increments reference counter of the newformat_impl
for the second time (first time it also happens infind_or_create
).So, summing up, if format references are supposed to represent how many cells use a particular format, they do so incorrectly now. At least, during loading from file the increments for these references should happen when cells use a format, not when that format is loaded. Also, double reference count increase/decrease is semantically wrong, even if it sometimes just cancels out, and can cause uncaught use-after-free, so there should be consistency about where reference counting happens. It can probably be resolved using the existing system, but I think the least error-prone solution is to get rid of manual reference counting and rewrite the code to use standard C++ smart pointers. I.e.
std::shared_ptr<xlnt::format_impl>
instead ofxlnt::optional<xlnt::format_impl*>
as a pointer toformat_impl
, whileformat_impls
list would then bestd::list<std::weak_ptr<xlnt::format_impl>>
instead ofstd::list<xlnt::format_impl>
. Though there would be no need to makeformat_impls
iterator-stable, sostd::vector
could be considered instead of list for the latter. Also note thatstd::shared_prt
will take care of destroyingxlnt::format_impl
, so when the newgarbage_collect
will "collect" an expiredstd::weak_ptr
, it will just destroy the control block, and as for deallocation, it depends on whether a single chunk for control block and managed object was allocated (usingstd::allocate_shared
; then it as a whole is deallocated when control block is with the destruction of the laststd::weak_ptr
) or separate ones (usingstd::shared_ptr
constructors; then memory allocated for managed object is deallcoated when reference count in control block drops to zero and the control block is destroyed/deallocated when the laststd::weak_ptr
destroys).