tfussell / xlnt

:bar_chart: Cross-platform user-friendly xlsx library for C++11+
Other
1.45k stars 406 forks source link

Fixing memory leaks and reducing memory usage with unique_ptr #666

Open ThibaultDECO opened 1 year ago

ThibaultDECO commented 1 year ago

Fixing memory leaks and reducing memory usage by switching to a unique_ptr. This helps improves performance as discussed in issue #648.

ThibaultDECO commented 1 year ago

I have no idea why it doesn't pass the checks, it works perfectly fine for me...

ThibaultDECO commented 1 year ago

Fixed the issue that prevented to pass the test.

tfussell commented 1 year ago

I'm sorry for the slow follow-up here. I finally got around to testing your change using the three benchmarks. Here are my results (all benchmark values are in milliseconds):

benchmark step PR master improvement
styles generate formats 24.336 17.165 -42%
styles set values 0.105 0.097 -8%
styles save workbook 0.125 0.115 -9%
styles load workbook 0.833 0.779 -7%
styles read values 0.009 0.008 -13%
spreadsheet-load load large 1866.758 1847.345 -1%
spreadsheet-load load very large 4091.759 4208.674 3%
spreadsheet-load save large 1549.886 1959.543 21%
spreadsheet-load save very large 4172.468 5617.708 26%
writer 10000x1 61.9867 58.5843 -6%
writer 1000x10 49.8517 49.5619 -1%
writer 100x100 48.2282 48.0404 0%
writer 10x1000 48.7503 45.7044 -7%
writer 1x10000 64.9189 62.4298 -4%

As you can see, there's a good improvement when using your code in saving a large file (5.6s -> 4.2s), but it comes at the cost of slower style/format creation. Formats are large data structures so I guess there's some optimization being lost switching to unique_ptr.

As it stands, I like that the code is cleaner, but it doesn't seem to have a large overall performance impact in either direction. Do you know if there were memory leaks caused by the optional class as it was?

One final note is that std::make_unique isn't available in C++11. It might be time to rethink targeting a newer standard like C++14, but for now we'll need to get rid of make_unique. I think optional_ptr.reset(new T(*other.optional_ptr)); should be equivalent.

This isn't totally related to the PR, but I'm also curious about the possibility of using some more battle-tested header-only optional implementations such as https://github.com/martinmoene/optional-lite or https://github.com/TartanLlama/optional/pulls. These mostly follow the C++17 optional API and would simplify a future migration.

ThibaultDECO commented 1 year ago

Thank you for the benchmark, very interesting!

I believe there were memory leaks caused by the optional class as it was. When trying to load very large files (160MB with 800,000 rows x 80 columns, filled with doubles), the program takes forever to load, using more and more memory until it crashes, thus failing to load the file. I have 16GB of RAM on my computer. When switching to smart pointers, the progam was able to load my file. I believe the possibility to open very large files outweigh the possible loss of a few milliseconds on style/format creation, in my view.

The other reason I did this is that switching to shared pointers in the future could further decrease memory usage when files contain duplicate data which could be grouped under a shared pointer.