MathNya / umya-spreadsheet

A pure Rust library for reading and writing spreadsheet files
MIT License
238 stars 41 forks

performance improvement suggestion #165

Open Andrew-Lyu opened 6 months ago

Andrew-Lyu commented 6 months ago

umya-spreadsheet is so far the only crate I can find with both the read and write features I need. I hope my suggestions can make this library better.

Below is a comparison from my small program. The Excel file is about 1 MB with 3 sheets; the biggest sheet has about 1000 rows and 110 columns.

calamine + rust_xlsxwriter: less than 1 s, about 100 MB of memory.
umya-spreadsheet: more than 6 s, more than 1 GB of memory.

My program only processes data and formatting; no charts, images, etc.

So here are my suggestions for improving performance. For memory, I saw that every Cell has its own Style instance. A Cell could instead hold only a reference to a shared Style, as Excel itself does, which should reduce memory usage a lot. For speed, I see you use a HashMap for the cell collection. Since the cell row and column numbers are just integers, you could use a nohash hasher instead.
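To make both ideas concrete, here is a minimal sketch using only the standard library. It is not umya-spreadsheet's actual code: `Style`, `StylePool`, `Cell`, `CellMap`, `cell_key` and `IdentityHasher` are all hypothetical names, and `IdentityHasher` is a hand-rolled stand-in for what a nohash-style hasher does. It assumes the row and column are packed into a single `u64` key.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Hypothetical, simplified style record (not the library's real `Style`).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct Style {
    bold: bool,
    number_format: String,
}

/// Stores each distinct style once and hands out small integer ids.
#[derive(Default)]
struct StylePool {
    styles: Vec<Style>,
    index: HashMap<Style, u32>,
}

impl StylePool {
    /// Returns the id of an equal style if one exists, otherwise stores it.
    fn intern(&mut self, style: Style) -> u32 {
        if let Some(&id) = self.index.get(&style) {
            return id;
        }
        let id = self.styles.len() as u32;
        self.styles.push(style.clone());
        self.index.insert(style, id);
        id
    }

    fn get(&self, id: u32) -> &Style {
        &self.styles[id as usize]
    }
}

/// A cell keeps a 4-byte style id instead of a full `Style` value.
struct Cell {
    value: String,
    style_id: u32,
}

/// Hasher that passes a `u64` key through unchanged (assumption: keys are
/// always `u64`, so only `write_u64` is ever called).
#[derive(Default)]
struct IdentityHasher(u64);

impl Hasher for IdentityHasher {
    fn finish(&self) -> u64 {
        self.0
    }
    fn write(&mut self, _bytes: &[u8]) {
        unimplemented!("IdentityHasher only supports u64 keys");
    }
    fn write_u64(&mut self, n: u64) {
        self.0 = n;
    }
}

/// Cell collection keyed by a packed (row, column) integer.
type CellMap = HashMap<u64, Cell, BuildHasherDefault<IdentityHasher>>;

/// Packs row and column into one key: row in the high 32 bits, column in the low.
fn cell_key(row: u32, col: u32) -> u64 {
    ((row as u64) << 32) | col as u64
}
```

With this layout, a sheet whose cells share a handful of styles stores each style once in the pool. Whether skipping the hash computation is actually faster depends on how the keys are distributed, so it would need benchmarking against the default hasher.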

Regards

agentjill commented 6 months ago

I think #158 may be a suitable starting point for the issue.

MathNya commented 6 months ago

@Andrew-Lyu Thank you for your report. Ah, such a difference in performance. This must be improved.

As agentjill said, the first step toward improvement would be to optimize how the file is loaded into memory.