Closed: dbernheisel closed this 4 years ago
Lazy loading shared strings reduces performance for worksheets with a small number of shared strings. Given that most Excel files have only a few shared strings, I think it is better for the majority of users to avoid lazy loading them.
I'm trying to parse the first couple of rows of a large XLSX, and it seems that the entire workbook's SharedStrings is loaded when calling `Creek::Book.new(file)`, which somewhat defeats the purpose of streaming the rows efficiently. I tested the memory performance when loading a 12MB XLSX.
Here are the results:
When I comment out loading SharedStrings, then I don't see that memory bloat:
Is there a way to get shared strings lazily?
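To make the question concrete, here is a minimal sketch of what "lazy" could mean here: the shared strings table is not built when the book is opened, only on the first lookup, and is memoized afterwards. `LazySharedStrings`, `loaded?`, and `load_count` are hypothetical names for illustration and are not part of Creek's API; the block passed to `new` stands in for whatever parses the workbook's shared strings.

```ruby
# Hypothetical illustration only; this is not Creek's implementation.
class LazySharedStrings
  # Number of times the loader has run (exposed to show memoization).
  attr_reader :load_count

  # loader: a block that returns the full Array of shared strings.
  # It is stored here, not called, so construction stays cheap.
  def initialize(&loader)
    @loader = loader
    @strings = nil
    @load_count = 0
  end

  # True once the table has been materialized.
  def loaded?
    !@strings.nil?
  end

  # Look up a shared string by index; triggers the load on first use.
  def [](index)
    strings[index]
  end

  private

  # Runs the loader at most once and caches the result.
  def strings
    @strings ||= begin
      @load_count += 1
      @loader.call
    end
  end
end

# Stand-in loader; a real one would parse the workbook's shared strings.
table = LazySharedStrings.new { %w[Name Email Total] }
table.loaded?  # false: nothing has been parsed yet
table[0]       # first lookup runs the loader
table.loaded?  # true: later lookups reuse the cached array
```

Under this sketch the whole table is still loaded by the first cell that references a shared string, so it only saves memory when the rows being read contain no such cells. Every lookup also goes through an extra method call, which may be part of the overhead described above for files with few shared strings.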