When POI reads spreadsheets, the entire data is stored in memory to allow random access to specific rows. Reading large files can therefore lead to out-of-memory exceptions.
POI does have a streaming API for reading XLSX but it is complex to use.
However, a third-party POI wrapper - excel-streaming-reader - now exists which makes this much easier.
Integrate this library and use it to support a new readLargeFile() method able to read large XLSX files with less likelihood of running out of memory.
Limitations
Only works with XLSX files
Only works with Lucee (ACF throws an exception for reasons I have been unable to fathom)
Can only return a query object, CSV or HTML (not a workbook object, because too many of the expected POI methods are unsupported by the wrapper)
Doesn't support all read() options, in particular specifying rows or columns (i.e. it reads the entire sheet)
When POI reads spreadsheets, the entire data is stored in memory to allow random access to specific rows. Reading large files can therefore lead to out-of-memory exceptions.
POI does have a streaming API for reading XLSX but it is complex to use.
However, a third-party POI wrapper - excel-streaming-reader - now exists which makes this much easier.
Integrate this library and use it to support a new
readLargeFile()
method able to read large XLSX files with less likelihood of running out of memory.Limitations
read()
options, in particular specifying rows or columns (i.e. it reads the entire sheet)