jlolling / talendcomp_tFileExcel

Talend components tFileExcel* to read and write Excel documents
Apache License 2.0

Issue opening a large workbook with 13.9 #42

Open FLawrence opened 1 month ago

FLawrence commented 1 month ago

Moved from tFileExcelComponents 13.6 to 13.9 and got the following error:

ERROR: Intialize workbook from file failed: Tried to allocate an array of length 165,498,084, but the maximum length for this record type is 100,000,000.
If the file is not corrupt and not large, please open an issue on bugzilla to request
increasing the maximum allowable size for this record type.
You can set a higher override value with IOUtils.setByteArrayMaxOverride()
org.apache.poi.util.RecordFormatException: Tried to allocate an array of length 165,498,084, but the maximum length for this record type is 100,000,000.
If the file is not corrupt and not large, please open an issue on bugzilla to request
increasing the maximum allowable size for this record type.
You can set a higher override value with IOUtils.setByteArrayMaxOverride()
    at org.apache.poi.util.IOUtils.throwRFE(IOUtils.java:599)
    at org.apache.poi.util.IOUtils.checkLength(IOUtils.java:276)
    at org.apache.poi.util.IOUtils.toByteArray(IOUtils.java:230)
    at org.apache.poi.util.IOUtils.toByteArray(IOUtils.java:203)
    at org.apache.poi.openxml4j.util.ZipArchiveFakeEntry.<init>(ZipArchiveFakeEntry.java:82)
    at org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource.<init>(ZipInputStreamZipEntrySource.java:98)
    at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:132)
    at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:319)
    at org.apache.poi.ooxml.util.PackageHelper.open(PackageHelper.java:59)
    at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:290)
    at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:286)
    at de.jlo.talendcomp.excel.SpreadsheetFile.initializeWorkbook(SpreadsheetFile.java:457)
    at mod.mod_create_procat_load_spreadsheet_0_9.MoD_create_PROCAT_load_spreadsheet.tFileExcelWorkbookOpen_2Process(MoD_create_PROCAT_load_spreadsheet.java:15567)
    at mod.mod_create_procat_load_spreadsheet_0_9.MoD_create_PROCAT_load_spreadsheet.tJava_1Process(MoD_create_PROCAT_load_spreadsheet.java:17254)
    at mod.mod_create_procat_load_spreadsheet_0_9.MoD_create_PROCAT_load_spreadsheet.runJobInTOS(MoD_create_PROCAT_load_spreadsheet.java:17834)
    at mod.mod_create_procat_load_spreadsheet_0_9.MoD_create_PROCAT_load_spreadsheet.main(MoD_create_PROCAT_load_spreadsheet.java:17481)

The file it was trying to read is fairly big (279345x14, 17 MB). When we rolled back to tFileExcelComponents 13.6, it loaded without an issue.

jlolling commented 4 weeks ago

This is new behaviour in the underlying POI API that I was not aware of. I will add an option to the component to call the method named in the error message directly. For the time being, you could run this command in a tJava component before you start loading the file: IOUtils.setByteArrayMaxOverride(200000000);
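
To see why 200000000 is enough for this particular file, here is a minimal sketch that reads the requested array length out of the exception message quoted above and compares it with a candidate override. The class PoiLimitCheck and its two methods are hypothetical helpers written for illustration only; they are not part of POI or of the tFileExcel components, and the sketch assumes the message keeps the wording shown in the stack trace.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of POI or tFileExcel): checks whether a
// candidate override would cover the allocation reported in the error message.
public class PoiLimitCheck {

    // Matches the requested length in a message worded like the one in this issue.
    private static final Pattern REQUESTED = Pattern.compile(
            "array of length ([\\d,]*\\d), but the maximum length");

    /** Returns the requested array length, or -1 if the message does not match. */
    public static long requestedLength(String message) {
        Matcher m = REQUESTED.matcher(message);
        if (!m.find()) {
            return -1L;
        }
        // Strip the thousands separators before parsing.
        return Long.parseLong(m.group(1).replace(",", ""));
    }

    /** True if the override is at least as large as the requested length. */
    public static boolean overrideSuffices(String message, int override) {
        long requested = requestedLength(message);
        return requested >= 0 && requested <= override;
    }

    public static void main(String[] args) {
        String msg = "Tried to allocate an array of length 165,498,084, "
                + "but the maximum length for this record type is 100,000,000.";
        System.out.println(requestedLength(msg));
        System.out.println(overrideSuffices(msg, 200000000));
        // In the Talend job itself, the workaround from the comment above is
        // a single line in a tJava component placed before the workbook is opened:
        // org.apache.poi.util.IOUtils.setByteArrayMaxOverride(200000000);
    }
}
```

The override is an int, so it cannot exceed Integer.MAX_VALUE (2,147,483,647). A larger workbook may need a higher value than 200000000, and the whole entry is still read into memory, so the job's heap size probably has to grow along with it.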