pjfanning / excel-streaming-reader

An easy-to-use implementation of a streaming Excel reader using Apache POI
Apache License 2.0
113 stars 27 forks source link

Exception parsing excel with formulas #80

Closed dmgladkov closed 2 years ago

dmgladkov commented 2 years ago

3.2.3 version

java.lang.IllegalStateException: EvaluationNames are not supported in excel-streaming-reader at com.github.pjfanning.xlsx.impl.BaseEvaluationWorkbook.getName(BaseEvaluationWorkbook.java:84) ~[excel-streaming-reader-3.2.3.jar:?] at com.github.pjfanning.xlsx.impl.CurrentRowEvaluationWorkbook.getName(CurrentRowEvaluationWorkbook.java:19) ~[excel-streaming-reader-3.2.3.jar:?] at org.apache.poi.ss.formula.FormulaParser.function(FormulaParser.java:1307) ~[poi-5.1.0.jar:5.1.0] at org.apache.poi.ss.formula.FormulaParser.parseNonRange(FormulaParser.java:900) ~[poi-5.1.0.jar:5.1.0] at org.apache.poi.ss.formula.FormulaParser.parseRangeable(FormulaParser.java:494) ~[poi-5.1.0.jar:5.1.0] at org.apache.poi.ss.formula.FormulaParser.parseRangeExpression(FormulaParser.java:325) ~[poi-5.1.0.jar:5.1.0] at org.apache.poi.ss.formula.FormulaParser.parseSimpleFactor(FormulaParser.java:1539) ~[poi-5.1.0.jar:5.1.0] at org.apache.poi.ss.formula.FormulaParser.percentFactor(FormulaParser.java:1497) ~[poi-5.1.0.jar:5.1.0] at org.apache.poi.ss.formula.FormulaParser.powerFactor(FormulaParser.java:1484) ~[poi-5.1.0.jar:5.1.0] at org.apache.poi.ss.formula.FormulaParser.Term(FormulaParser.java:1858) ~[poi-5.1.0.jar:5.1.0] at org.apache.poi.ss.formula.FormulaParser.additiveExpression(FormulaParser.java:1985) ~[poi-5.1.0.jar:5.1.0] at org.apache.poi.ss.formula.FormulaParser.concatExpression(FormulaParser.java:1969) ~[poi-5.1.0.jar:5.1.0] at org.apache.poi.ss.formula.FormulaParser.comparisonExpression(FormulaParser.java:1926) ~[poi-5.1.0.jar:5.1.0] at org.apache.poi.ss.formula.FormulaParser.intersectionExpression(FormulaParser.java:1899) ~[poi-5.1.0.jar:5.1.0] at org.apache.poi.ss.formula.FormulaParser.unionExpression(FormulaParser.java:1880) ~[poi-5.1.0.jar:5.1.0] at org.apache.poi.ss.formula.FormulaParser.parse(FormulaParser.java:2027) ~[poi-5.1.0.jar:5.1.0] at org.apache.poi.ss.formula.FormulaParser.parse(FormulaParser.java:173) ~[poi-5.1.0.jar:5.1.0] at com.github.pjfanning.xlsx.impl.StreamingSheetReader.handleEvent(StreamingSheetReader.java:361) ~[excel-streaming-reader-3.2.3.jar:?] at com.github.pjfanning.xlsx.impl.StreamingSheetReader.getRow(StreamingSheetReader.java:125) ~[excel-streaming-reader-3.2.3.jar:?] at com.github.pjfanning.xlsx.impl.StreamingSheetReader.access$400(StreamingSheetReader.java:41) ~[excel-streaming-reader-3.2.3.jar:?] at com.github.pjfanning.xlsx.impl.StreamingSheetReader$StreamingRowIterator.hasNext(StreamingSheetReader.java:687) ~[excel-streaming-reader-3.2.3.jar:?] ...

pjfanning commented 2 years ago

Can you provide a workbook that reproduces the issue? This may affect particular functions as opposed to all functions.

pjfanning commented 2 years ago

this appears to be associated with the new code to support shared formulas - you could revert to v3.2.0 to avoid this change - there is an unreleased feature to disable the new shared formula support (https://github.com/pjfanning/excel-streaming-reader/pull/79)

dmgladkov commented 2 years ago

Yeah! Thanks, 3.2.0 helped a lot. I need some time to provide you clean XLS file without the customer's data.

dmgladkov commented 2 years ago

Prepared the simplest workbook, reproducing the problem. test.xlsx

dmgladkov commented 2 years ago

BTW Is it possible to customize not to evaluate formulas at all? If we read the file, in most cases values are already calculated and cached. Sometimes (for instance, when values are loaded from some external unavailable resources) formulas evaluation will fail.

pjfanning commented 2 years ago

@dmgladkov excel-streaming-reader always takes the cell values that are cached in the xlsx file. The new code is an attempt to support Cell.getCellFormula. This is easy for most cells but some cells have formulas but reference another cell which has the formula (shared formulas) and this formula needs to be adjusted to suit the referencing cell. I released excel-streaming-reader v3.2.4 which disables this cell formula stuff again - it can only be enabled by setting a build property. I still need to fix the issue you ran into where shared formulas can rely on named ranges. Thanks for creating the test.xlsx.

pjfanning commented 2 years ago

@dmgladkov it's a bit of a hack but I've been able to get the code working the shared formulas in your workbook - so I'll close this - https://github.com/pjfanning/excel-streaming-reader/pull/83

It appears that you don't really need this fix so the comment above about v3.2.4 disabling all this by default still applies.