monitorjbl / excel-streaming-reader

An easy-to-use implementation of a streaming Excel reader using Apache POI
Apache License 2.0
946 stars 343 forks source link

Formulas not read on Formula Cells #200

Open esrieh opened 5 years ago

esrieh commented 5 years ago

if I'm not completely wrong, when reading the attached example XLSX File with version 2.10, some (but not all!) of the formulas on first sheet, fetched by StreamingCell.getCellFormula() are returned as zero-length Strings. It seems the formulas are not properly read in each case by the StreamingReader, even if the cell itself gets marked as CellType "formula". In the attached example the formulas at cells B5, A8 and A9 are properly read, the formulas at cell C5, D6 and A10, A11... are not The example implementation (see below) outputs for me (cells B5 and C5):

[...]
read template cell 4 1
     celltype FORMULA
     type n
     isFormulaType true
     formula content SUM(B8:B107)
     ♥♥♥♥♥♥♥♥♥ FINE! ♥♥♥♥♥♥♥♥♥
read template cell 4 2
     celltype FORMULA
     type n
     isFormulaType true
     formula content 
     !!!!!!!!! MISSING FORMULA! !!!!!!!!!
[...]

Example File example.xlsx

Java Code

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellType;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;

import com.monitorjbl.xlsx.StreamingReader;

public class XLSXFormulaExampleViaStream {

    public static void main(String[] args) throws Exception {

        File f = new File(".\\example.xlsx");

        InputStream inputStream = new FileInputStream(f);   
        Workbook inputWorkbook = 
                StreamingReader.builder()
                .rowCacheSize(100)
                .bufferSize(4096)
                .open(inputStream);

        for (Sheet inputSheet : inputWorkbook){

            System.out.println("handling inputSheet " + inputSheet.getSheetName());

            for (Row inputRow : inputSheet) {

                System.out.println("\t read inputRow " + inputRow.getRowNum());

                for (Cell inputCell : inputRow) {

                    System.out.println("\t\t read template cell " + inputRow.getRowNum() + " " + inputCell.getColumnIndex());
                    System.out.println("\t\t\t celltype " + inputCell.getCellType());
                    System.out.println("\t\t\t type " + ((com.monitorjbl.xlsx.impl.StreamingCell)inputCell).getType());
                    System.out.println("\t\t\t isFormulaType " + ((com.monitorjbl.xlsx.impl.StreamingCell)inputCell).isFormulaType());

                    if (inputCell.getCellType() == CellType.FORMULA) {
                        System.out.println("\t\t\t formula content " + inputCell.getCellFormula()); 

                        if (inputCell.getCellFormula().equals("")) {
                            System.out.println("\t\t\t !!!!!!!!! MISSING FORMULA! !!!!!!!!!"); 
                        } else {
                            System.out.println("\t\t\t \u2665\u2665\u2665\u2665\u2665\u2665\u2665\u2665\u2665 FINE! \u2665\u2665\u2665\u2665\u2665\u2665\u2665\u2665\u2665"); 
                        }
                    }

                    if (inputCell.getCellType() == CellType.STRING) {
                        System.out.println("\t\t\t string content " + inputCell.getStringCellValue()); 
                    }

                    if (inputCell.getCellType() == CellType.NUMERIC) {
                        System.out.println("\t\t\t numeric content " + inputCell.getNumericCellValue()); 
                    }
                }
            }
        }
    }
}
nick-burch-flec commented 4 years ago

Most likely this is a shared formula

XSSF has special code to handle this, which basically works by getting the base formula, then shifting it appropriately to the location of the cell

The XSSF code is at https://github.com/apache/poi/blob/trunk/src/ooxml/java/org/apache/poi/xssf/usermodel/XSSFCell.java#L482 - you'd probably need to implement something similar for the streaming reader too

pjfanning commented 2 years ago

I added a fix for this in my fork and released v3.2.1. I will update pjfanning/excel-streaming-reader#73 with descriptions of how it works but it's pretty much automatic (in normal case).