Closed saranshbansal closed 3 years ago
A sheet is effectively a row iterator, so reading arbitrary rows by index cannot be supported. This library reads rows in-order as they are defined in the XML, it cannot jump around to different rows. If you need that ability, you should use vanilla POI.
The same is true of "empty" cells and rows not being read; if they are empty, they are not actually in the XML and so there is nothing to be read. The row and cell numbers should be available on all entries though, so it's relatively trivial to detect when there are gaps between entries. Those gaps are the empty rows/cells you're describing.
By that logic, isn't a Row just a Cell iterator? Then why does row.getCell(0) works but former doesn't?
Anyway, I worked out something:
try (Workbook wb = StreamingReader.builder()
.rowCacheSize(DEFAULT_ROW_CACHE_SIZE)
.bufferSize(DEFAULT_BUFFER_SIZE)
.open(inputStream)) {
Sheet firstSheet = wb.getSheetAt(0);
int noOfColumns = 0;
int rowIndex = 0;
for (Row nextRow : firstSheet) {
if (rowIndex == 0) {
noOfColumns = nextRow.getLastCellNum();
}
for (int i = 0; i < noOfColumns; i++) {
Cell cell = nextRow.getCell(i);
sb.append(cell != null ? cell.getStringCellValue() : EMPTY).append(delimiter);
}
sb.append(DEFAULT_LINE_END);
++rowIndex;
}
}
By that logic, isn't a Row just a Cell iterator? Then why does row.getCell(0) works but former doesn't?
Excellent question. A row is read in its entirety, a sheet is not. The XML document is structured like so:
sheet
└ row
└ cell
└ row
└ cell
└ row
└ cell
For performance reasons, this library will read ahead a configurable number of rows and cache them for your code to iterate over. Not reading the entire row would require that only one row is processed at a time because we can't skip ahead anymore.
Okay fair enough. The solution is working fine without any performance drop so closing the ticket.
sheet.getRow(0) is unsupported. Can you please provide an alternative to read header row?
It's absolutely necessary in order to fix the issues with EMPTY cells/rows not being read by default.