xavier / xlsx_reader

A production-ready XLSX file reader for Elixir.
106 stars 18 forks source link

Empty rows are skipped even with empty_rows: true #10

Closed ZombieHarvester closed 2 years ago

ZombieHarvester commented 2 years ago

Setup: empty sheet with a value in B2 cell

image file: xlsx_reader_empty_row_test.xlsx

iex(14)> {:ok, package} = XlsxReader.open("xlsx_reader_empty_row_test.xlsx")
{:ok, ...

Current behaviour

iex(15)> XlsxReader.sheet(package, "Sheet1", empty_rows: true)
{:ok, [["", "b2"]]}

Expected behaviour

iex(15)> XlsxReader.sheet(package, "Sheet1", empty_rows: true)
{:ok, [[...], ["", "b2"]]}

First row is ignored with empty_rows: true.

I need to keep empty rows because l need to transform [["", "b2"]] into %{2 => %{"B" => "b2"}} or similar so I could address cells by row/column.

Any advice?

xavier commented 2 years ago

Looks like your spreadsheet writer completely omitted the first row from the sheet in the XML. While sparse rows (i.e. "missing" cells) are currently supported, this case is not yet handled. The worksheet parser has to be modified to emit the missing rows when empty_rows: true. I'll take a look...

ZombieHarvester commented 2 years ago

The file was created in Excel for Mac.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\r\n<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
  xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
  xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" mc:Ignorable="x14ac xr xr2 xr3"
  xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac"
  xmlns:xr="http://schemas.microsoft.com/office/spreadsheetml/2014/revision"
  xmlns:xr2="http://schemas.microsoft.com/office/spreadsheetml/2015/revision2"
  xmlns:xr3="http://schemas.microsoft.com/office/spreadsheetml/2016/revision3" xr:uid="{E2C0E627-1487-1B42-AB7C-A5846AC496C5}">
  <dimension ref="B2"/>
  <sheetViews>
    <sheetView tabSelected="1" workbookViewId="0">
      <selection activeCell="B2" sqref="B2"/>
    </sheetView>
  </sheetViews>
  <sheetFormatPr baseColWidth="10" defaultRowHeight="13" x14ac:dyDescent="0.15"/>
  <sheetData>
    <row r="2" spans="2:2" ht="14" x14ac:dyDescent="0.15">
      <c r="B2" t="s">
        <v>0</v>
      </c>
    </row>
  </sheetData>
  <pageMargins left="0.7" right="0.7" top="0.75" bottom="0.75" header="0.3" footer="0.3"/>
</worksheet>
ZombieHarvester commented 2 years ago

@xavier Wow, that was quick! Thanks!