roo-rb / roo

Roo provides an interface to spreadsheets of several sorts.

Incorrectly parses last row with blank columns #535

Open superyarick opened 4 years ago

superyarick commented 4 years ago

With roo 2.8.3, the last row of an Excel file is not parsed correctly. Columns get shifted to the left when blank fields are encountered. This does not happen on any other row.

I am attaching the Excel file, a CIS Benchmark.

# tag_pos is not defined in this snippet
@xlsx.sheet(1).each_row_streaming do |row|
  puts 'row  1      -  ' + row[1].to_s
  puts 'row  2      -  ' + row[2].to_s
  puts 'row  3      -  ' + row[3].to_s
  puts 'row  4      -  ' + row[4].to_s
  puts 'row  5      -  ' + row[5].to_s
  puts 'row  6      -  ' + row[6].to_s
  puts 'row  7      -  ' + row[7].to_s
  puts 'row  8      -  ' + row[8].to_s
  puts 'row  9      -  ' + ['9 - ', row[9]].join
  puts 'row  10      -  ' + ['10 - ', row[10]].join
  puts 'row  11      -  ' + ['11 - ', row[11]].join
  puts 'row.length -  ' + row.length.to_s
  puts 'tag_pos.cis_controls  -  ' + tag_pos['cis_controls'].to_s
end

CIS_Ubuntu_Linux_16.04_LTS_Benchmark_v1.1.0.xls.zip

The output is wrong for columns 9, 10, and 11 on the last row. Columns 9 and 10 on the last row should be #<Roo::Excelx::Cell::Empty:0x00007f918713cc28>, as on every other row.
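To illustrate the symptom, here is a minimal sketch in plain Ruby of how a left shift can appear when blank cells are dropped instead of being kept in position. It is not roo's code and does not establish the root cause of this report: `SparseCell` and `pad_by_column` are hypothetical names invented for the example, and the data is made up.

```ruby
# Hypothetical stand-in for a parsed cell: a 1-based column number and a value.
# This is NOT roo's internal cell class.
SparseCell = Struct.new(:col, :value)

# Place each cell at the index given by its column number, leaving nil
# where the sheet had no cell, instead of appending cells in arrival order.
def pad_by_column(cells, width)
  row = Array.new(width)
  cells.each { |cell| row[cell.col - 1] = cell.value }
  row
end

# A made-up row with data in columns 1, 2 and 4 only (column 3 is blank).
cells = [SparseCell.new(1, 'a'), SparseCell.new(2, 'b'), SparseCell.new(4, 'd')]

shifted = cells.map(&:value)       # naive: the blank disappears, 'd' shifts left
padded  = pad_by_column(cells, 4)  # positional: the blank stays in place

p shifted # ["a", "b", "d"]
p padded  # ["a", "b", nil, "d"]
```

The roo README documents a `pad_cells: true` option for `each_row_streaming`. Whether it changes the behaviour on the last row of the attached file has not been checked here.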

suung commented 4 years ago

We have a different issue here: string values are parsed as nil, while numbers work.

This only happens with Excel; with LibreOffice the problem does not occur.

buncis commented 2 months ago

I can confirm something similar happens to me: in the middle of the file (row 2204 of 50001 total rows), one column is skipped.

I think the problem is with each_row_streaming, because the file is parsed perfectly with normal row selection instead of streaming.

I think this is related: https://github.com/roo-rb/roo/issues/380
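To narrow down which rows differ between streaming and normal access, a small comparison helper can be used. The following is a sketch in plain Ruby that works on arrays of plain values; `row_mismatches` is a hypothetical helper name and the sample data is made up, not taken from any file in this thread.

```ruby
# Compare two row lists (arrays of plain values) and report every position
# where they disagree. Returns [row_index, column_index, streamed, regular]
# tuples; all indices are 0-based.
def row_mismatches(streamed_rows, regular_rows)
  mismatches = []
  row_count = [streamed_rows.length, regular_rows.length].max
  row_count.times do |r|
    s = streamed_rows[r] || []
    g = regular_rows[r] || []
    [s.length, g.length].max.times do |c|
      mismatches << [r, c, s[c], g[c]] if s[c] != g[c]
    end
  end
  mismatches
end

# Made-up data: the second row lost its blank third column when streamed.
regular  = [['a', 'b', nil, 'd'], ['e', 'f', nil, 'h']]
streamed = [['a', 'b', nil, 'd'], ['e', 'f', 'h']]

row_mismatches(streamed, regular).each { |m| p m }
```

With roo, the streamed side could presumably be built by collecting each row from `each_row_streaming` and mapping every cell to `cell&.value`, and the regular side from `sheet.row(n)`. That wiring is an assumption and is left out so the sketch runs without the gem or a spreadsheet.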