zdavatz / spreadsheet

The Ruby Spreadsheet by ywesee GmbH
http://spreadsheet.ch
GNU General Public License v3.0

Cells with text >= 32KB in length result in MS Excel failing to open #250

Closed: abrom closed this issue 4 years ago

abrom commented 4 years ago

This isn't necessarily a bug in the gem per se; however, the gem does create (without warning or error) an XLS file that does not open in MS Excel (I tested a few different versions, all with the same result):

[Screenshot: Screen Shot 2020-03-10 at 12 18 21 PM]

Interestingly, opening the file with something like LibreOffice 'works', but saving it again truncates the cell to 32KB - 1 (also without warning! doh).

FYI I couldn't see any reference to this limit in either of the file format spec documents attached to the project README.

To replicate this (ignore the trailing "; nil", that's just so my console pager doesn't flip out):

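require 'spreadsheet'

book = Spreadsheet::Workbook.new
sheet = book.create_worksheet name: 'test'
# 2**15 - 1 = 32,767 characters: the resulting file opens in Excel
sheet[0, 0] = 'a' * (2**15 - 1); nil
book.write 'works_fine.xls'; nil

# 2**15 = 32,768 characters: Excel fails to open the resulting file
sheet[0, 0] = 'a' * 2**15; nil
book.write 'broken_read.xls'; nil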

Not sure what the solution for this sort of thing might be. A comment in the README? Write something to $stderr?
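
As a rough illustration of the "$stderr" idea, here is a minimal sketch of a pre-write check. It is plain Ruby and not part of the Spreadsheet gem: the helper name `oversized_cells`, the constant `XLS_CELL_CHAR_LIMIT`, and the array-of-arrays input are all assumptions made for the example, and the limit itself is simply the 2**15 - 1 value observed in the replication above.

```ruby
# Hypothetical helper, not part of the Spreadsheet gem.
# Assumed limit: 2**15 - 1 characters per cell, taken from the replication above.
XLS_CELL_CHAR_LIMIT = 2**15 - 1

# Returns [row, column, length] for every String cell longer than the limit
# and writes one warning per offending cell to the given IO (default: $stderr).
def oversized_cells(rows, io: $stderr)
  offenders = []
  rows.each_with_index do |row, r|
    row.each_with_index do |value, c|
      next unless value.is_a?(String) && value.length > XLS_CELL_CHAR_LIMIT

      io.puts "warning: cell (#{r}, #{c}) has #{value.length} characters; " \
              'Excel may refuse to open the file'
      offenders << [r, c, value.length]
    end
  end
  offenders
end

# Example data: the first row is within the limit, the second is one over.
rows = [['a' * (2**15 - 1), 'ok'], ['a' * 2**15, nil]]
p oversized_cells(rows)
```

A check like this only reports the problem; whether to abort, truncate, or carry on is left to the caller.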

zdavatz commented 4 years ago

Thank you.

  1. Who created the original file? My guess is you just modified the file.
  2. Is this on a Mac or a PC?

abrom commented 4 years ago

  1. The file was created essentially by the means described above: a spreadsheet was created using the Spreadsheet gem, with a large block of text (> 32K long) in one of the cells. The resulting file would not open in Excel (but would in LibreOffice).

  2. I've tested creating the files using Mac and Linux and opening them with Excel and LibreOffice on Mac and Windows each with the same results.

abrom commented 4 years ago

My guess here is that Excel uses a 16-bit signed integer to store the cell content size internally, and when the size exceeds what that can hold (2**15 - 1 = 32,767) the parser treats the file as corrupt.
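
To make the arithmetic behind that guess concrete, here is a small sketch in plain Ruby. It says nothing about Excel's actual internals; it only shows what happens to a length when it is reinterpreted as a signed 16-bit value. The names `INT16_MAX` and `as_int16` are made up for the example.

```ruby
# Largest value a signed 16-bit integer can hold.
INT16_MAX = 2**15 - 1

# Reinterpret a number as a signed 16-bit integer: pack the low 16 bits as an
# unsigned 16-bit value ('S'), then unpack the same bytes as a signed one ('s').
def as_int16(n)
  [n & 0xFFFF].pack('S').unpack1('s')
end

puts as_int16(INT16_MAX)      # 32767
puts as_int16(INT16_MAX + 1)  # -32768
```

One character past the limit wraps around to a negative size, which is the sort of value a parser could plausibly reject as corruption.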

My point with the issue is more about how the Spreadsheet gem might better signal to users that there is a problem with the content being saved, or at the very least update the documentation to explain this limitation of Excel.

For my specific issue I've already put measures in place to better sanitise the data being saved, but there may be others who face a similar situation.
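
For anyone in that situation, a sanitising step could look roughly like the sketch below. It is not the code referred to above, just one possible approach: the helper name `sanitise_cell` and the constant `SAFE_CELL_LENGTH` are hypothetical, and the 2**15 - 1 limit is assumed from the behaviour reported in this issue (it also matches the 32,767-character per-cell limit Microsoft documents for Excel).

```ruby
# Hypothetical sanitiser, not part of the Spreadsheet gem.
# Assumed limit: 2**15 - 1 characters, as observed in this issue.
SAFE_CELL_LENGTH = 2**15 - 1

# Returns the value unchanged unless it is a String longer than the limit,
# in which case it warns on the given IO and returns a truncated copy.
def sanitise_cell(value, io: $stderr)
  return value unless value.is_a?(String) && value.length > SAFE_CELL_LENGTH

  io.puts "warning: truncating cell text from #{value.length} " \
          "to #{SAFE_CELL_LENGTH} characters"
  value[0, SAFE_CELL_LENGTH]
end

long_text = 'a' * 2**15
puts sanitise_cell(long_text).length
puts sanitise_cell('short').length
```

With the gem this would be applied at assignment time, e.g. `sheet[0, 0] = sanitise_cell(text)`. Truncation silently loses data, so raising an error instead may be the better choice for some applications.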

zdavatz commented 4 years ago

OK, thank you. Your explanation sounds totally plausible to me. Can you send me a pull request showing how you would mention this problem to new Spreadsheet users in the README file?

zdavatz commented 4 years ago

Both files open in OneDrive/Office 365. Attachment: works_fine_broken.zip

abrom commented 4 years ago

Failures occur for me in Office 16 (and presumably anything before that).

Sure, will create a PR for the README :+1: