protobi / js-xlsx

XLSX / XLSM / XLSB (Excel 2007+ Spreadsheet) / ODS parser and writer
http://oss.sheetjs.com/js-xlsx

Skip cells with incorrect link to shared strings #40

Closed rstrlcpy closed 7 years ago

rstrlcpy commented 8 years ago

Hello.

I'm trying to parse an XLSX file, but the parser does not return the sheet's data. I can see the workbook and the sheet's name, but the 'sheets' field is empty.

I investigated and found that 'parse_ws_xml_data' throws an exception:

        /* 18.18.11 t ST_CellType */
        switch(p.t) {
          case 'n': p.v = parseFloat(p.v); break;
          case 's':
            sstr = strs[parseInt(p.v, 10)];
            p.v = sstr.t;
            p.r = sstr.r;
            if(opts.cellHTML) p.h = sstr.h;
            break;

The exception is thrown on line 7495 because sstr is undefined: 'p' contains only 't' and has no 'v', so parseInt(p.v, 10) returns NaN and the lookup in strs finds nothing.
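The failure can be reproduced outside the parser. This is only a minimal sketch: `strs` and `p` mimic the variables in the snippet above, and the contents of the shared string table are made up for illustration.

```javascript
// Hypothetical stand-ins for the parser's state, not real js-xlsx data.
var strs = [{ t: "Header", r: "<t>Header</t>" }]; // shared string table with one entry
var p = { t: "s" };                               // cell from <c r="A1" t="s"/>: no 'v'

var index = parseInt(p.v, 10); // parseInt(undefined, 10) is NaN
var sstr = strs[index];        // strs[NaN] is undefined

var caught = null;
try {
  p.v = sstr.t;                // reading a property of undefined throws
} catch (e) {
  caught = e;
}

console.log(isNaN(index), sstr === undefined, caught instanceof TypeError);
// prints: true true true
```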

I inspected my file. The cell (A1) that causes the exception has the following structure:

<row r="1" spans="1:9" customHeight="1" ht="35"><c r="A1" t="s"/><c r="B1" s="14" t="s"><v>0</v></c><c r="C1" s="1"/><c r="D1" s="1"/><c r="E1" s="1"/><c r="F1" s="1"/><c r="G1" s="1"/><c r="H1" s="1"/><c r="I1" s="1"/></row>

I tried opening the file in LibreOffice, and everything works. Cell A1 does not contain anything.

I saved the opened file as a new file, and the new file was parsed successfully.

The first row in the new file does not have an A1 cell.

<row collapsed="false" customFormat="false" customHeight="true" hidden="false" ht="35" outlineLevel="0" r="1"><c r="B1" s="1" t="s"><v>0</v></c><c r="C1" s="1"/><c r="D1" s="1"/><c r="E1" s="1"/><c r="F1" s="1"/><c r="G1" s="1"/><c r="H1" s="1"/><c r="I1" s="1"/></row>

The fix in this PR helps.
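For readers without the patch at hand, a guard of roughly the following shape would skip such cells. This is a sketch under assumptions, not the actual change: `resolveSharedString` is a hypothetical helper that does not exist in js-xlsx, and the sample string table is invented.

```javascript
// Sketch only: resolveSharedString is a hypothetical helper, not part of js-xlsx.
// It returns false when the cell has no usable link into the shared string table.
function resolveSharedString(p, strs, opts) {
  var sstr = strs[parseInt(p.v, 10)];
  if (sstr === undefined) return false; // missing or out-of-range index: skip the cell
  p.v = sstr.t;
  p.r = sstr.r;
  if (opts && opts.cellHTML) p.h = sstr.h;
  return true;
}

var strs = [{ t: "Name", r: "<t>Name</t>", h: "Name" }]; // invented sample table
var a1 = { t: "s" };         // <c r="A1" t="s"/>
var b1 = { t: "s", v: "0" }; // <c r="B1" s="14" t="s"><v>0</v></c>

console.log(resolveSharedString(a1, strs, {}));       // false
console.log(resolveSharedString(b1, strs, {}), b1.v); // true Name
```

The caller would drop the cell whenever the helper returns false, which matches what LibreOffice effectively did when it re-saved the file without A1.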

pietersv commented 8 years ago

Thanks @Ramzec! Great catch. Rather than take this PR directly, I'll make the change in the relevant file in /bits/*.js and recompile.

sarahkevinking commented 7 years ago

@pietersv did your change get merged in? I'm using v0.8.13 and I'm having a similar issue: cells of type s with no value defined cause the sheet not to be parsed.

This issue is especially troublesome when using a tool like https://github.com/tealeg/xlsx to generate xlsx files that are later parsed by js-xlsx.

pietersv commented 7 years ago