catamphetamine / read-excel-file

Read *.xlsx files in a browser or Node.js. Parse to JSON with a strict schema.
https://catamphetamine.gitlab.io/read-excel-file/
MIT License
301 stars 52 forks source link

cannot parse xlsx file generated by "excel4node" #34

Closed duanyukai closed 5 years ago

duanyukai commented 5 years ago

xlsx file generate by https://github.com/natergj/excel4node cannot be parsed correctly. But using excel 2016, open the generated xlsx file and save again, this lib works well. I'm not sure if it's a bug related to xml version or some else?

catamphetamine commented 5 years ago

Post the excel file and describe what exactly is incorrect.

duanyukai commented 5 years ago

Post the excel file and describe what exactly is incorrect.

Sorry for that, I'm constructing a minimal sample.

duanyukai commented 5 years ago

gen.js: (copied from https://github.com/natergj/excel4node/blob/master/README.md)

// Require library
var xl = require('excel4node');

// Create a new instance of a Workbook class
var wb = new xl.Workbook();

// Add Worksheets to the workbook
var ws = wb.addWorksheet('Sheet 1');
var ws2 = wb.addWorksheet('Sheet 2');

// Create a reusable style
var style = wb.createStyle({
  font: {
    color: '#FF0800',
    size: 12,
  },
  numberFormat: '$#,##0.00; ($#,##0.00); -',
});

// Set value of cell A1 to 100 as a number type styled with paramaters of style
ws.cell(1, 1)
  .number(100)
  .style(style);

// Set value of cell B1 to 200 as a number type styled with paramaters of style
ws.cell(1, 2)
  .number(200)
  .style(style);

// Set value of cell C1 to a formula styled with paramaters of style
ws.cell(1, 3)
  .formula('A1 + B1')
  .style(style);

// Set value of cell A2 to 'string' styled with paramaters of style
ws.cell(2, 1)
  .string('string')
  .style(style);

// Set value of cell A3 to true as a boolean type styled with paramaters of style but with an adjustment to the font size.
ws.cell(3, 1)
  .bool(true)
  .style(style)
  .style({font: {size: 14}});

wb.write('test.xlsx');

parse.js:

const readXlsxFile = require('read-excel-file/node');

// File path.
readXlsxFile('test2.xlsx').then((rows) => {
  console.table(rows);
})

error below:

(node:18011) UnhandledPromiseRejectionWarning: TypeError: Cannot read property '0' of undefined
    at readXlsx (/home/dyk/Desktop/XlsxSample/node_modules/read-excel-file/commonjs/readXlsx.js:71:35)
    at readXlsxFileContents (/home/dyk/Desktop/XlsxSample/node_modules/read-excel-file/commonjs/readXlsxFileContents.js:22:38)
    at /home/dyk/Desktop/XlsxSample/node_modules/read-excel-file/commonjs/readXlsxFileNode.js:33:47
    at processTicksAndRejections (internal/process/task_queues.js:86:5)
(node:18011) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:18011) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

After open and save the xlsx file again in Excel 2016, this code works well.

┌─────────┬──────────┬──────┬──────┐
│ (index) │    0     │  1   │  2   │
├─────────┼──────────┼──────┼──────┤
│    0    │   100    │ 200  │ 300  │
│    1    │ 'string' │ null │ null │
│    2    │   true   │ null │ null │
└─────────┴──────────┴──────┴──────┘

Here are the two xlsx files.

samples.zip

catamphetamine commented 5 years ago

I fixed the unzipping code to skip directories, and also to not demand <workbookPr/> data. Now it seems to work. read-excel-file@4.0.2

The third column is a formula =A1 + B1. This library doesn't support formulae. Unzip both test1.xlsx and test2.xlsx and see xl/worksheets/sheet1.xml: for one of them it's <c r="C1" s="1"><f>A1 + B1</f></c> and for the other it's <c r="C1" s="1"><f>A1 + B1</f><v>300</v></c>. This library doesn't calculate formulae if there's no <v>alue. So the first file returns two columns and the second file returns three columns.