agmen-hu / node-datapumps

Node.js ETL (Extract, Transform, Load) toolkit for easy data import, export or transfer between systems.
MIT License
291 stars 38 forks source link

ExcelWriterMixin missing Number #9

Closed AndrzejSulej closed 9 years ago

AndrzejSulej commented 9 years ago

writeRow method uses only String method to write cell - it's very confusing. It would be better to check whether value is the Number ( or NaN ) or expect that column is an object with keys; value, type (String/Number/Formula), style etc ?

if (isNaN(value)){
    this._excel.worksheet.Cell(this._excel.currentRow, index + 1).String(value);
} else {
    this._excel.worksheet.Cell(this._excel.currentRow, index + 1).Number(value);
}

Another problem is the low write performance: 10columns x 5000 rows => (> 20sec). excel4node is the fastest module ? Have you any experience in another one.

Anyway, thank you for datapumps.

novaki commented 9 years ago

Released 0.3.23, which adds .columnType method to the pump, and also enables setting column types in .writeHeaders. Regarding performance, we also experience the bad performance of excel4node in terms of both memory and processor usage. See http://agmen-hu.github.io/node-datapumps/docs/mixin/ExcelWriterMixin.html for details.

I've checked a number of excel modules before choosing excel4node. This module seemed to be easy to use and working, i didn't check performance. I will check now, however I think that most modules will build the XML DOM and the zip (xlsx is a zip containing xmls) in the memory, and writing large files that way will always consume lots of memory and cpu. If you only export one worksheet in your excel, you may export to csv and then convert it to excel with a command line tool.

I close this bug now, but will open a new one for perfomance issues.