jmcnamara / libxlsxwriter

A C library for creating Excel XLSX files.
https://libxlsxwriter.github.io

File creation performance #426

Closed duncangroenewald closed 8 months ago

duncangroenewald commented 8 months ago

Is there any way to improve performance when creating a file? I am generating a 4.5 MB XLSX file and the creation process seems quite slow, taking about a minute. Given the size, I would expect it to be much faster than that.

I haven't investigated the cause in detail, but the creation of the cells themselves seems to be what is slowing things down.

I figured I would ask just in case there might be some quick way to improve performance.

Thanks

jmcnamara commented 8 months ago

It shouldn't take anything like a minute to create a 4.5MB file.

Here is a sample program I ran as a test:

#include "xlsxwriter.h"

int main(void) {

    /* 45,000 rows x 50 columns = 2.25 million cells. */
    int max_row = 45000;
    int max_col = 50;

    lxw_workbook  *workbook  = workbook_new("c_perf_test.xlsx");
    lxw_worksheet *worksheet = workbook_add_worksheet(workbook, NULL);

    /* Alternate between string and number cells across each row. */
    for (int row_num = 0; row_num < max_row; row_num++) {
        for (int col_num = 0; col_num < max_col; col_num++) {
            if (col_num % 2)
                worksheet_write_string(worksheet, row_num, col_num, "Foo", NULL);
            else
                worksheet_write_number(worksheet, row_num, col_num, 12345.0, NULL);

        }
    }

    /* Closing the workbook writes the file to disk. */
    workbook_close(workbook);

    return 0;
}

I put this in the examples directory of a repo clone and compiled it as follows:

make examples

This runs in around 2.5 seconds:

$ time ./examples/c_perf_test

real    0m2.424s
user    0m1.945s
sys     0m0.311s

And the output file is ~ 4.5MB:

$ ls -lh c_perf_test.xlsx
-rw-r--r--  1 John  staff   4.6M 23 Dec 10:19 c_perf_test.xlsx

This was on a 3.2 GHz 6-Core Intel Core i7 Mac mini with macOS 14.2.1 (Sonoma).

I'd suggest starting with the same example, verifying the performance, and if it is more or less the same then try to figure out what extra work your program is doing.
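
One way to find that extra work is to time each phase of the program separately (reading input, converting values, writing cells, closing the workbook). The following is a minimal sketch that assumes the standard C `clock()` function is good enough for a rough CPU-time measurement. `elapsed_seconds` is a hypothetical helper name and is not part of libxlsxwriter.

```c
#include <time.h>

/* Hypothetical helper: CPU seconds used by the process since `start`. */
static double elapsed_seconds(clock_t start) {
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

To use it, take `clock_t t0 = clock();` before a phase and print `elapsed_seconds(t0)` after it. A phase that accounts for most of the total run time is the one to look at first.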

duncangroenewald commented 8 months ago

Thanks for that. I looked into it a little more: I have to convert values from strings, and the checks that determine each value's type seem to be what is slowing things down. It has nothing to do with this library.
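
The conversion code is not shown in this thread, so the following is only a sketch of one inexpensive way to decide whether a string holds a number, using a single `strtod` call per cell. `parse_number` is a hypothetical helper name. Note that `strtod` also accepts leading whitespace, hexadecimal values, and `inf`/`nan`, which may or may not be what a given program wants.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: returns true and stores the value in *out if the
 * whole string `s` parses as a number; returns false otherwise. */
static bool parse_number(const char *s, double *out) {
    char *end = NULL;

    if (s == NULL || *s == '\0')
        return false;

    errno = 0;
    double value = strtod(s, &end);

    /* Reject strings with no conversion, trailing characters,
     * or out-of-range values. */
    if (end == s || *end != '\0' || errno == ERANGE)
        return false;

    *out = value;
    return true;
}
```

With a helper like this, each cell needs one pass over its text: call `worksheet_write_number()` with the parsed value when `parse_number` returns true, and `worksheet_write_string()` with the original text otherwise.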