Leftbower / cfspreadsheet-lucee-5

A cfspreadsheet extension for Lucee 5
23 stars 10 forks source link

spreadsheetAddRow times out on large data sets #16

Closed scottdoc closed 1 year ago

scottdoc commented 6 years ago

Hello. We are having an issue where generating a .xls spreadsheet is causing the CPU usage to spike massively and then timing out. This is on the lastest Lucee 5.2.7.62 (as well as 5.2.6.60) and cfspreadsheet 3.0.1.

The spreadsheet is created by looping over a query with 27 columns and 1500 rows. And calling spreadsheetAddRow(). The columns have a variety of data types, dates, numeric etc.

I just tested on our older railo server, with an older version of cfspreadsheet - the same code generates a spreadsheet in seconds.

My initial theory was that it may have to do with writing into the output buffer, as when it times out I get a blank page with hundreds of thousands of blank lines - I have checked and our code does not have any output="true" calls. I have tried setting enablecfoutputonly, but this does not solve it.

Leftbower commented 6 years ago

Have you checked the logs to see what is going on? The code really has not changed for this extension from Railo to Lucee 5, just the code needed for wrapping it up to install as a Lucee 4.x and 5.x extension. There was an update to newer Apache POI jars.

scottdoc commented 6 years ago

Hi - yes I have.

It took some investigating, but I worked out that the issue is the autoSizeColumnFix() that happens after every addRow call. This really affects performance if there are thousands of rows, added one at a time, and they have some columns with a lot of string data.

This code is not here, sorry, but in the cfpoi project: https://github.com/Leftbower/cfspreadsheet-lucee/blob/master/cfpoi/org/cfpoi/spreadsheet/Spreadsheet.cfc

The workaround, at least from my side, is to not use spreadsheetAddRow but to instead build a query object and then pass it to spreadsheetAddRows.

jpetilo commented 1 year ago

@scottdoc i was face the same problem with a query with many columns and based in your suggestion i've found my solution, the following code:

spreadsheetAddRows(spreadsheet=LOCAL.mysheet,data=LOCAL.mylargeqry,autoSizeColumns=false);

i was getting timeout in cloudflare because the code took more than 100 seconds. And with the change of processing time dropped to 5s. Thank for everyone.