Performance and handling a big set of data

StriderRanger commented 2 years ago

When generating a PDF with 100,000 rows using jspdf-autotable, the app uses more than 1 GB of memory. The usage doesn't drop even after the PDF document has been generated. The memory growth begins when autoTable is called, so it would be very helpful to have some way to free the memory once the generation is over. Thanks for the great work.

simonbengtsson commented 2 years ago

Interesting! It would sure be nice if someone took a closer look at this. I think the first step would be to verify the issue by posting example code and steps to reproduce it.

StriderRanger commented 2 years ago

Thank you for your answer. I will work on an example and post it later; it should help in looking further into the problem.

StriderRanger commented 2 years ago

I reproduced the performance problem with this example.

There is an 'Export PDF' button. When clicked, the code performs 3 main steps:

Initialize PDF settings
Generate 100,000 items of data
Call autoTable and generate the PDF
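
The three steps above can be sketched roughly as follows. This is not the code from the linked example (which is not reproduced in this thread): `makeRows` and the column names are made up for illustration, and the jsPDF/autoTable calls appear only as comments so the snippet runs without the libraries installed.

```javascript
// Hypothetical helper: builds `count` rows of dummy data (step 2).
function makeRows(count) {
  const rows = [];
  for (let i = 0; i < count; i++) {
    rows.push([i + 1, `Item ${i + 1}`, (i * 7) % 1000]);
  }
  return rows;
}

const head = [['ID', 'Name', 'Value']]; // assumed column headers
const body = makeRows(100000);
console.log(`generated ${body.length} rows`);

// Steps 1 and 3, assuming jspdf and jspdf-autotable are loaded:
//   const doc = new jsPDF();
//   autoTable(doc, { head, body });
//   doc.save('table.pdf');
```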

Memory can be monitored via the Task Manager while the generation is running:

While the generation is running, the memory used by the browser increases, and it does not drop back to its initial value when the generation is over.
Every time a new generation is started (by clicking the export button again), memory usage climbs to an even higher value.

There are some logs in the console to show the current step, but when the button is clicked the browser window freezes, and everything is shown in the console at once when the generation is over.
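
The freeze and the late console output would be consistent with all three steps running synchronously inside the click handler, since the browser cannot repaint until the handler returns. Assuming that is the cause, here is a minimal sketch of yielding to the event loop between steps. `pause` and `runSteps` are hypothetical names, and the autoTable call itself would still block the page while it runs.

```javascript
// Hypothetical helper: resolves on a later macrotask, giving the
// browser a chance to repaint and update the console.
const pause = () => new Promise((resolve) => setTimeout(resolve, 0));

// Runs each [label, fn] pair in order, yielding before every step.
async function runSteps(steps) {
  const done = [];
  for (const [label, step] of steps) {
    console.log(`starting: ${label}`);
    await pause();
    step();
    done.push(label);
  }
  return done;
}
```

The click handler would then pass its three steps as `[label, function]` pairs, so each "starting" message has a chance to appear before the corresponding work begins.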

I hope these initial steps give a good basis for investigating this problem further, as it is critical to solve when dealing with a big set of data.

ismailhunt commented 2 years ago

I encountered the same issue years ago when trying to generate 100,000+ items of data. I discovered that all the processing is done locally on the client-side computer, so it depends on the RAM available on the client's PC.

Recent versions of pdfjs have already optimized this, so check your code and try to optimize it. For example, you can try splitting the data in half, generating two outputs, and then combining them.
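
The splitting idea could look something like this. `chunk` is a hypothetical helper, not part of jsPDF-AutoTable, and nobody in this thread has verified that rendering in batches actually lowers peak memory.

```javascript
// Hypothetical helper: splits `rows` into consecutive parts of at most `size` rows.
function chunk(rows, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const parts = [];
  for (let i = 0; i < rows.length; i += size) {
    parts.push(rows.slice(i, i + size));
  }
  return parts;
}

// Small demo: 10 dummy rows split into two halves.
const rows = Array.from({ length: 10 }, (_, i) => [i]);
const halves = chunk(rows, Math.ceil(rows.length / 2));
console.log(halves.map((part) => part.length)); // [ 5, 5 ]
```

Each part could then be passed as `body` to its own autoTable call. Combining separately generated PDF files afterwards would need a separate tool.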

hendrickson-tyler commented 1 year ago

Our team is running into both sides of this issue as well: large memory usage, and memory that isn't relinquished after generating the PDF. Definitely looking forward to a fix. It would be great if the library handled this under the hood. Thank you to everyone contributing!

matt-in-brissy commented 1 year ago

Yep, having the same issues - heap size is good, but RSS grows to multiple gigabytes of memory used.
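
For anyone wanting to compare the two numbers when running under Node.js, `process.memoryUsage()` reports both. A small sketch: `snapshot` is a hypothetical helper, and the place where the PDF would be generated is only a comment.

```javascript
// Hypothetical helper: logs and returns current memory figures in MB.
function snapshot(label) {
  const { rss, heapUsed, external } = process.memoryUsage();
  const mb = (bytes) => Math.round(bytes / 1024 / 1024);
  const result = {
    label,
    rssMb: mb(rss),           // resident set size of the whole process
    heapUsedMb: mb(heapUsed), // memory used by objects on the V8 heap
    externalMb: mb(external), // memory of C++ objects bound to JS objects
  };
  console.log(result);
  return result;
}

const before = snapshot('before');
// ... generate the PDF here ...
const after = snapshot('after');
console.log(`RSS delta: ${after.rssMb - before.rssMb} MB`);
```

RSS covers the whole process, including memory outside the V8 heap, which may be why it can keep growing while the heap figure looks fine.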

simonbengtsson / jsPDF-AutoTable

Performance and handling a big set of data #840