momijizukamori / bookbinder-js

A JS application to format PDFs for bookbinding.
Mozilla Public License 2.0
99 stars 26 forks

Bug: Handle large PDFs better #96

Closed (acestronautical closed 2 months ago)

acestronautical commented 4 months ago

PDFs larger than 100MB seem to cause out-of-memory issues on some/most machines. This isn't clear to the user, and it can make the page seem like it has hung or is just not responsive.

We can tackle this from two angles: 1. surfacing memory errors better 2. reducing performance bottlenecks

One option for improving performance here might be to force large PDFs to generate signature files only, or to have the aggregate PDF be generated as a merge of the signature files. There are probably other improvements we can make around saveAsBase64 and embedPagesInNewPdf.

sithel commented 4 months ago

have we verified that "signature files only" actually helps? My memory of the last time I tried to address memory bloat is that it still holds everything in memory/I wouldn't expect that to help...?

sithel commented 4 months ago

My vote is to prioritize 1. surfacing memory errors better first -- we currently have nowhere/no way to surface errors to the user (would prefer a non Alert dialog) -- this would help when communicating w/ users as we fight the battle
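To make the "surface errors without an Alert dialog" idea concrete, here is a minimal sketch. Everything in it is hypothetical: `describeGenerationError`, `runWithErrorReport`, and the message patterns are assumptions for illustration, not existing bookbinder-js code. The patterns are guesses at what engines report when an allocation fails (for example a `RangeError` for an oversized string or buffer); a tab that is killed outright by the browser can't be caught this way.

```javascript
// Hypothetical helper, not part of bookbinder-js.
// Message patterns are assumptions about how engines word allocation failures.
const MEMORY_ERROR_PATTERNS = [
  /out of memory/i,
  /allocation failed/i,
  /invalid (string|array buffer) length/i,
  /allocation size overflow/i,
];

// Turn a thrown value into something the page can render in its own UI.
function describeGenerationError(err) {
  const detail = err && err.message ? String(err.message) : String(err);
  const looksLikeMemory = MEMORY_ERROR_PATTERNS.some((p) => p.test(detail));
  return {
    kind: looksLikeMemory ? "memory" : "unknown",
    userMessage: looksLikeMemory
      ? "This PDF may be too large to process in the browser. Try a smaller file or a subset of pages."
      : "Something went wrong while generating the PDF.",
    detail,
  };
}

// Run an async task and hand any failure to a caller-supplied display callback
// (e.g. one that fills an inline error banner) instead of calling alert().
async function runWithErrorReport(task, showError) {
  try {
    return await task();
  } catch (err) {
    showError(describeGenerationError(err));
    return undefined;
  }
}
```

The display callback is left to the caller on purpose, so the same wrapper could feed a banner, a log panel, or anything else we settle on.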

sithel commented 4 months ago

A shorter-term/easier workaround IMO is to encourage users to slice their PDFs up into smaller processable sections. Having a "page range" option that the tool uses to process a subset of the PDF at a time would help with this if they don't have secondary PDF software of their own
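As a rough illustration of the "page range" idea, the sketch below splits a page count into consecutive 1-based ranges that could each be processed on its own. `splitIntoPageRanges` and its parameters are hypothetical names, and how a range would actually be wired into the tool is left open.

```javascript
// Hypothetical helper, not part of bookbinder-js.
// Splits pages 1..pageCount into consecutive inclusive ranges of at most
// pagesPerChunk pages each.
function splitIntoPageRanges(pageCount, pagesPerChunk) {
  if (!Number.isInteger(pageCount) || pageCount < 0) {
    throw new RangeError("pageCount must be a non-negative integer");
  }
  if (!Number.isInteger(pagesPerChunk) || pagesPerChunk < 1) {
    throw new RangeError("pagesPerChunk must be a positive integer");
  }
  const ranges = [];
  for (let first = 1; first <= pageCount; first += pagesPerChunk) {
    // The last chunk may be shorter than pagesPerChunk.
    ranges.push({ first, last: Math.min(first + pagesPerChunk - 1, pageCount) });
  }
  return ranges;
}

console.log(splitIntoPageRanges(10, 4));
// [ { first: 1, last: 4 }, { first: 5, last: 8 }, { first: 9, last: 10 } ]
```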

acestronautical commented 4 months ago

Adding the upstream issues for tracking purposes: https://github.com/Hopding/pdf-lib/issues/197 https://github.com/Hopding/pdf-lib/issues/990 https://github.com/Hopding/pdf-lib/issues/121

acestronautical commented 4 months ago

In one test I did, I was able to generate the signature files but not the aggregate. I think the issue is total memory footprint: generating only the signature files skips the preview and produces fewer files in total, so I assume it has a smaller total memory footprint.

sithel commented 4 months ago

ok, fair... could we make the Aggregate file by gluing together the signatures at the end w/o ballooning the memory footprint? Aggregate is an important feature -- especially w/ a larger doc, as printing each signature is a PITA -- so I don't want to overly rely on that...

but is good to know!
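For reference, a minimal sketch of what "gluing together the signatures" could look like with pdf-lib's page-copy API. `mergeSignatures` is a hypothetical name, and it takes the `PDFDocument` class as an argument only so the sketch stays self-contained. It assumes each signature has already been saved as its own byte array. Whether this lowers peak memory is untested: each signature document is only referenced inside one loop iteration, but the aggregate still ends up holding every page.

```javascript
// Hypothetical helper, not part of bookbinder-js.
// PDFDocument is expected to be pdf-lib's PDFDocument class, passed in by the caller.
// signatureBytesList is an array of Uint8Array, one saved PDF per signature.
async function mergeSignatures(PDFDocument, signatureBytesList) {
  const aggregate = await PDFDocument.create();
  for (const bytes of signatureBytesList) {
    // Load one signature at a time so earlier ones are no longer referenced.
    const signature = await PDFDocument.load(bytes);
    const pages = await aggregate.copyPages(signature, signature.getPageIndices());
    pages.forEach((page) => aggregate.addPage(page));
  }
  // Resolves to the merged PDF as a Uint8Array.
  return aggregate.save();
}
```

With pdf-lib installed, the call would be something like `mergeSignatures(PDFDocument, signatureBytesList)` after `import { PDFDocument } from "pdf-lib"`.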

acestronautical commented 2 months ago

Closing since we made improvements, and I don't anticipate we will be able to improve this much more.