boazsegev / combine_pdf

A Pure ruby library to merge PDF files, number pages and maybe more...
MIT License

CombinePDF#pages - individual page sizes same as entire original file #172

Closed leviwilson closed 4 years ago

leviwilson commented 4 years ago

We've been using CombinePDF to "split" pages (for some post-processing) like so:

CombinePDF.load(file, allow_optional_content: true).pages.each_with_index do |page, page_number|
  path = "page_#{page_number.to_s.rjust(4, '0')}.pdf"
  (CombinePDF.new << page).save(path)
  Page.create file: File.open(path)
end

For some documents (one example is 32MB), each page we split out with #pages ends up the same size as the original document. That wouldn't be a big deal, except this example has 865 pages (so 865 × 32MB is a lot).
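To put a number on "a lot", here is a minimal sketch of the arithmetic. The helper name `estimated_split_total_mb` is hypothetical and not part of combine_pdf; it assumes every split page really is as large as the source file, as described above.

```ruby
# Hypothetical helper (not a combine_pdf API): estimate the total size of the
# split output when every split page is as large as the source file.
def estimated_split_total_mb(page_count, file_size_mb)
  page_count * file_size_mb
end

total_mb = estimated_split_total_mb(865, 32)
puts "#{total_mb} MB (~#{(total_mb / 1000.0).round(1)} GB)"
# prints "27680 MB (~27.7 GB)"
```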

If I had a non-PII example to add to this issue I would, but unfortunately I don't at this time.

This is more of a question than an issue (I think), but I thought I'd ask what is special about this document that makes this happen.

leviwilson commented 4 years ago

Closing this issue. Splitting with poppler utils gives the same result; I'm guessing each page references elements shared across the document, so every split file ends up carrying all of them.

This isn't an issue with combine_pdf, so closing.
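The guess above can be illustrated with a toy model. This is only a sketch of the hypothesis, not how combine_pdf or poppler actually serialize pages: `ToyPage`, `split_size_mb`, the resource names, and all sizes are made up for illustration. It assumes a split file must carry the page's own content plus every shared element that page references.

```ruby
# Toy model only: names and sizes (in MB) are invented for illustration.
SHARED_RESOURCES = { resource_a: 2.0, resource_b: 29.5 }.freeze

# A page has a little content of its own and a list of shared resources it references.
ToyPage = Struct.new(:content_mb, :resource_refs)

# Under this assumption, a split file holds the page content plus every
# resource the page references, whether or not other pages share it.
def split_size_mb(page, resources)
  page.content_mb + page.resource_refs.sum { |name| resources.fetch(name) }
end

# Three pages that all reference every shared resource.
pages = Array.new(3) { ToyPage.new(0.5, SHARED_RESOURCES.keys) }

# In the original file the shared resources are stored once.
original_mb = pages.sum(&:content_mb) + SHARED_RESOURCES.values.sum
split_sizes = pages.map { |page| split_size_mb(page, SHARED_RESOURCES) }

puts "original: #{original_mb} MB"
puts "split pages: #{split_sizes.inspect}"
```

In this model the original stores the shared elements once (33.0 MB in total), while each split page repeats them (32.0 MB each), which matches the symptom of every page being about as large as the whole document.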