boazsegev / combine_pdf

A Pure ruby library to merge PDF files, number pages and maybe more...
MIT License

how to copy pages ? (by value not reference) #159

Closed MathieuDerelle closed 5 years ago

MathieuDerelle commented 5 years ago

I was copying pages from one PDF to two others, rotating pages on the fly, and ran into this unexpected behavior:

page = pdf1.pages[idx]

pdf2 << page
pdf3 << page

case rotation
when '0'
  # nothing to do
when '90'
  pdf2.pages.last.rotate_right
  pdf3.pages.last.rotate_right
when '180'
  pdf2.pages.last.rotate_180
  pdf3.pages.last.rotate_180
when '-90'
  pdf2.pages.last.rotate_left
  pdf3.pages.last.rotate_left
end

The rotation was applied twice, which means that the last page of pdf2 and the last page of pdf3 were still the same object, linked by reference.
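The double rotation can be reproduced without any PDF library. The following is a minimal sketch in plain Ruby, assuming (as the behavior described here suggests) that `pdf << page` stores the page object itself rather than a copy. The Hash and the two Arrays are hypothetical stand-ins for a page and the two output PDFs, not combine_pdf objects.

```ruby
# Plain-Ruby stand-ins: a Hash plays the page, Arrays play the two output PDFs.
page = { rotate: 0 }
doc_a = []
doc_b = []

# << stores a reference to the same object in both arrays; nothing is copied.
doc_a << page
doc_b << page

# Both "last pages" are the very same object...
same_object = doc_a.last.equal?(doc_b.last)

# ...so rotating "each" of them by 90 degrees rotates the one shared page twice.
doc_a.last[:rotate] += 90
doc_b.last[:rotate] += 90

puts same_object    # prints true
puts page[:rotate]  # prints 180
```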

But the code of inject_page seems to indicate that objects should not be passed by reference.

I tried

pdf2 << page.dup
pdf3 << page.dup

but the rotation calls in the code above then raised:

*NoMethodError*: undefined method `rotate_right' for #<Hash:0x007f0989badfa8>
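The error shows `rotate_right` being sent to a plain Hash. That is consistent with the page methods being attached to individual Hash objects via `extend`, which is an assumption about combine_pdf's internals, not something confirmed in this thread. In Ruby, `dup` does not carry over modules added with `extend`, while `clone` does. The sketch below illustrates this with a hypothetical `Rotatable` module. It also shows that `clone` is shallow, so nested values stay shared. Whether `clone` is safe on real combine_pdf pages is untested here.

```ruby
# Hypothetical stand-in for a module that adds page behavior to a Hash.
module Rotatable
  def rotate_right
    self[:rotate] = (fetch(:rotate, 0) + 90) % 360
    self
  end
end

page = { rotate: 0, resources: { fonts: ['F1'] } }
page.extend(Rotatable)

dup_copy = page.dup      # drops modules added with extend
clone_copy = page.clone  # keeps modules added with extend

dup_has_method = dup_copy.respond_to?(:rotate_right)
clone_has_method = clone_copy.respond_to?(:rotate_right)

# Top-level keys of the clone are independent of the original...
clone_copy.rotate_right

# ...but nested objects are still shared, because clone is a shallow copy.
nested_shared = clone_copy[:resources].equal?(page[:resources])

puts dup_has_method       # prints false
puts clone_has_method     # prints true
puts page[:rotate]        # prints 0
puts clone_copy[:rotate]  # prints 90
puts nested_shared        # prints true
```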

I ended up going with:

page = pdf1.pages[idx]

case rotation
when '0'
  # nothing to do
when '90'
  page.rotate_right
when '180'
  page.rotate_180
when '-90'
  page.rotate_left
end

pdf2 << page
pdf3 << page

But how can I apply different rotations to the two output PDFs?

boazsegev commented 5 years ago

Hi @MathieuDerelle ,

Thank you for opening this issue.

AFAIK, copy by reference is the default mode in Ruby, so I don't really see this as surprising behavior. However, I agree that the code in inject_page and the copy method might be seen as misleading (copy is the same as dup, but it makes sure that the page resources aren't PDF bound).

If you do need to copy a page by value, consider creating a new page and pasting the content of the existing page as an overlay on the new page.

I didn't try this, so I'm not sure it would solve your issue, but look at:

org_page = pdf1.pages[idx] 
target_page = pdf2.new_page(org_page.mediabox)
target_page << org_page

Note that this approach might (and probably will) consume more memory, as some resources need to be copied.

Let me know how it goes.

Kindly, Bo.