py-pdf / pypdf

A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
https://pypdf.readthedocs.io/en/latest/

merge_page under doesn't work #2649

Closed theottm closed 5 months ago

theottm commented 5 months ago

Hi!

I'm trying to stack multiple pages vertically, one below the other, on a single page. I use the over=False option, but only the first page appears.

Environment

$ python -m platform
Linux-6.2.6-76060206-generic-x86_64-with-glibc2.35

$ python -c "import pypdf;print(pypdf._debug_versions)"
pypdf==4.2.0, crypt_provider=('cryptography', '41.0.4'), PIL=9.2.0

Code + PDF

This is a minimal, complete example that shows the issue:

import pypdf
import sys

input_filename = sys.argv[-2]
input_pdf = pypdf.PdfReader(input_filename)

output_filename = sys.argv[-1]
output_pdf = pypdf.PdfWriter()

output_pdf.add_page(input_pdf.pages[0])
monopage = output_pdf.pages[0]

for page_number in range(1, len(input_pdf.pages)):
    print(page_number)
    monopage.merge_page(input_pdf.pages[page_number], over=False)

output_pdf.write(output_filename)

Then run:

python3 main.py in.pdf out.pdf

in.pdf

pubpub-zz commented 5 months ago

merge_page overlays pages at the same position (over only controls the Z order). You first need to rescale and move the pages to put them next to each other on a blank page.
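To make the "rescale and move" step concrete, here is a minimal sketch of the arithmetic only. It assumes every page's media box starts at the origin and that one uniform scale is acceptable; `layout_on_sheet` is a hypothetical helper, not part of pypdf.

```python
def layout_on_sheet(page_sizes, sheet_width, sheet_height):
    """Return (scale, [(tx, ty), ...]) for stacking pages top to bottom on one sheet.

    page_sizes is a list of (width, height) tuples in PDF points.
    """
    total_height = sum(height for _, height in page_sizes)
    max_width = max(width for width, _ in page_sizes)
    # One uniform scale so the whole stack fits inside the sheet.
    scale = min(sheet_width / max_width, sheet_height / total_height)
    placements = []
    y = sheet_height  # PDF origin is bottom-left, so start at the top edge
    for _, height in page_sizes:
        y -= height * scale  # lower edge of this page after scaling
        placements.append((0.0, y))
    return scale, placements


# Two A4-sized pages (595 x 842 points) squeezed onto one A4-sized sheet.
scale, placements = layout_on_sheet([(595, 842), (595, 842)], 595, 842)
print(scale, placements)
```

Each resulting scale and (tx, ty) pair could then be turned into a transformation, for example with pypdf's Transformation().scale(...).translate(...) and merge_transformed_page, if those fit your pypdf version.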

pubpub-zz commented 5 months ago

At first sight, you should find some ideas here: https://github.com/py-pdf/pypdf/discussions/1687#discussioncomment-5218716

theottm commented 5 months ago

Great, thank you for the hint!

So this solves it:

import pypdf
import sys

input_filename = sys.argv[-2]
input_pdf = pypdf.PdfReader(input_filename)

output_filename = sys.argv[-1]
output_pdf = pypdf.PdfWriter()

output_pdf.add_page(input_pdf.pages[0])
monopage = output_pdf.pages[0]
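# Distance to shift the next page down, starting with the height of the first page.
insertion_point = monopage.mediabox.height

for page_number in range(1, len(input_pdf.pages)):
    print(page_number)
    page = input_pdf.pages[page_number]
    # Move the page down by the running offset; expand=True grows the
    # media box of monopage so the shifted page stays visible.
    monopage.merge_translated_page(page, 0, -insertion_point, expand=True)
    insertion_point += page.mediabox.height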

output_pdf.write(output_filename)
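
A note on the offsets, as a sketch rather than a tested fix: the loop above shifts each page by a running total that starts from the first page's height, which lines up when all pages are equally tall. With mixed heights, each appended page would instead need to move down by the sum of the heights of the appended pages up to and including itself. The helper below (`stack_offsets` is a hypothetical name, and it assumes every media box starts at y = 0) computes those shifts:

```python
def stack_offsets(appended_heights):
    """Return the vertical shift (ty) for each page appended below the first one.

    The first page is assumed to span y = 0 .. its own height, so the stack
    grows downward from y = 0.
    """
    offsets = []
    bottom = 0.0  # lower edge of the stack built so far
    for height in appended_heights:
        bottom -= height  # this page's lower edge sits one page height further down
        offsets.append(bottom)
    return offsets


# Three appended pages of different heights, in PDF points.
offsets = stack_offsets([842, 421, 842])
print(offsets)
```

Each value would be passed as the ty argument of merge_translated_page(page, 0, ty, expand=True) for the corresponding page.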