metaist / pdfmerge

Command-line utility for merging, splicing, and rotating PDF documents.
http://metaist.github.io/pdfmerge/

Merging generates some blank pages instead of source file #32

Closed dcanl closed 1 month ago

dcanl commented 1 month ago

Hello,

The pdfmerge() function replaces some files with a blank page in the destination file for no apparent reason. The behavior looks random: the same input files can produce one or many blank pages in the output file, or sometimes none.

I am currently testing with 4005 input files, 4296 pages in total. All input files are valid, and the list contains every file path. The same problem appears with fewer files and pages.

Here is my code:

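import os
from pdfmerge import pdfmerge

pdfList = []  # pathIn, pathOut and fileOut are defined elsewhere
# Walk pathIn recursively and collect every .pdf file
for root, directory, files in os.walk(pathIn):
    for file in files:
        if file.endswith('.pdf'):
            pdfList.append(os.path.join(root, file))

pdfmerge(pdfList, pathOut+"\\"+fileOut)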

Did you ever encounter this problem? Is there a solution?
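
One way to narrow this down is to map each blank page in the output back to the input file that should have produced it. The following is a minimal sketch under stated assumptions, not part of pdfmerge: `source_of_page` and `count_pages` are hypothetical helper names, the per-file page counts are assumed to be listed in merge order, and `count_pages` assumes pypdf is installed.

```python
from bisect import bisect_right
from itertools import accumulate


def source_of_page(page_counts, page_index):
    """Return the index of the input file that should own an output page.

    page_counts[i] is the page count of the i-th input file, in merge order.
    page_index is zero-based within the merged output.
    """
    ends = list(accumulate(page_counts))  # cumulative end offset of each file
    total = ends[-1] if ends else 0
    if not 0 <= page_index < total:
        raise IndexError("page index outside merged document")
    # The first file whose end offset exceeds page_index owns that page.
    return bisect_right(ends, page_index)


def count_pages(path):
    """Count pages in one PDF (hypothetical helper; assumes pypdf is installed)."""
    from pypdf import PdfReader  # imported lazily so the mapping works without pypdf

    return len(PdfReader(path).pages)
```

With `page_counts = [count_pages(p) for p in pdfList]`, calling `pdfList[source_of_page(page_counts, n)]` gives the file expected at zero-based output page `n`, which makes it possible to check whether the same source files are affected on every run.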

metaist commented 1 month ago

Huh. Never seen anything like this. I'm assuming pathOut and fileOut are where you want to write the output PDF.

Just as a sanity check, does this code produce the same or different results:

from pathlib import Path  # pathlib requires Python >= 3.4

pdfList = [str(p) for p in sorted(Path(pathIn).glob("*.pdf"))]
pdfmerge(pdfList, str(Path(pathOut) / fileOut))

dcanl commented 1 month ago

Thank you for your reply.

Result is the same...

metaist commented 1 month ago

Very weird. pdfmerge just wraps pypdf. Is it possible to see some more detail about how these PDFs are being generated?

dcanl commented 1 month ago

Sadly no, the files are generated by an AS/400, and, um... AS/400!! 🫠

Surprisingly, using pypdf directly doesn't seem to produce any blank pages... I will use that for now.
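
For reference, a direct pypdf merge can be sketched roughly as follows. This is a minimal sketch rather than the exact code used here: `collect_pdfs` and `merge_pdfs` are hypothetical helper names, and it assumes a recent pypdf release in which `PdfWriter.append` is available. Sorting the paths keeps the merge order identical between runs, which removes one source of variation when comparing outputs.

```python
import os


def collect_pdfs(root):
    """Recursively collect .pdf paths under root, in a stable sorted order."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".pdf"):
                paths.append(os.path.join(dirpath, name))
    return sorted(paths)


def merge_pdfs(paths, out_path):
    """Append every input PDF to one writer and save the result.

    Assumes pypdf is installed (pip install pypdf).
    """
    from pypdf import PdfWriter  # imported lazily; only needed for merging

    writer = PdfWriter()
    for path in paths:
        writer.append(path)  # appends all pages of this file
    writer.write(out_path)
    writer.close()
```

Usage would look like `merge_pdfs(collect_pdfs(pathIn), os.path.join(pathOut, fileOut))`, with `pathIn`, `pathOut` and `fileOut` as in the original snippet.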

metaist commented 1 month ago

Understood. Sorry we couldn't figure out where the blank pages were coming from.

Closing this issue for now. Will re-open if anyone else experiences the same thing.