mfenniak / pyPdf

Pure-Python PDF Library; this repository is no longer maintained, please see https://github.com/knowah/PyPDF2/ instead.

internal links not preserved when encrypted #4

Open iffy opened 14 years ago

iffy commented 14 years ago

Steps to reproduce:

  1. Obtain a PDF with internal links (you click the link and it takes you to another page in the PDF)
  2. Encrypt the PDF with the function below
  3. Open the PDF and see that the links no longer work.

from pyPdf import PdfFileReader, PdfFileWriter

def encrypt(in_stream, out_stream, user_password, owner_password=None):
    """ Encrypt an existing PDF file (stream)

    `in_stream`         stream with pdf data
                        open(filename, 'rb')
    `out_stream`        stream where output will be written
                        open(filename, 'wb')
    `user_password`     the password used for limited access
    `owner_password`    the password used for full access (defaults to user_password)

    I copied this from /sm/script/encryptPdf.py
    """
    reader = PdfFileReader(in_stream)
    writer = PdfFileWriter()
    for i in range(reader.getNumPages()):
        writer.addPage(reader.getPage(i))
    writer.encrypt(user_password, owner_password)
    writer.write(out_stream)

cyberixae commented 11 years ago

The bug seems to be present even when you don't encrypt the PDF. I have used pyPdf for watermarking, as in the example on the front page, and that breaks internal links too. I also tried simply iterating over the pages of the input, adding them to the output, and writing the file out; that breaks internal links as well. My best guess is that pyPdf breaks internal links under all circumstances. Has anyone been able to figure out the root cause of this?
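
For reference, the simpler reproduction described in this comment (copy pages, no encryption) might look roughly like the sketch below. It reuses only the pyPdf calls from the original report; the helper names `copy_pages` and `copy_pdf` are made up for illustration and are not part of pyPdf.

```python
def copy_pages(reader, writer, out_stream):
    """Copy every page from `reader` to `writer`, then write the result.

    `reader` and `writer` are expected to behave like pyPdf's
    PdfFileReader and PdfFileWriter. Returns the number of pages copied.
    """
    count = reader.getNumPages()
    for i in range(count):
        writer.addPage(reader.getPage(i))
    writer.write(out_stream)
    return count


def copy_pdf(in_path, out_path):
    """Hypothetical convenience wrapper: copy a PDF file page by page."""
    # Imported here so copy_pages can be tried without pyPdf installed.
    from pyPdf import PdfFileReader, PdfFileWriter

    # The input must stay open until write() finishes, because the
    # reader fetches page data from the stream on demand.
    with open(in_path, "rb") as src, open(out_path, "wb") as dst:
        return copy_pages(PdfFileReader(src), PdfFileWriter(), dst)
```

Running `copy_pdf` on a PDF with internal links and then clicking a link in the output should show whether the problem occurs without the `writer.encrypt(...)` call.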