pmaupin / pdfrw

pdfrw is a pure Python library that reads and writes PDFs

read, then immediately write, does not preserve info #181

Open RWT3330 opened 5 years ago

RWT3330 commented 5 years ago

I have a PDF file with a fillable form. I fill out the form and save the file. I can open the file in any number of PDF viewers/editors and the values I entered are there. Yet when I simply read the filled-out PDF with pdfrw.PdfReader and immediately write it back out with pdfrw.PdfWriter, all the fields I filled out are empty in the output file. Why would this happen?

techNoSavvy-debug commented 4 years ago

Were you able to figure out a solution?

walkadog commented 4 years ago

No.

andrewmr commented 2 years ago

I am experiencing this problem right now. I fill out forms across multiple PDFs, then later read them back in with PdfReader() to assemble them into one giant PDF, and the forms have all been blanked out.

andrewmr commented 2 years ago

I think this might be because writing only the .pages doesn't carry over the Root of the source document, so entries like AcroForm are lost. I'm not sure how one would go about copying those elements over.

andrewmr commented 2 years ago

I found this in another comment (thanks, @starlabs007): https://stackoverflow.com/questions/57008782/pypdf2-pdffilemerger-loosing-pdf-module-in-merged-file

This works for me.
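
For reference, here is a minimal sketch of the same idea in pdfrw itself. It assumes the fields go blank because PdfWriter builds a fresh Root from the pages it is given and so drops /AcroForm. The helper names (copy_acroform, rewrite_keeping_form, rewrite_whole_document) are made up for illustration. PdfReader, PdfWriter, addpages, the trailer attribute and write are pdfrw's own. It has not been checked against the files from this issue.

```python
def copy_acroform(src_root, dst_root):
    """Point dst_root at src_root's /AcroForm; return True if one was copied."""
    form = src_root.AcroForm  # pdfrw returns None for a missing key
    if form is None:
        return False
    dst_root.AcroForm = form
    return True


def rewrite_keeping_form(src_path, dst_path):
    """Re-add the pages, then graft the form dictionary onto the new Root."""
    # Imported here so copy_acroform can be used without pdfrw installed.
    from pdfrw import PdfReader, PdfWriter

    reader = PdfReader(src_path)
    writer = PdfWriter()
    writer.addpages(reader.pages)
    # Accessing writer.trailer builds the new catalog from the pages added so far.
    copy_acroform(reader.Root, writer.trailer.Root)
    writer.write(dst_path)


def rewrite_whole_document(src_path, dst_path):
    """Write the reader back as the trailer so the original Root is kept as-is."""
    from pdfrw import PdfReader, PdfWriter

    PdfWriter(dst_path, trailer=PdfReader(src_path)).write()
```

For a plain read-then-write of one file, rewrite_whole_document is probably the simpler route. When assembling several filled PDFs into one, copying a single /AcroForm is likely not enough: each source has its own Fields array, so those would have to be combined in the merged form, with field names kept unique.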