Open OpenNingia opened 9 years ago
+1 A way to flatten a form would be excellent. I would like to avoid having another dependency for my code, which uses PyPDF2. But shipping filled in forms around the interwebz creates problems with a variety of vendors and their [I assume not based on PyPDF2] software.
It would be great if PyPDF2 had the ability to fill in forms and flatten them!
I also would really appreciate this
(In progress) We can accomplish this by setting Bit Position 1 of the field flags.
Ref: Table 8.70 of the PDF 1.7 spec
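For anyone following along, here is a minimal sketch of what "setting bit position 1" means in practice. The spec numbers bit positions starting from 1, so bit position 1 (ReadOnly) corresponds to the integer value 1. The helper names below are made up for illustration and are not part of PyPDF2.

```python
# Field flags (/Ff) are a plain integer. The PDF spec counts bit
# positions from 1, so bit position n has the value 1 << (n - 1).
READ_ONLY = 1 << 0  # bit position 1 in the common field flags


def set_read_only(flags: int) -> int:
    """Return the flags with the ReadOnly bit set, keeping all other bits."""
    return flags | READ_ONLY


def is_read_only(flags: int) -> bool:
    """Report whether the ReadOnly bit is set."""
    return bool(flags & READ_ONLY)
```

OR-ing rather than assigning keeps whatever other flags a field already carries.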
Setting a field read-only might be a way, however pdftk works differently; afaik it replaces each /Field instance with a simple text object. :confused:
You're right, that's the better option. Should be able to implement that soon
I agree. This would be totally awesome!
Is there any update on this? I am looking to use an editable pdf as a template which will be filled by code.
I'm with @jamoham on this... for the same exact use case.
+1
Any update on this?
Can you flatten a file with PyPDF2 yet? I've not found anything on this being implemented.
I do see some code to _flatten in the PdfFileReader, but not in the writer. Will someone be taking a swing at this?
I have exactly the same scenario as mentioned by @jamoham, @kherrett and @zhiwehu above. Has there been any progress on either being able to flatten a PDF, or set the fields as read-only?
Rough bit of code if anyone needs to set fields to read-only prior to an update to the module (assumes you imported the whole module as PyPDF2). Works in a similar fashion to the existing updatePageFormFieldValues() method.

    import PyPDF2


    class PDFModifier(PyPDF2.PdfFileWriter):
        '''Extends the PyPDF2.PdfFileWriter class and adds functionality missing
        from the PyPDF2 module.'''

        def updatePageFormFieldFlags(self, page, fields, or_existing=True):
            '''
            Update the form field flags for a given page from a fields dictionary.
            Copy field flag values from fields to page.
            :param page: Page reference from PDF writer where the annotations
                and field data will be updated.
            :param fields: a Python dictionary of field names (/T) and flag
                values (/Ff); the flag value should be an unsigned 32-bit integer
                (i.e. a number between 0 and 4294967295)
            :param or_existing: if there are existing flags, OR them with the
                new values (default True)
            '''
            # Iterate through the page's annotations and update the field flags
            for j in range(0, len(page['/Annots'])):
                writer_annot = page['/Annots'][j].getObject()
                for field in fields:
                    if writer_annot.get('/T') == field:
                        # Work on a local value so the caller's dict is not modified
                        flags = fields[field]
                        if or_existing:
                            current_flags = writer_annot.get('/Ff')
                            if current_flags is not None:
                                flags = current_flags | flags
                        writer_annot.update({
                            PyPDF2.generic.NameObject("/Ff"): PyPDF2.generic.NumberObject(flags)
                        })
+1 for flattening, such as in pdftk!
+1 for a method for flattening pdfs
@mstamy2 , @OpenNingia
One thing I noticed with the approach of flattening/making forms read-only by setting the field flag bit to 1: when I try to merge resulting PDFs, only the values from the first document make it to the merged file. I don't think this is expected behavior.
Cross-posting this useful recipe by @Redjumpman: https://github.com/mstamy2/PyPDF2/issues/506
Remember to update the form field name if you want to merge multiple documents made from the same template form. Else, the merged PDF result will have identical pages due to each document sharing the same field names.
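A rough sketch of the renaming idea, using plain dicts as stand-ins for field dictionaries so it runs without any PDF library. The function name and the suffix scheme are assumptions for illustration only; with a real document you would apply the same change to the /T entry of each field object before merging.

```python
def rename_fields(field_dicts, suffix):
    """Return copies of the field dictionaries with a suffix appended to /T."""
    renamed = []
    for field in field_dicts:
        copy = dict(field)  # leave the caller's dictionaries untouched
        if "/T" in copy:
            copy["/T"] = f"{copy['/T']}{suffix}"
        renamed.append(copy)
    return renamed


# Two documents filled from the same template share field names ...
doc_a = [{"/T": "name", "/V": "Alice"}]
doc_b = [{"/T": "name", "/V": "Bob"}]

# ... so give each document its own suffix before merging.
merged_fields = rename_fields(doc_a, "_1") + rename_fields(doc_b, "_2")
```

After this, each field name appears only once in the merged set, so one document's values no longer shadow the other's.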
PdfWriter.append() should provide you with the capability to add pages with data fields.
Can you confirm that this issue can get closed?
Without feedback, I am closing this issue as fixed. Feel free to provide updates if you want to reopen it.
I don't think the original issue is closed: how do you make fields non-editable easily? The use case being taking a PDF with editable forms, filling out the forms and outputting a PDF with non-editable fields.
The read-only flag is defined in the PDF 1.7 reference (page 676), so you have to set the field flags. Below is an example that sets all the fields to read-only:

    import pypdf
    from pypdf.generic import NameObject, NumberObject

    r = pypdf.PdfReader("input_form.pdf")
    for f, v in r.get_fields().items():
        o = v.indirect_reference.get_object()  # this will provide access to the actual PDF dictionary
        o[NameObject("/Ff")] = NumberObject(o.get("/Ff", 0) | 1)  # set bit position 1 (ReadOnly)
    w = pypdf.PdfWriter()
    w.clone_document_from_reader(r)
    w.write("output_form.pdf")
What you are suggesting is not "flattening", though. The output PDF will still contain data fields (widgets). Flattening as pdftk does it means replacing the data fields with text.
@OpenNingia Can you provide a non-flat PDF file and its flattened version for review?
Multiple pdf merged and flattened: Ichiro Yasuhigo.pdf
One of the editable source: sheet_all.pdf
The flattening process is quite tough to implement: you have to create an XObject with the right characteristics and modify the page content to place it. Personally, I see very limited advantage compared with the time needed to implement it, and for me the read-only alternative could be sufficient. I will have no time to propose a PR. Any candidates?
Since we now have #1864, flattening should be quite simple.
Can someone please provide a simple code snippet here for flattening a pdf?
I have subclassed the PdfWriter class to be able to flatten forms here, so it can be done.
Would you accept PR for this, and do you have any idea of the interface which would be best for implementation?
I think this would be the easiest option. There could be something more advanced, where you pass a list of fields to be flattened with all as the default, but I wouldn’t want to go too far on this. https://gist.github.com/matsavage/a50d9c541957f276088c341cc84a9e7f
@matsavage your code seems to have some good ideas. Your function should be integrated into PdfWriter. To make this easier, you should fork pypdf and build a branch with your modifications: this will ease merging.
What you should try is to convert the global ["/AP"]["/N"] into an XForm (that way you will not have to worry about merging the resources, drawing, and so on into the page), then just add a cm operation to the main page content to translate to the proper rectangle and call the new XForm with the Do operator. This should work with all types of widgets.
I only did things this way to see if the flattening could be done, and to save the effort of setting up the development environment on my machine. This is more a template than a PR.
Thanks for the advice, I’ll try and have a look at this some time
Looking forward 😊
Looking forward to this feature!
Honestly I haven’t been able to look at this since May, feel free to have your own attempt at implementing it if it’s something you need.
At your marks.... get set ... go! 😉😄😄😄
I think it’s the one everyone wants, but no one wants to do
Darn, I wouldn't even know where to start 🥴
pdftk provides the feature to embed the form fields' text in the pdf itself. This is very useful if you want to use an editable pdf as a template to be filled by code.
from the pdftk manual:
usage example: