chinapandaman / PyPDFForm

:fire: The Python library for PDF forms.
https://chinapandaman.github.io/PyPDFForm/
MIT License
438 stars, 19 forks

PPF: Repeated field names in PDF form #733

Open jrhorne opened 1 month ago

jrhorne commented 1 month ago

Version

PyPDFForm=4.3.1

Issue Description

Hello, I'm trying to fill the attached PDF. I've included the input PDF and the output produced by the preview stream. I want to fill in the fields in the table, but I'm having difficulty because they all appear to have the same name.

Is there a way to fill this PDF with values in the table coming from a Pandas data frame (I can loop and map the attributes, but not sure how to do the filling)?

Thanks!

Code Snippet

from PyPDFForm import PdfWrapper

preview_stream = PdfWrapper("Cost_Basis_Update_TEST.pdf").preview

with open("output.pdf", "wb+") as output:
    output.write(preview_stream)

PDF Form Template

Cost_Basis_Update_TEST.pdf output.pdf

Screenshots (if applicable)

[Screenshot attached in the original issue]

chinapandaman commented 1 month ago

Hey, thanks for posting.

If you want to fill distinct data in each row, you will have to rename these fields using a PDF form editor (e.g., Adobe Acrobat) so that each one has a unique name. Once that's done, you can do something like this:

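from PyPDFForm import PdfWrapper

# your_data: an iterable of description strings, one per table row
data = {}
for i, description in enumerate(your_data):
    data[f"Description[{i}]"] = description

filled = PdfWrapper("Cost_Basis_Update_TEST.pdf").fill(data)

with open("output.pdf", "wb+") as output:
    output.write(filled.read())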
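
For the Pandas part of the question, here is a minimal sketch of how the fill dictionary could be built once the fields have distinct names. It assumes the renamed fields follow the `Name[i]` pattern used above and that the table fields are text fields, so values are converted to strings. The helper `build_fill_data` and the column names in the example are hypothetical and not part of PyPDFForm. With a real data frame, the rows could come from `df.to_dict("records")`.

```python
def build_fill_data(rows, column_to_field):
    """Map row records onto indexed PDF field names.

    rows: list of dicts, one per table row (e.g. df.to_dict("records")).
    column_to_field: maps a data column name to a PDF field base name.
    Both names are illustrative assumptions, not PyPDFForm API.
    """
    data = {}
    for i, row in enumerate(rows):
        for column, field in column_to_field.items():
            # Assumes text fields, so every value is stored as a string.
            data[f"{field}[{i}]"] = str(row[column])
    return data


# Hypothetical column names; the field base names come from this thread.
rows = [
    {"desc": "ACME", "qty": 10},
    {"desc": "XYZ", "qty": 5},
]
fill_data = build_fill_data(rows, {"desc": "Description", "qty": "quantity"})
```

The resulting dictionary can then be passed to `fill` in the same way as `data` above.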

chinapandaman commented 1 month ago

Also, in v1.4.36, which was just released, I added the ability to modify the key of a widget. So you could do something like this:

from PyPDFForm import PdfWrapper

pdf = PdfWrapper("Cost_Basis_Update_TEST.pdf")

# The third argument picks which of the widgets sharing the old key to rename.
for i in range(1, 10):
    pdf.update_widget_key("Description[0]", f"Description[{i}]", 1)
    pdf.update_widget_key("symbol[0]", f"symbol[{i}]", 1)
    pdf.update_widget_key("tradedate[0]", f"tradedate[{i}]", 1)
    pdf.update_widget_key("settlementdate[0]", f"settlementdate[{i}]", 1)
    pdf.update_widget_key("quantity[0]", f"quantity[{i}]", 1)
    pdf.update_widget_key("costperunit[0]", f"costperunit[{i}]", 1)
    pdf.update_widget_key("costabasis[0]", f"costabasis[{i}]", 1)

This would update the keys of each row so that you could fill distinct data.

Note that the code above is currently slow. I'm about to go on vacation, so I'll have to get back to it later. In the meantime, I suggest running it once and saving the resulting PDF as a new template.
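
As a rough sketch of that one-time step, the renames could be applied and the result written to a new file. The helper names (`rename_plan`, `apply_plan`, `build_template`) and the output file name are hypothetical. The `update_widget_key` call mirrors the three-argument form shown above, and saving relies on `PdfWrapper.read()` returning the PDF bytes.

```python
from pathlib import Path

# Field base names taken from the snippet in this thread.
FIELD_NAMES = [
    "Description", "symbol", "tradedate", "settlementdate",
    "quantity", "costperunit", "costabasis",
]


def rename_plan(field_names, n_rows):
    """List the (old_key, new_key, index) renames for rows 1..n_rows-1."""
    return [
        (f"{name}[0]", f"{name}[{i}]", 1)
        for i in range(1, n_rows)
        for name in field_names
    ]


def apply_plan(pdf, plan):
    """Apply each rename to an object exposing update_widget_key."""
    for old_key, new_key, index in plan:
        pdf.update_widget_key(old_key, new_key, index)
    return pdf


def build_template(source, target, n_rows=10):
    """Rename the repeated fields once and save the result as a new template."""
    from PyPDFForm import PdfWrapper  # imported here so the helpers above stay importable

    pdf = apply_plan(PdfWrapper(source), rename_plan(FIELD_NAMES, n_rows))
    Path(target).write_bytes(pdf.read())
```

Calling `build_template("Cost_Basis_Update_TEST.pdf", "renamed_template.pdf")` once would produce a template that later runs can fill directly, without paying the renaming cost again.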

jrhorne commented 3 weeks ago

Great, I appreciate the help. I'll try this out.