chinapandaman / PyPDFForm

:fire: The Python library for PDF forms.
https://chinapandaman.github.io/PyPDFForm/
MIT License

If PDF has same field name in two pages, the second page field is no longer accessible #705

Open AaronWGoh opened 4 months ago

AaronWGoh commented 4 months ago

Version

PyPDFForm=1.4.31

Issue Description

When the same field name appears on more than one page, the field key only refers to the first occurrence: setting a value changes the field on the first page, and the field on the second page can no longer be addressed. The behavior does not change much across the different versions.

Code Snippet

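from PyPDFForm import PdfWrapper

wrapper = PdfWrapper(input_pdf_path)  # input_pdf_path: path to the attached form
fields = wrapper.sample_data.keys()  # the duplicated field name shows up only once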

Note that the fully qualified names are F[0].#subform[9].RadioButtonList2[0] and F[0].Page_10[0].RadioButtonList2[0], but because the full path is not considered, the key refers to the earlier field and excludes the second one.
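To make the collision concrete, here is a minimal sketch in plain Python. It does not use PyPDFForm and is not how the library is implemented; `short_name` is a hypothetical helper that keeps only the last path segment, which is one way two distinct fields can end up under a single key.

```python
# Illustration only: plain strings stand in for the two fields in the PDF.
full_names = [
    "F[0].#subform[9].RadioButtonList2[0]",
    "F[0].Page_10[0].RadioButtonList2[0]",
]


def short_name(full_name: str) -> str:
    """Hypothetical helper: keep only the last segment of the path."""
    return full_name.rsplit(".", 1)[-1]


# Keyed by the short name, both fields land under the same key.
by_short = {}
for name in full_names:
    by_short.setdefault(short_name(name), []).append(name)

# Keyed by the full path, each field keeps its own entry.
by_full = {name: name for name in full_names}
```

With the short name as the key, a lookup can only ever reach one entry for both pages, which matches the behavior described above.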

PDF Form Template

file.pdf

Screenshots (if applicable)

image

chinapandaman commented 4 months ago

The full path name you are referring to is based on the /NM attribute of the widget, which is not what the library uses for widget keys. Instead, the /T attribute of the parent (the radio button group) is used for the key, mainly for two reasons:

1) Not all widgets have the /NM attribute.
2) Most PDF editing tools also use the /T attribute for form widget keys, which makes the library intuitive for most mainstream PDF forms. See the attached screenshots for examples using DocFly.

I might come back to this issue in the future to implement some precedence mechanism for the /NM attribute over the /T attribute when it exists. But for now, I think you will need to edit the two radio buttons on the second page to put them in a different group.

Screenshot 2024-07-20 105913 Screenshot 2024-07-20 105947
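The precedence idea mentioned above could look roughly like the sketch below. This is only an assumption about one possible design, not PyPDFForm code: plain dicts stand in for PDF widget dictionaries, `widget_key` is a hypothetical function, and the /T and /NM values are illustrative.

```python
from typing import Optional


def widget_key(widget: dict, parent: Optional[dict] = None) -> Optional[str]:
    """Hypothetical key lookup: prefer /NM, then the widget's /T, then the parent's /T."""
    if widget.get("/NM"):
        return widget["/NM"]
    if widget.get("/T"):
        return widget["/T"]
    if parent is not None and parent.get("/T"):
        return parent["/T"]
    return None


# Illustrative data: a radio group with /T, one widget with /NM, one without.
group = {"/T": "RadioButtonList2[0]"}
named = {"/NM": "F[0].Page_10[0].RadioButtonList2[0]"}
unnamed = {}

key_named = widget_key(named, group)      # /NM takes precedence
key_unnamed = widget_key(unnamed, group)  # falls back to the group's /T
```

Falling back to /T keeps existing keys unchanged for forms that have no /NM attribute, which addresses the first reason given above.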