py-pdf / pypdf

A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
https://pypdf.readthedocs.io/en/latest/

Form Fill Font Size and Orientation wrong #2731

Open orgmast5 opened 5 days ago

orgmast5 commented 5 days ago

I'm trying to fill a form. I installed the latest build from the main source tree (pip install git+https://github.com/py-pdf/pypdf.git@main), and with this form the filled-in text is rotated (and spaced out in some viewers) and placed on the wrong side. I think it is related to the recent issues #2636 and #2724. It is fixed when I set auto_regenerate=True and then open and save the file again with Acrobat Reader, but that is not the workflow the docs describe (https://pypdf.readthedocs.io/en/stable/user/forms.html#filling-out-forms).

Code + PDF

This is a minimal, complete example that shows the issue:

from pypdf import PdfReader, PdfWriter

reader = PdfReader("template.pdf")
writer = PdfWriter()

writer.append(reader)

writer.update_page_form_field_values(
    writer.pages[0], 
    {"Stellenbezeichnung_1": "some filled in text"},
    auto_regenerate=False
)

with open("filled-out.pdf", "wb") as output_stream:
    writer.write(output_stream)

I attached the template pdf and the filled-out pdf, you are free to use them in your tests. template.pdf filled-out.pdf

Screenshots: Chromium (Screenshot_20240628_222604), Firefox (Screenshot_20240628_222639), Evince PDF reader (Screenshot_20240628_222832)

I tried modifying the annotation widgets and setting my own Font and Rect, but I couldn't make that work. I did look at the field-flag bit mask, and it is a multiline + file select field. When I saw the multiline flag and the code changes for #2636 that added "DEFAULT_FONT_HEIGHT_IN_MULTILINE = 12", I thought the nightly version might work better than the 4.20 release. The text is indeed smaller, but it is still rotated 90 degrees clockwise.
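For readers who want to repeat the bit-mask check, here is a minimal sketch of decoding a text field's /Ff value. The bit positions follow the field-flag tables in the PDF specification; the helper name decode_field_flags and the example value are illustrative assumptions and not part of pypdf.

```python
# Field-flag bits for the /Ff entry of a text field, as listed in the
# PDF specification. The spec numbers bits from 1, so bit N is 1 << (N - 1).
FIELD_FLAGS = {
    "ReadOnly": 1 << 0,          # bit 1
    "Required": 1 << 1,          # bit 2
    "NoExport": 1 << 2,          # bit 3
    "Multiline": 1 << 12,        # bit 13
    "Password": 1 << 13,         # bit 14
    "FileSelect": 1 << 20,       # bit 21
    "DoNotSpellCheck": 1 << 22,  # bit 23
    "DoNotScroll": 1 << 23,      # bit 24
    "Comb": 1 << 24,             # bit 25
    "RichText": 1 << 25,         # bit 26
}


def decode_field_flags(ff):
    """Return the names of all flags set in the integer /Ff value (hypothetical helper)."""
    return [name for name, bit in FIELD_FLAGS.items() if ff & bit]


# Assumed example value: a field with Multiline and FileSelect set.
example_ff = (1 << 12) | (1 << 20)
print(decode_field_flags(example_ff))  # ['Multiline', 'FileSelect']
```

In practice the integer would be taken from the field dictionary's /Ff entry; the value used here is made up to match the description above, not read from template.pdf.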

pubpub-zz commented 4 days ago

The orientation error is due to a missing /Matrix entry. This is a bug to be fixed.
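As background on why a missing /Matrix can change the orientation: a form XObject such as an appearance stream may carry a /Matrix of six numbers [a b c d e f], and the PDF specification maps a point (x, y) to (a*x + c*y + e, b*x + d*y + f). The sketch below only illustrates that arithmetic for a pure rotation. It is not the pypdf fix, the helper names are hypothetical, and whether this form needs exactly a 90 degree matrix is an assumption.

```python
import math


def rotation_matrix(degrees):
    """PDF-style matrix [a, b, c, d, e, f] for a counterclockwise rotation
    about the origin (hypothetical helper, no translation applied)."""
    theta = math.radians(degrees)
    # Round to suppress floating-point noise such as cos(90 deg) = 6e-17.
    cos_t = round(math.cos(theta), 10)
    sin_t = round(math.sin(theta), 10)
    return [cos_t, sin_t, -sin_t, cos_t, 0.0, 0.0]


def apply_matrix(matrix, x, y):
    """Transform a point the way the PDF specification defines it."""
    a, b, c, d, e, f = matrix
    return (a * x + c * y + e, b * x + d * y + f)


m = rotation_matrix(90)
# A point on the positive x axis ends up on the positive y axis.
print(m, apply_matrix(m, 100, 0))
```

Without a /Matrix entry the identity matrix [1 0 0 1 0 0] applies, so content drawn for a rotated widget would be placed unrotated, which is one way the text could end up turned relative to its field.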

pubpub-zz commented 1 day ago

A new issue has been created about the extra \x00 characters, which are caused by a mix-up between UTF-8 and 8-bit-only charsets.
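To illustrate how extra \x00 characters like the ones mentioned above can arise from an encoding mix-up, here is a small standalone sketch. It assumes the text was written with a two-byte encoding (UTF-16BE is used here as an example) and then read as if every byte were one character. That is an assumption for illustration only, not a confirmed diagnosis of this bug.

```python
text = "some filled in text"

# Encode with two bytes per character (assumed encoding for this sketch).
two_byte = text.encode("utf-16-be")
print(two_byte[:8])  # b'\x00s\x00o\x00m\x00e'

# Reading those bytes with an 8-bit charset turns every byte into its own
# character, so each letter is preceded by a NUL character.
misread = two_byte.decode("latin-1")
print(repr(misread[:8]))

# Stripping the NULs recovers the original text in this ASCII-only case.
print(misread.replace("\x00", "") == text)
```

A viewer that renders each NUL as an empty or blank glyph would show the letters spread apart, which would be consistent with the spaced-out text reported in some viewers.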