gettalong / hexapdf

Versatile PDF creation and manipulation for Ruby
https://hexapdf.gettalong.org

Form field loses font type after flatten #288

Closed thomasbaustert closed 6 months ago

thomasbaustert commented 6 months ago

We are using hexapdf to fill out a PDF template. The form contains text fields with the font Times New Roman.

When using iText, the generated PDF contains the inserted text in Times New Roman. See itext.pdf

When using hexapdf with the form fields kept and set to read-only, the generated PDF contains the inserted text in Times New Roman. At least it looks like it. See hexapdf.pdf

When using hexapdf with the form flattened (doc.acro_form.flatten(create_appearances: true)), the generated PDF contains the inserted text in a different font(!?)

See hexapdf_flatten.pdf

See textfield.pdf for the test template.

I could use the unflattened version, but the PDF template also contains checkboxes and radio buttons. Unfortunately, the radio buttons can still be edited (changed), at least in Mac Preview(!?)

The best outcome would be a non-editable PDF. As I understand it, flattening the form is how to achieve that with hexapdf. But unfortunately the font is not kept.

What can we do?

Thanks!

hexapdf_flatten.pdf hexapdf.pdf itext.pdf textfield.pdf

thomasbaustert commented 6 months ago

Here is a minimal test script: https://gist.github.com/thomasbaustert/e6c6ff27e6b965abdd8204b7ce2c292d

gettalong commented 6 months ago

So if I'm reading this correctly, what you need to do is use the 'acro_form.fallback_font' configuration option to ensure HexaPDF can work with the correct font.

thomasbaustert commented 6 months ago

I added the font Times New Roman and set it as fallback as follows:

doc = HexaPDF::Document.new(...)
doc.config['font.map'] = {
  'Times New Roman' => { none: font_file('times-new-roman.ttf') }
}
doc.config['acro_form.fallback_font'] = ['Times New Roman', { variant: :none }]
...

The resulting PDF is fine. It embeds the font, and when opened in a reader the label "Bootstyp" and the text "Bootstyp" are displayed in the same font (Times New Roman). Looks like I have a solution.
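For readers who want to reuse the two settings above, here is a minimal sketch that wraps them in plain Ruby helpers. The helper names `fallback_font_settings` and `apply_font_settings` are hypothetical and not part of HexaPDF; only the config keys, the `{ none: ... }` font map entry, the `variant: :none` option and the flatten call come from this thread. The sketch assumes a single font file without bold or italic variants.

```ruby
# Hypothetical helper (not part of HexaPDF): builds the two config entries
# shown above for one font that has no bold/italic variants.
def fallback_font_settings(font_name, font_path)
  {
    'font.map' => { font_name => { none: font_path } },
    'acro_form.fallback_font' => [font_name, { variant: :none }]
  }
end

# Hypothetical helper: copies the settings into a document's configuration.
# `doc` is assumed to be a HexaPDF::Document, but any object whose `config`
# supports `[]=` works. Note that this replaces the whole 'font.map' entry.
def apply_font_settings(doc, settings)
  settings.each { |key, value| doc.config[key] = value }
  doc
end

# Usage sketch (needs the hexapdf gem, the template and the font file;
# the file paths are assumptions):
#   require 'hexapdf'
#   doc = HexaPDF::Document.open('textfield.pdf')
#   apply_font_settings(doc, fallback_font_settings('Times New Roman', 'times-new-roman.ttf'))
#   # ... fill in the form fields here ...
#   doc.acro_form.flatten(create_appearances: true)
#   doc.write('hexapdf_flatten.pdf')
```

The settings have to be in place before the appearances are created, because that is when the font for the field text is chosen.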

But I wonder about the following: When I open hexapdf_flatten.pdf in Acrobat Reader, for example, the label "Bootstyp" is displayed in Times New Roman and the text "Bootstyp" is displayed in Helvetica. Is this because hexapdf "changes the font definition" of the text field to Helvetica (the fallback), since it cannot use Times New Roman when the font is not embedded?

But the font for the label "Bootstyp" is not changed. Wouldn't it also have to be changed to Helvetica to be consistent?

Is it necessary to "change the font definition" for filled text fields at all? Is it not possible to keep Times New Roman, even if it is not embedded in the PDF, and let the reader decide how to display it, with label and text in the same font?

gettalong commented 6 months ago

Great that this works for you!

As for your questions: