gettalong / hexapdf

Versatile PDF creation and manipulation for Ruby
https://hexapdf.gettalong.org

Number-formatting of text-field gets lost when flattening acro_form #220

Closed andi-dev closed 1 year ago

andi-dev commented 1 year ago

Hi again,

In a PDF containing a text field with number formatting, the formatting seems to be lost when the form is flattened.

Original pdf:

image

Resulting pdf:

image

Screenshot from Acrobat:

image

I have not yet had time to look into the code to see how feasible it is to preserve the formatting when re-creating the field's appearance, but maybe that is something you have already thought about?

Cheers!

PS: Let me know if the original PDF would be helpful.

gettalong commented 1 year ago

It would be great if you could send me the PDF, because then I can determine whether this is some kind of JavaScript code doing the formatting or something else. Thanks!

gettalong commented 1 year ago

@andi-dev I did some digging and only found solutions that involve JavaScript. If you could provide the PDF, I could verify that this is also the case in your situation.

FYI: HexaPDF doesn't include a JavaScript engine, so if this is JavaScript-based, it won't work.

andi-dev commented 1 year ago

Hi Thomas, sorry for the late reply. I just sent you the pdf.

gettalong commented 1 year ago

Thanks Andreas! I can confirm that the PDF is also using JavaScript to format the numbers. It seems that Adobe defined a JavaScript method, AFNumber_Format, which is used in all the cases I have looked at so far. I will see if I can find reference documentation for that method. It may be possible to just parse the JavaScript method arguments and synthesize the needed result.

gettalong commented 1 year ago

Okay, I found https://experienceleague.adobe.com/docs/experience-manager-learn/assets/FormsAPIReference.pdf?lang=en but it doesn't seem to be accurate or complete: sepStyle is 2 in the PDF you provided, and that value is not listed there.

However, I think I can come up with something to make this work even without proper JavaScript support, provided the PDF file follows the same scheme that I have now seen multiple times.
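
To illustrate the idea of parsing the method arguments instead of executing the script, here is a minimal Ruby sketch. It is not HexaPDF's implementation. The helper name `af_number_format`, the regular expression, and the `SEPARATORS` table are hypothetical, and the meaning assigned to each sepStyle value (including 2) is an assumption based on commonly seen Acrobat behavior. The negStyle and currStyle arguments are parsed but ignored.

```ruby
# Hypothetical helper, not part of HexaPDF: emulates the formatting requested
# by an AFNumber_Format(...) call without running any JavaScript.
AF_NUMBER_FORMAT_RE = /
  AFNumber_Format\(\s*
    (?<ndec>\d+)\s*,\s*
    (?<sep_style>\d+)\s*,\s*
    (?<neg_style>\d+)\s*,\s*
    (?<curr_style>\d+)\s*,\s*
    "(?<currency>[^"]*)"\s*,\s*
    (?<prepend>true|false)\s*
  \)/x

# ASSUMPTION: mapping of sepStyle to [thousands separator, decimal separator].
SEPARATORS = {
  0 => [",", "."],  # 1,234.56
  1 => ["", "."],   # 1234.56
  2 => [".", ","],  # 1.234,56
  3 => ["", ","],   # 1234,56
}.freeze

def af_number_format(value, script)
  match = AF_NUMBER_FORMAT_RE.match(script)
  separators = match && SEPARATORS[match[:sep_style].to_i]
  # Fall back to the raw value if the script does not follow the expected scheme.
  return value.to_s unless separators

  thousands, decimal = separators
  number = value.to_f
  int_part, frac_part = format("%.#{match[:ndec].to_i}f", number.abs).split(".")
  # Insert the thousands separator every three digits, counting from the right.
  int_part = int_part.reverse.scan(/\d{1,3}/).join(thousands).reverse
  result = frac_part ? "#{int_part}#{decimal}#{frac_part}" : int_part
  currency = match[:currency]
  result = match[:prepend] == "true" ? "#{currency}#{result}" : "#{result}#{currency}"
  # Simplification: negative numbers only get a leading minus sign.
  number.negative? ? "-#{result}" : result
end

puts af_number_format("1234567.891", 'AFNumber_Format(2, 2, 0, 0, "", true);')
# prints "1.234.567,89"
```

Returning the unformatted value when the script does not match keeps the sketch safe for fields whose format action uses anything other than this one call.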

gettalong commented 1 year ago

@andi-dev I now have a draft implementation of the AFNumber_Format method and it works, at least for the PDF you supplied (the third number was filled in and rendered by HexaPDF):

image

What you might also have noticed in the screenshot above is that the rendered form field uses the correct font! I saw that the font used for the form field was fully embedded, not a subset, and enhanced HexaPDF to use this font. This won't work in all cases, but it should provide a better out-of-the-box experience for many more PDFs with interactive forms.
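
As a rough illustration of the subset distinction mentioned above: the PDF specification marks a subset font by prefixing its name with a tag of six uppercase letters and a plus sign (for example `ABCDEF+Helvetica`). The sketch below checks for that tag. The helper names are hypothetical, and the thread does not say how HexaPDF itself makes this decision, so treat this as one plausible check rather than the library's logic.

```ruby
# Subset fonts carry a six-uppercase-letter tag followed by "+" before the
# actual font name, e.g. "ABCDEF+Helvetica".
SUBSET_TAG_RE = /\A[A-Z]{6}\+/

# Hypothetical helper: true if the font name carries a subset tag.
def subset_font_name?(base_font)
  SUBSET_TAG_RE.match?(base_font.to_s)
end

# Hypothetical helper: a font is only a candidate for rendering new field
# values if it is embedded and not a subset, because a subset may lack the
# glyphs needed for newly entered text.
def reusable_for_appearance?(base_font, embedded:)
  embedded && !subset_font_name?(base_font)
end

puts reusable_for_appearance?("ABCDEF+Helvetica", embedded: true) # prints "false"
puts reusable_for_appearance?("Helvetica", embedded: true)        # prints "true"
```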

gettalong commented 1 year ago

Will be in the upcoming release.