Open corymickelson opened 5 years ago
Hey @corymickelson Do you have any update on this feature? I'd be really happy to contribute if you could point me in the right direction
Hey Matthew, I just got back from vacation. Any help is appreciated. Let me see where I left off with flattening fields; it's been a couple of months since I was working on this. Will get back to you soon.
Hey Cory. Great, look forward to your response.
@MatthewMarkgraaff Thanks again. This is a somewhat larger discussion, but hopefully the following will help get you started. First, let's make sure we understand what flattening means in relation to PDF. When we flatten a PDF document, we copy the appearance stream from each AcroForm field on a page into an XObject (see below for the XObject definition). Once the appearance streams of all fields of interest have been written to an XObject, the next steps are to paint the XObject onto the PDF page and delete the fields from the AcroForm dictionary. In effect, this adds the visual representation of the form fields to the page while removing the fields' interactive properties.
A form XObject is a PDF content stream that is a self-contained description of any sequence of graphics objects (including path objects, text objects, and sampled images). A form XObject may be painted multiple times—either on several pages or at several locations on the same page—and produces the same results each time, subject only to the graphics state at the time it is invoked. Not only is this shared definition economical to represent in the PDF file, but under suitable circumstances the PDF consumer application can optimize execution by caching the results of rendering the form XObject for repeated reuse.
In order to begin this process, we must first ensure each AcroForm field has an appearance stream (the field dictionary key is AP) as well as a value and/or default value (keyed as V and DV). This presents its own unique set of circumstances, as form fields are not required to store an AP stream: a document can set the AcroForm NeedAppearances flag and leave it to the viewer to build the appearance from the default appearance string (keyed as DA on the field or on the AcroForm dictionary), for example 12pt black Arial. This, while inconvenient, is not a difficult issue to resolve; we simply must be aware of it so we know where to look when the AP property is missing. The next step would be to efficiently store each appearance stream in an XObject, which would subsequently be applied to the PDF page.
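As a rough sketch of that check, assuming a simplified in-memory model of the dictionaries (the `FieldDict`, `AcroFormDict`, and `planAppearance` names are hypothetical and are not part of NoPoDoFo or PDFium), deciding where a field's appearance comes from could look something like this:

```typescript
// Hypothetical in-memory model of the dictionaries involved; not the NoPoDoFo API.
interface FieldDict {
  T: string;            // partial field name
  AP?: { N?: string };  // appearance dictionary; N is the normal appearance stream
  V?: string;           // value
  DV?: string;          // default value
  DA?: string;          // field-level default appearance string
}

interface AcroFormDict {
  NeedAppearances?: boolean; // viewer hint; not consulted in this sketch
  DA?: string;               // form-wide default appearance string
}

type AppearancePlan =
  | { kind: "copy"; name: string; stream: string }               // AP exists, copy it
  | { kind: "generate"; name: string; text: string; da: string } // build from DA and V/DV
  | { kind: "skip"; name: string; reason: string };              // nothing to draw

function planAppearance(form: AcroFormDict, field: FieldDict): AppearancePlan {
  // Preferred case: the field already stores a normal appearance stream.
  const stream = field.AP?.N;
  if (stream !== undefined) {
    return { kind: "copy", name: field.T, stream };
  }
  // Otherwise fall back to the value (or default value) plus a DA string,
  // looking at the field first and then at the AcroForm dictionary.
  const text = field.V ?? field.DV;
  const da = field.DA ?? form.DA;
  if (text === undefined) {
    return { kind: "skip", name: field.T, reason: "no V or DV" };
  }
  if (da === undefined) {
    return { kind: "skip", name: field.T, reason: "no DA" };
  }
  return { kind: "generate", name: field.T, text, da };
}
```

Actually generating an appearance stream from a DA string is the harder part and is left out here; the sketch only shows where to look when AP is missing.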
To begin, I suggest you look over the sdk/FlattenFields class and pull the C++ PDFium source code. The FlattenFields class, given a page, iterates the fields on the page, creates an XObject to accommodate these fields, writes the XObject to the page, and deletes the fields from the AcroForm dictionary. This class currently only works for text form fields that contain field-level AP, D, and DV properties. This work is in the very early stages and is based on the flatten functionality of PDFium. There is very little documentation on how to programmatically flatten PDFs; the knowledge you will need in order to contribute to this work will come largely from reading the source code of programs that provide this functionality. I realize this is a lot of work, and I very much appreciate any contributions to this effort. Please feel free to reach out with any questions. I will do my best to answer and guide you in your efforts to expand the functionality of this library.
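For orientation, the page-level loop described above (iterate the fields, create an XObject, paint it on the page, delete the fields) can be sketched roughly as follows. This is a minimal sketch over a made-up in-memory model: `Widget`, `Page`, `AcroForm`, and `flattenPage` are hypothetical names, not the actual FlattenFields, NoPoDoFo, or PDFium code, and it uses one XObject per field purely to keep the example short. The only real PDF pieces are the content stream operators `q`, `cm`, `Do`, and `Q`.

```typescript
// Hypothetical in-memory model; none of these types are NoPoDoFo or PDFium APIs.
type Rect = [number, number, number, number]; // [llx, lly, urx, ury]

interface Widget {
  fieldName: string;
  rect: Rect;          // where the field sits on the page
  appearance?: string; // content of the normal appearance stream (AP /N), if any
}

interface FormXObject {
  bbox: Rect;
  stream: string;
}

interface Page {
  contents: string[];                 // page content stream fragments
  xobjects: Map<string, FormXObject>; // page resources: XObject name -> object
  widgets: Widget[];                  // field widgets placed on this page
}

interface AcroForm {
  fieldNames: string[];
}

// Flattens every widget on the page that has an appearance stream and
// returns the names of the fields that were flattened.
function flattenPage(page: Page, form: AcroForm): string[] {
  const flattened: string[] = [];
  const remaining: Widget[] = [];
  let counter = 0;
  for (const w of page.widgets) {
    if (w.appearance === undefined) {
      remaining.push(w); // nothing to paint; leave the field interactive
      continue;
    }
    // Pick a resource name that is not already in use on this page.
    while (page.xobjects.has(`Flat${counter}`)) counter++;
    const name = `Flat${counter}`;
    const [llx, lly, urx, ury] = w.rect;
    // 1. Copy the appearance stream into a form XObject.
    page.xobjects.set(name, { bbox: [0, 0, urx - llx, ury - lly], stream: w.appearance });
    // 2. Paint the XObject at the widget's position: save state, translate, draw, restore.
    page.contents.push(`q 1 0 0 1 ${llx} ${lly} cm /${name} Do Q`);
    flattened.push(w.fieldName);
  }
  // 3. Remove the flattened widgets from the page and the fields from the form.
  page.widgets = remaining;
  form.fieldNames = form.fieldNames.filter((n) => !flattened.includes(n));
  return flattened;
}
```

Fields without an appearance stream are deliberately left untouched here, which mirrors the current limitation to fields that already carry AP.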
Hey @corymickelson
Couldn't ask for a better starting point, thank you for the detailed intro. Conceptually, I get it. I'll set aside some time this weekend to get started and will keep you posted here.
@MatthewMarkgraaff Have you been able to implement this?
Add the ability for a user to simply invoke a method
Form.flatten()
to flatten all fields in a form's fields array.