smalot / pdfparser

PdfParser, a standalone PHP library, provides various tools to extract data from a PDF file.
GNU Lesser General Public License v3.0

Parsing PDF Form data #43

Open geochasm opened 10 years ago

geochasm commented 10 years ago

I am working on parsing a PDF form. The pdfparser functions work and my document is parsed. However, I only receive the text surrounding the form fields. I have entered data in the form fields and saved the form, but I am not able to access the text saved in the form fields.

Can pdfparser access form data?

Thanks

smalot commented 10 years ago

Can you send me your PDF form with the data stored in it? sebastien@malot.fr I'll check it.

errmerrged commented 8 years ago

I have the same problem. Any answers yet?

MajorLeeGassole commented 7 years ago

Can we please get an update on this? I'm having the same problem. Sorry if I posted a duplicate. I just found this one today.

#135

skys215 commented 4 years ago

Is this problem solved yet?

k00ni commented 4 years ago

It would be helpful if someone could test this with the current master version of the library. Is it confirmed that #135 is a duplicate? If so, its sample PDF could be used here, right?

nbao commented 1 year ago

Hello, is there a way to get a list of all field names? I think every form field should have a field name. If we can get the field names, then maybe we can debug the field data.

WeTeKBerlin commented 1 year ago

Has functionality for extracting form data been implemented in the meantime?

k00ni commented 1 year ago

No, not to my knowledge.

roboparker commented 3 months ago

The data is already there, but some fields are not parsed correctly, particularly the values.

I grabbed the form fields by getting all elements with the FT (field type) header. For some forms, the V (value) headers are incorrectly parsed as false.
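
To make the approach above concrete, here is a minimal Python sketch of the idea: keep only the objects that carry an `/FT` entry, then read `/T` (field name) and `/V` (value) from them. It does not use pdfparser at all. The sample object bodies, the regular expressions, and the `extract_fields` helper are all hypothetical and only for illustration. It assumes uncompressed dictionaries with plain literal strings, and it ignores escaped parentheses, hex strings, object streams, and values inherited from a parent field.

```python
import re

# Hypothetical, simplified object bodies as they might appear in an
# uncompressed PDF. Real files often keep them in compressed object streams.
SAMPLE_OBJECTS = [
    "<< /FT /Tx /T (first_name) /V (Ada) >>",
    "<< /FT /Tx /T (last_name) >>",
    "<< /Type /Page /Contents 5 0 R >>",
    "<< /FT /Btn /T (subscribe) /V /Yes >>",
]

# /FT holds the field type as a name, e.g. /Tx (text) or /Btn (button).
FIELD_TYPE = re.compile(r"/FT\s*/(\w+)")
# /T holds the partial field name as a literal string.
FIELD_NAME = re.compile(r"/T\s*\(([^)]*)\)")
# /V may be a literal string "(...)" or a name such as /Yes.
FIELD_VALUE = re.compile(r"/V\s*(?:\(([^)]*)\)|/(\w+))")


def extract_fields(objects):
    """Return one dict per object that has an /FT entry.

    A missing /V is reported as None so it can be told apart from an
    empty string value.
    """
    fields = []
    for body in objects:
        field_type = FIELD_TYPE.search(body)
        if field_type is None:
            continue  # not a form field, e.g. a page object

        name = FIELD_NAME.search(body)
        value = FIELD_VALUE.search(body)

        if value is None:
            parsed_value = None
        elif value.group(1) is not None:
            parsed_value = value.group(1)  # literal string value
        else:
            parsed_value = value.group(2)  # name value such as Yes

        fields.append({
            "type": field_type.group(1),
            "name": name.group(1) if name else None,
            "value": parsed_value,
        })
    return fields


if __name__ == "__main__":
    for field in extract_fields(SAMPLE_OBJECTS):
        print(field)
```

Listing the names this way also gives a quick overview of which fields exist in a form. Keeping "no `/V` entry" (`None`) separate from "empty value" (`""`) may help narrow down where a value gets lost during parsing.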

If I figure it out, I can make a PR for this.

k00ni commented 3 months ago

> If I figure it out, I can make a PR for this.

:+1: Looking forward to it.