smalot / pdfparser

PdfParser, a standalone PHP library, provides various tools to extract data from a PDF file.
GNU Lesser General Public License v3.0

Extracting graphics from a PDF #747

Open Himeos opened 1 week ago

Himeos commented 1 week ago

Is it possible to extract graphics like this one from a PDF? Images work just fine, but I can't figure out how to extract this kind of graphic. When I print the types of the objects in this PDF, all I get is: Smalot\PdfParser\PDFObject, Smalot\PdfParser\XObject\Form, Smalot\PdfParser\Font\FontType1 and Smalot\PdfParser\Encoding.
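For reference, an object listing like the one described above can be produced roughly as follows. This is only a sketch: `summarizeObjectTypes()` is a hypothetical helper (not part of the library), and `document.pdf` and the Composer autoload path are placeholders. The second loop assumes that `getContent()` on a `Smalot\PdfParser\XObject\Form` returns the form's content stream, which is where vector drawing operators would live if the graphic is stored as a Form XObject; that may not be the case for every PDF.

```php
<?php
// Sketch: tally the classes of all objects in a parsed PDF.
// summarizeObjectTypes() is a hypothetical helper, not part of smalot/pdfparser.

/**
 * Count how many objects of each class appear in $objects.
 *
 * @param iterable $objects e.g. the result of Document::getObjects()
 * @return array<string,int> class name => number of occurrences, sorted by class name
 */
function summarizeObjectTypes(iterable $objects): array
{
    $counts = [];
    foreach ($objects as $object) {
        $class = get_class($object);
        $counts[$class] = ($counts[$class] ?? 0) + 1;
    }
    ksort($counts);

    return $counts;
}

// Usage with smalot/pdfparser. The paths below are placeholders; the block is
// skipped when the autoloader or the sample file is not present.
$autoload = __DIR__ . '/vendor/autoload.php';
$file = __DIR__ . '/document.pdf';

if (is_file($autoload) && is_file($file)) {
    require $autoload;

    $parser = new \Smalot\PdfParser\Parser();
    $pdf = $parser->parseFile($file);

    // Print each object class and how often it occurs.
    foreach (summarizeObjectTypes($pdf->getObjects()) as $class => $count) {
        echo $class . ': ' . $count . PHP_EOL;
    }

    // Assumption: a Form XObject's content is its stream of drawing operators.
    foreach ($pdf->getObjects() as $id => $object) {
        if ($object instanceof \Smalot\PdfParser\XObject\Form) {
            echo $id . ': ' . strlen((string) $object->getContent()) . ' bytes' . PHP_EOL;
        }
    }
}
```

Inspecting the content of the Form objects this way would at least show whether the graphic's path operators are reachable through the parser, even though turning them back into an image is a separate problem.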

I'm assuming this was created using LaTeX but I'm not sure.

[Image: Screenshot 2024-11-14 150503]

k00ni commented 1 week ago

There are some existing issues about images (e.g. how to extract them), such as https://github.com/smalot/pdfparser/issues/705#issuecomment-2106566931. They may already cover what you are looking for.