J-F-Liu / lopdf

A Rust library for PDF document manipulation.
MIT License
1.67k stars, 176 forks

How to extract images from a PDF? #278

Open liubaochuan opened 6 months ago

liubaochuan commented 6 months ago

How can I extract images from a PDF when get_page_images doesn't work?

gamcoh commented 2 months ago

Did you find a solution, @liubaochuan?

Heinenen commented 2 weeks ago

The relevant part of the PDF spec is section 8.9, "Images". There are two ways to embed an image: as an image XObject, or as an inline image (8.9.7, "Inline Images"). Inline images are embedded directly in the page's content stream rather than stored as separate objects, and I'm fairly sure lopdf does not find such images.

A sample PDF would really help to confirm my suspicion (or help fix a bug).
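
To check whether a page actually contains inline images, something like the following rough sketch could help. It is not part of lopdf and uses only the standard library. It assumes you already have the decoded (unfiltered) bytes of a page's content stream, and it scans them for the BI ... ID ... EI operator sequence described in 8.9.7. The names InlineImage and find_inline_images are made up for illustration. This is a heuristic: binary sample data can contain bytes that look like " EI ", and a string operand containing " BI " would be misdetected, so treat hits as a hint rather than proof.

```rust
/// Raw pieces of one inline image found in a decoded content stream.
/// (Hypothetical type for this sketch, not a lopdf type.)
#[derive(Debug, PartialEq)]
pub struct InlineImage {
    /// Bytes between `BI` and `ID`: the abbreviated image dictionary.
    pub dict: Vec<u8>,
    /// Bytes between `ID` and `EI`: the (possibly still filtered) sample data.
    pub data: Vec<u8>,
}

/// PDF whitespace characters: NUL, TAB, LF, FF, CR, SPACE.
fn is_ws(b: u8) -> bool {
    matches!(b, 0 | b'\t' | b'\n' | 0x0C | b'\r' | b' ')
}

/// Strip leading and trailing PDF whitespace from a byte slice.
fn trim_ws(mut s: &[u8]) -> &[u8] {
    while let [first, rest @ ..] = s {
        if is_ws(*first) { s = rest } else { break }
    }
    while let [rest @ .., last] = s {
        if is_ws(*last) { s = rest } else { break }
    }
    s
}

/// Find the first occurrence of operator `op` at or after `from` that is
/// delimited by whitespace or by the edges of the buffer.
fn find_op(buf: &[u8], op: &[u8], from: usize) -> Option<usize> {
    let mut i = from;
    while i + op.len() <= buf.len() {
        let before_ok = i == 0 || is_ws(buf[i - 1]);
        let after_ok = i + op.len() == buf.len() || is_ws(buf[i + op.len()]);
        if before_ok && after_ok && buf[i..].starts_with(op) {
            return Some(i);
        }
        i += 1;
    }
    None
}

/// Heuristically collect all `BI ... ID ... EI` sequences in `content`.
pub fn find_inline_images(content: &[u8]) -> Vec<InlineImage> {
    let mut images = Vec::new();
    let mut pos = 0;
    while let Some(bi) = find_op(content, b"BI", pos) {
        let id = match find_op(content, b"ID", bi + 2) {
            Some(i) => i,
            None => break,
        };
        // Exactly one whitespace byte separates `ID` from the image data.
        let data_start = (id + 3).min(content.len());
        let ei = match find_op(content, b"EI", data_start) {
            Some(i) => i,
            None => break,
        };
        // Drop the whitespace byte that precedes `EI`.
        let data_end = ei.saturating_sub(1).max(data_start);
        images.push(InlineImage {
            dict: trim_ws(&content[bi + 2..id]).to_vec(),
            data: content[data_start..data_end].to_vec(),
        });
        pos = ei + 2;
    }
    images
}
```

If this returns a non-empty list for a page where get_page_images finds nothing, that would support the inline-image theory. The dict bytes use the abbreviated keys from the spec (/W, /H, /BPC, /CS, /F), so the data still has to be decoded according to the filter listed there.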

Heinenen commented 1 week ago

Maybe related to #78.