Images missing from TextPage dictionary

pymupdf / PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

GNU Affero General Public License v3.0


Description of the bug

Since version 1.21.1, images are no longer present in the TextPage dictionary, so the example in section 2 of the docs does not work: https://pymupdf.readthedocs.io/en/latest/recipes-images.html#how-to-extract-images-non-pdf-documents

How to reproduce the bug

This code snippet does not work: no image blocks are returned for any PDF file.

d = page.get_text("dict")
blocks = d["blocks"]  # the list of block dictionaries
imgblocks = [b for b in blocks if b["type"] == 1]
pprint(imgblocks[0])
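To make the symptom easier to pin down, here is a minimal sketch of the same filter written as a helper that does not raise IndexError when no image block is found. The helper name `image_blocks` and the hand-built `sample` dictionary are hypothetical and only mimic the shape of the value returned by page.get_text("dict"); they are not taken from PyMuPDF.

```python
from pprint import pprint


def image_blocks(page_dict):
    """Return the image blocks (type == 1) of a get_text("dict")-style dictionary."""
    return [b for b in page_dict.get("blocks", []) if b.get("type") == 1]


# Hypothetical stand-in for the result of page.get_text("dict"):
# one text block (type 0) and one image block (type 1).
sample = {
    "width": 595.0,
    "height": 842.0,
    "blocks": [
        {"type": 0, "bbox": (72, 72, 300, 90), "lines": []},
        {"type": 1, "bbox": (72, 100, 172, 200), "ext": "png"},
    ],
}

found = image_blocks(sample)
if found:
    pprint(found[0])
else:
    print("no image blocks in this page dictionary")
```

With a real page, the call would be image_blocks(page.get_text("dict")). If the result is empty, one thing that may be worth checking (an assumption, not a confirmed cause of this report) is whether the extraction flags passed to get_text include TEXT_PRESERVE_IMAGES.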

PyMuPDF version

1.24.2

Operating system

Windows

Python version

3.8