pymupdf / PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
https://pymupdf.readthedocs.io
GNU Affero General Public License v3.0
5.78k stars, 534 forks

I can't get the document errors correctly #4020

Closed kjlsdlkfio332fsdkla32skjdf closed 3 weeks ago

kjlsdlkfio332fsdkla32skjdf commented 3 weeks ago

Description of the bug

I use pymupdf.TOOLS to capture the warnings emitted when I open a document. I need to collect all of a document's structural errors this way (broken xref table, incorrect streams, etc.), but for some reason the errors are not always reported correctly. Is there a way to get all the structural errors a document contains from the Document class? If not, what solution should I use? How can I conveniently get the errors for specific files?

How to reproduce the bug

When I collect errors this way for several files in a row, clearing the buffer after each one, the messages sometimes accumulate across files, and sometimes nothing is reported at all (no list of errors is returned).

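import pymupdf

def check_struct(file_path):
    # Keep MuPDF messages off the console; they are still stored internally.
    pymupdf.TOOLS.mupdf_display_warnings(False)
    pymupdf.TOOLS.mupdf_display_errors(False)
    document = pymupdf.open(file_path)
    document.close()
    # Return the stored messages and empty the store.
    result = pymupdf.TOOLS.mupdf_warnings(True)
    return result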

print(check_struct("1.pdf"))
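For comparison, here is a minimal sketch of a variant that might avoid the accumulation. It rests on two assumptions that the thread does not confirm: that the leftover messages come from the process-wide warning store still holding text from earlier operations, and that the files are checked one at a time (not from several threads). It empties the store with `pymupdf.TOOLS.reset_mupdf_warnings()` before opening and catches the exception raised when a file cannot be opened at all. The `lib` parameter and the returned dictionary layout are hypothetical, added only so the function can be exercised with a stub.

```python
def check_struct(file_path, lib=None):
    """Open one file and return the MuPDF messages produced while opening it.

    `lib` is a hypothetical hook: it defaults to the real pymupdf module,
    but a stand-in object can be passed for testing.
    """
    if lib is None:
        import pymupdf as lib  # imported lazily so a stub can replace it

    tools = lib.TOOLS
    # Keep messages off the console; they are still collected in the store.
    tools.mupdf_display_warnings(False)
    tools.mupdf_display_errors(False)

    # Discard anything left in the shared store by earlier operations.
    tools.reset_mupdf_warnings()

    open_error = None
    try:
        document = lib.open(file_path)
        document.close()
    except Exception as exc:
        # A file that cannot be opened at all raises instead of warning.
        open_error = str(exc)

    # Read the store and empty it again in one step.
    text = tools.mupdf_warnings(True)
    return {
        "file": file_path,
        "open_error": open_error,
        "warnings": [line for line in text.splitlines() if line.strip()],
    }
```

Calling `check_struct("1.pdf")` and then `check_struct("2.pdf")` should, under the assumptions above, return only the messages produced while opening each respective file. Opening alone may not touch every object in the file, so this is not guaranteed to surface all structural problems.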

PyMuPDF version

1.24.13

Operating system

Windows

Python version

3.11

JorjMcKie commented 3 weeks ago

This is a typical Discussions item - transferring ...