Closed CocoaML closed 6 days ago
The image error is only reported by this message, processing continues: your script is not ended by an exception.
The image error is only reported by this message, processing continues: your script is not ended by an exception.
Thank you for your feedback.
The process was stopped because of an error pdf page. An exception was reported. The error log is:
File "/usr/mineru_test/python/miniforge3/envs/mineru_project_py311/lib/python3.11/site-packages/pymupdf/utils.py", line 879, in get_image_rects
pix = pymupdf.Pixmap(page.parent, xref) # make pixmap of the image to compute MD5
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/mineru_test/python/miniforge3/envs/mineru_project_py311/lib/python3.11/site-packages/pymupdf/init.py", line 10110, in init
img = mupdf.pdf_load_image(pdf, ref)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/mineru_test/python/miniforge3/envs/mineru_project_py311/lib/python3.11/site-packages/pymupdf/mupdf.py", line 50901, in pdf_load_image
return _mupdf.pdf_load_image(doc, obj)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pymupdf.mupdf.FzErrorSyntax: code=8: Failed to decode JPX image
Looking forward to further communication with you.
Moving this to "Discussions", as there is no PyMuPDF/MuPDF problem.
Description of the bug
MinerU Error
The Pdf is this: part_4.pdf
How can I skip this error page and continue proceed to another page? Thanks.
How to reproduce the bug
log:
Traceback (most recent call last): File "/usr/mineru_test/python/miniforge3/envs/mineru_project_py311/lib/python3.11/runpy.py", line 198, in _run_module_as_main return _run_code(code, main_globals, None, ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/mineru_test/python/miniforge3/envs/mineru_project_py311/lib/python3.11/runpy.py", line 88, in _run_code exec(code, run_globals) File "/root/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/main.py", line 71, in
cli.main()
File "/root/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 501, in main
run()
File "/root/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 351, in run_file
runpy.run_path(target, run_name="main")
File "/root/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 310, in run_path
return _run_module_code(code, init_globals, run_name, pkg_name=pkg_name, script_name=fname)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 127, in _run_module_code
_run_code(code, mod_globals, init_globals, mod_name, mod_spec, pkg_name, script_name)
File "/root/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 118, in _run_code
exec(code, run_globals)
File "/usr/mineru_test/mineru_project/mineru_project_python/minerU/magic_pdf_parse_main.py", line 151, in
pdf_parse_main(pdf_path, output_dir="./out")
File "/usr/mineru_test/mineru_project/mineru_project_python/minerU/magic_pdf_parse_main.py", line 108, in pdf_parse_main
pipe.pipe_classify()
File "/usr/mineru_test/python/miniforge3/envs/mineru_project_py311/lib/python3.11/site-packages/magic_pdf/pipe/UNIPipe.py", line 25, in pipe_classify
self.pdf_type = AbsPipe.classify(self.pdf_bytes)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/mineru_test/python/miniforge3/envs/mineru_project_py311/lib/python3.11/site-packages/magic_pdf/pipe/AbsPipe.py", line 63, in classify
pdf_meta = pdf_meta_scan(pdf_bytes)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/mineru_test/python/miniforge3/envs/mineru_project_py311/lib/python3.11/site-packages/magic_pdf/filter/pdf_meta_scan.py", line 331, in pdf_meta_scan
image_info_per_page, junk_img_bojids = get_image_info(doc, page_width_pts, page_height_pts)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/mineru_test/python/miniforge3/envs/mineru_project_py311/lib/python3.11/site-packages/magic_pdf/filter/pdf_meta_scan.py", line 119, in get_image_info
page_result = process_image(page, junk_img_bojids)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/mineru_test/python/miniforge3/envs/mineru_project_py311/lib/python3.11/site-packages/magic_pdf/filter/pdf_meta_scan.py", line 39, in process_image
recs = page.get_image_rects(img, transform=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/mineru_test/python/miniforge3/envs/mineru_project_py311/lib/python3.11/site-packages/pymupdf/utils.py", line 879, in get_image_rects
pix = pymupdf.Pixmap(page.parent, xref) # make pixmap of the image to compute MD5
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/mineru_test/python/miniforge3/envs/mineru_project_py311/lib/python3.11/site-packages/pymupdf/init.py", line 10110, in init
img = mupdf.pdf_load_image(pdf, ref)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/mineru_test/python/miniforge3/envs/mineru_project_py311/lib/python3.11/site-packages/pymupdf/mupdf.py", line 50901, in pdf_load_image
return _mupdf.pdf_load_image(doc, obj)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pymupdf.mupdf.FzErrorSyntax: code=8: Failed to decode JPX image
How can I skip this error page and continue proceed to another page? Looking forward to your reply,Thanks.
PyMuPDF version
1.24.14
Operating system
Linux
Python version
3.11