PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0

ModuleNotFoundError: No module named 'frontend' #13886

Closed wencan closed 1 month ago

wencan commented 1 month ago

🐛 Bug (Problem Description)

wencan@debian12:~/Projects/venv$ ./paddleocr/bin/paddleocr --image_dir "/home/wencan/Downloads/36C25620Q0012-005.pdf" --use_angle_cls true --use_gpu true --lang=en

[2024/09/19 10:22:57] ppocr DEBUG: Namespace(help='==SUPPRESS==', use_gpu=True, use_xpu=False, use_npu=False, use_mlu=False, ir_optim=True, use_tensorrt=False, min_subgraph_size=15, precision='fp32', gpu_mem=500, gpu_id=0, image_dir='/home/wencan/Downloads/36C25620Q0012-005.pdf', page_num=0, det_algorithm='DB', det_model_dir='/home/wencan/.paddleocr/whl/det/en/en_PP-OCRv3_det_infer', det_limit_side_len=960, det_limit_type='max', det_box_type='quad', det_db_thresh=0.3, det_db_box_thresh=0.6, det_db_unclip_ratio=1.5, max_batch_size=10, use_dilation=False, det_db_score_mode='fast', det_east_score_thresh=0.8, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_sast_score_thresh=0.5, det_sast_nms_thresh=0.2, det_pse_thresh=0, det_pse_box_thresh=0.85, det_pse_min_area=16, det_pse_scale=1, scales=[8, 16, 32], alpha=1.0, beta=1.0, fourier_degree=5, rec_algorithm='SVTR_LCNet', rec_model_dir='/home/wencan/.paddleocr/whl/rec/en/en_PP-OCRv4_rec_infer', rec_image_inverse=True, rec_image_shape='3, 48, 320', rec_batch_num=6, max_text_length=25, rec_char_dict_path='/home/wencan/Projects/venv/paddleocr/lib/python3.11/site-packages/paddleocr/ppocr/utils/en_dict.txt', use_space_char=True, vis_font_path='./doc/fonts/simfang.ttf', drop_score=0.5, e2e_algorithm='PGNet', e2e_model_dir=None, e2e_limit_side_len=768, e2e_limit_type='max', e2e_pgnet_score_thresh=0.5, e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_pgnet_valid_set='totaltext', e2e_pgnet_mode='fast', use_angle_cls=True, cls_model_dir='/home/wencan/.paddleocr/whl/cls/ch_ppocr_mobile_v2.0_cls_infer', cls_image_shape='3, 48, 192', label_list=['0', '180'], cls_batch_num=6, cls_thresh=0.9, enable_mkldnn=False, cpu_threads=10, use_pdserving=False, warmup=False, sr_model_dir=None, sr_image_shape='3, 32, 128', sr_batch_num=1, draw_img_save_dir='./inference_results', save_crop_res=False, crop_res_save_dir='./output', use_mp=False, total_process_num=1, process_id=0, benchmark=False, save_log_path='./log_output/', 
show_log=True, use_onnx=False, return_word_box=False, output='./output', table_max_len=488, table_algorithm='TableAttn', table_model_dir=None, merge_no_span_structure=True, table_char_dict_path=None, layout_model_dir=None, layout_dict_path=None, layout_score_threshold=0.5, layout_nms_threshold=0.5, kie_algorithm='LayoutXLM', ser_model_dir=None, re_model_dir=None, use_visual_backbone=True, ser_dict_path='../train_data/XFUND/class_list_xfun.txt', ocr_order_method=None, mode='structure', image_orientation=False, layout=True, table=True, ocr=True, recovery=False, use_pdf2docx_api=False, invert=False, binarize=False, alphacolor=(255, 255, 255), lang='en', det=True, rec=True, type='ocr', savefile=False, ocr_version='PP-OCRv4', structure_version='PP-StructureV2')
[2024/09/19 10:22:59] ppocr INFO: **********/home/wencan/Downloads/36C25620Q0012-005.pdf**********
Traceback (most recent call last):
  File "/home/wencan/Projects/venv/paddleocr/lib/python3.11/site-packages/paddle/utils/lazy_import.py", line 32, in try_import
    mod = importlib.import_module(module_name)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1206, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1178, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1149, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 940, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/home/wencan/Projects/venv/paddleocr/lib/python3.11/site-packages/fitz/__init__.py", line 1, in <module>
    from frontend import *
ModuleNotFoundError: No module named 'frontend'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/wencan/Projects/venv/./paddleocr/bin/paddleocr", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/wencan/Projects/venv/paddleocr/lib/python3.11/site-packages/paddleocr/paddleocr.py", line 882, in main
    result = engine.ocr(
             ^^^^^^^^^^^
  File "/home/wencan/Projects/venv/paddleocr/lib/python3.11/site-packages/paddleocr/paddleocr.py", line 707, in ocr
    img, flag_gif, flag_pdf = check_img(img, alpha_color)
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wencan/Projects/venv/paddleocr/lib/python3.11/site-packages/paddleocr/paddleocr.py", line 574, in check_img
    img, flag_gif, flag_pdf = check_and_read(image_file)
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wencan/Projects/venv/paddleocr/lib/python3.11/site-packages/paddleocr/ppocr/utils/utility.py", line 123, in check_and_read
    fitz = try_import("fitz")
           ^^^^^^^^^^^^^^^^^^
  File "/home/wencan/Projects/venv/paddleocr/lib/python3.11/site-packages/paddle/utils/lazy_import.py", line 41, in try_import
    raise ImportError(err_msg)
ImportError: Failed importing fitz. This likely means that some paddle modules require additional dependencies that have to be manually installed (usually with `pip install fitz`). 

🏃‍♂️ Environment (Runtime Environment)

Python 3.11.2

OS: Debian GNU/Linux 12 (bookworm) x86_64 
Kernel: 6.1.0-25-amd64 
Uptime: 1 hour, 38 mins 
Packages: 2397 (dpkg), 75 (flatpak), 3 (snap) 
Shell: bash 5.2.15 
Resolution: 2560x1440 
DE: GNOME 43.9 
WM: Mutter 
WM Theme: Adwaita 
Theme: Adwaita [GTK2/3] 
Icons: Adwaita [GTK2/3] 
Terminal: gnome-terminal 
CPU: AMD Ryzen 7 1800X (16) @ 3.600GHz 
GPU: NVIDIA GeForce GTX 1080 Ti 
Memory: 4664MiB / 15896MiB 
Package               Version
--------------------- -----------
anyio                 4.4.0
astor                 0.8.1
beautifulsoup4        4.12.3
certifi               2024.8.30
charset-normalizer    3.3.2
ci-info               0.3.0
click                 8.1.7
configobj             5.0.8
configparser          7.1.0
contourpy             1.3.0
cycler                0.12.1
Cython                3.0.11
decorator             5.1.1
etelemetry            0.3.1
filelock              3.16.1
fire                  0.6.0
fitz                  0.0.1.dev2
fonttools             4.53.1
h11                   0.14.0
httpcore              1.0.5
httplib2              0.22.0
httpx                 0.27.2
idna                  3.10
imageio               2.35.1
imgaug                0.4.0
isodate               0.6.1
kiwisolver            1.4.7
lazy_loader           0.4
lmdb                  1.5.1
looseversion          1.3.0
lxml                  5.3.0
matplotlib            3.9.2
networkx              3.3
nibabel               5.2.1
nipype                1.8.6
numpy                 1.26.4
opencv-contrib-python 4.10.0.84
opencv-python         4.10.0.84
opt-einsum            3.3.0
packaging             24.1
paddleocr             2.8.1
paddlepaddle-gpu      2.6.2
pandas                2.2.2
pathlib               1.0.1
pillow                10.4.0
pip                   23.0.1
protobuf              5.28.2
prov                  2.0.1
pyclipper             1.3.0.post5
pydot                 3.0.1
PyMuPDF               1.24.10
PyMuPDFb              1.24.10
pyparsing             3.1.4
python-dateutil       2.9.0.post0
python-docx           1.1.2
pytz                  2024.2
pyxnat                1.6.2
PyYAML                6.0.2
rapidfuzz             3.9.7
rdflib                6.3.2
requests              2.32.3
scikit-image          0.24.0
scipy                 1.14.1
setuptools            66.1.1
shapely               2.0.6
simplejson            3.19.3
six                   1.16.0
sniffio               1.3.1
soupsieve             2.6
termcolor             2.4.0
tifffile              2024.8.30
tqdm                  4.66.5
traits                6.3.2
typing_extensions     4.12.2
tzdata                2024.1
urllib3               2.2.3

🌰 Minimal Reproducible Example (minimal demo that reproduces the problem)

python3.11 -m venv paddleocr
./paddleocr/bin/pip install paddlepaddle-gpu
./paddleocr/bin/pip install "paddleocr>=2.0.1"
./paddleocr/bin/pip install fitz pymupdf
./paddleocr/bin/paddleocr --image_dir "/home/wencan/Downloads/36C25620Q0012-005.pdf"  --use_angle_cls true --use_gpu true --lang=en

36C25620Q0012-005.pdf

jingsongliujing commented 1 month ago

I suggest trying a lower PyMuPDF version with pip install PyMuPDF==1.16.14, or alternatively: 1. uninstall fitz, 2. uninstall PyMuPDF, 3. reinstall PyMuPDF.
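
The traceback and package list in the report suggest why these steps help: the environment contains both a `fitz` distribution (0.0.1.dev2) and PyMuPDF (1.24.10), and the failing `fitz/__init__.py` starts with `from frontend import *`, which points to the standalone `fitz` package from PyPI rather than to PyMuPDF. The sketch below is a hypothetical diagnostic and is not part of PaddleOCR. The helper names `installed_version` and `diagnose` are made up for illustration, and the code assumes the PyPI distribution named `fitz` is unrelated to PyMuPDF and shadows its `fitz` module. It uses only the standard-library `importlib.metadata`.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def diagnose(versions):
    """Summarize a {'fitz': version-or-None, 'PyMuPDF': version-or-None} mapping."""
    if versions.get("fitz") is not None:
        # Assumption: the PyPI distribution named "fitz" is unrelated to PyMuPDF
        # and installs its own top-level "fitz" module.
        return "conflict: uninstall fitz and PyMuPDF, then reinstall PyMuPDF"
    if versions.get("PyMuPDF") is None:
        return "missing: install PyMuPDF to provide the fitz module"
    return "ok: only PyMuPDF provides the fitz module"


if __name__ == "__main__":
    # Look up both distributions in the current environment and report.
    found = {name: installed_version(name) for name in ("fitz", "PyMuPDF")}
    print(found)
    print(diagnose(found))
```

Run with the interpreter of the affected virtual environment (here `./paddleocr/bin/python`), this would likely report the conflict for the package list shown in the report. The suggested fix then corresponds to `./paddleocr/bin/pip uninstall fitz PyMuPDF` followed by `./paddleocr/bin/pip install PyMuPDF`.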

wencan commented 1 month ago

The uninstall-and-reinstall approach worked. Thanks!