Closed ld520 closed 2 weeks ago
System Environment: Win10 22H2 / Python 3.9
Version: Paddle 2.4.2, PaddleOCR 2.6
Related components: ppstructure
Command Code: python -m test
Complete Error Message: as in the original post
Try setting up a project on AI Studio. Without knowing your setup, this is hard to reproduce.
It is the official 文档分析实战-表格识别 (document analysis in practice: table recognition) project from the 动手学OCR·十讲 course.
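The step that splits the HTML and assembles it into an Excel file is the one that fails. I tried printing pred_html on its own, but the image was not recognized.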
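Did you manage to solve this? I ran into the same problem. I tried tracing it back, and it looks like self.match in predict_table is where things go wrong.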
This issue has not been updated for a long time, so it is being closed for now. It can be reopened if needed.
When using the 文档分析实战-表格识别 (table recognition) project, running the code raises an error. Code:
import cv2
from table.predict_table import TableSystem, to_excel
from utility import init_args

# Initialize the arguments
args = init_args().parse_args(args=[])
args.det_model_dir = 'inference/ch_PP-OCRv2_det_infer'
args.rec_model_dir = 'inference/ch_PP-OCRv2_rec_infer'
args.table_model_dir = 'inference/en_ppocr_mobile_v2.0_table_structure_infer'
args.image_dir = '/home/aistudio/1.jpg'
args.rec_char_dict_path = '../ppocr/utils/ppocr_keys_v1.txt'
args.table_char_dict_path = '../ppocr/utils/dict/table_structure_dict.txt'
args.det_limit_side_len = 736
args.det_limit_type = 'min'
args.output = '../output/table'
args.use_gpu = False

# Initialize the table recognition system
table_sys = TableSystem(args)
img = cv2.imread('/home/aistudio/1.jpg')

# Run table recognition
pred_html = table_sys(img)

# Save the result to an Excel file
to_excel(pred_html, '1.xlsx')
print(pred_html)
Error:
ddle120-env/lib/python3.7/site-packages/matplotlib/pyplot.py", line 533, in figure
    **kwargs)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/backend_bases.py", line 161, in new_figure_manager
    return cls.new_figure_manager_given_figure(num, fig)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/backends/_backend_tk.py", line 1046, in new_figure_manager_given_figure
    window = Tk.Tk(className="matplotlib")
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/tkinter/__init__.py", line 2023, in __init__
    self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)
_tkinter.TclError: no display name and no $DISPLAY environment variable
aistudio@jupyter-4490434-6530013:~/work/PaddleOCR-release-2.6/ppstructure$ python -m test
[2023/07/14 09:15:46] ppocr DEBUG: dt_boxes num : 69, elapse : 0.8896045684814453
[2023/07/14 09:15:52] ppocr DEBUG: rec_res num : 69, elapse : 5.328573226928711
Traceback (most recent call last):
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/aistudio/work/PaddleOCR-release-2.6/ppstructure/test.py", line 24, in <module>
    to_excel(pred_html,'1.xlsx')
  File "/home/aistudio/work/PaddleOCR-release-2.6/ppstructure/table/predict_table.py", line 145, in to_excel
    tablepyxl.document_to_xl(html_table, excel_path)
  File "/home/aistudio/work/PaddleOCR-release-2.6/ppstructure/table/tablepyxl/tablepyxl.py", line 101, in document_to_xl
    wb = document_to_workbook(doc, base_url=base_url)
  File "/home/aistudio/work/PaddleOCR-release-2.6/ppstructure/table/tablepyxl/tablepyxl.py", line 87, in document_to_workbook
    inline_styles_doc = Premailer(doc, base_url=base_url, remove_classes=False).transform()
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/premailer/premailer.py", line 319, in transform
    stripped = html.strip()
AttributeError: 'tuple' object has no attribute 'strip'
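
The last frame of the traceback suggests that to_excel was handed a tuple rather than an HTML string, which would mean the failure happens after self.match has already returned. One possible explanation, assuming the release-2.6 TableSystem call returns a (result, time_dict) pair with the HTML stored under result['html'] (worth confirming in ppstructure/table/predict_table.py), is that the tutorial code was written for an older version that returned the HTML string directly. Below is a minimal sketch of a defensive helper. The name extract_html is hypothetical and not part of PaddleOCR, and fake_output only stands in for the real call.

```python
def extract_html(output):
    """Pull the HTML string out of a table-recognition result.

    Hypothetical helper, not part of PaddleOCR. It accepts three shapes:
    a plain HTML string, a dict with an 'html' key, or a tuple whose
    first element is one of those (the assumed 2.6-style return value).
    """
    if isinstance(output, str):
        return output
    if isinstance(output, dict) and isinstance(output.get("html"), str):
        return output["html"]
    if isinstance(output, tuple) and output:
        # Assumption: the first element carries the result,
        # the remaining elements carry timing information.
        return extract_html(output[0])
    raise TypeError(f"cannot find an HTML string in {type(output).__name__}")


# Stand-in for `table_sys(img)` under the assumed return shape.
fake_output = ({"html": "<html><body><table></table></body></html>"}, {"all": 0.1})
html = extract_html(fake_output)
print(type(html).__name__)
```

If that assumption holds, the failing line in test.py could become to_excel(extract_html(table_sys(img)), '1.xlsx'). Printing type(pred_html) before the to_excel call is a quick way to check which shape the installed version actually returns.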