PaddlePaddle / PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.
https://paddlenlp.readthedocs.io
Apache License 2.0

[Question]: ValueError when running entity extraction with UIE-X on my own image #8831

Open JackHe1999 opened 3 months ago

JackHe1999 commented 3 months ago

Problem

The official example runs fine, but the script fails when I test it with my own image.

Environment

MacBook Pro (Intel CPU), paddlenlp 2.6.1, paddleocr 2.8.1, paddlepaddle 2.6.1

Error message

(test) (base) ➜  test python test.py
[2024/08/07 14:23:56] ppocr DEBUG: Namespace(help='==SUPPRESS==', use_gpu=False, use_xpu=False, use_npu=False, use_mlu=False, ir_optim=True, use_tensorrt=False, min_subgraph_size=15, precision='fp32', gpu_mem=500, gpu_id=0, image_dir=None, page_num=0, det_algorithm='DB', det_model_dir='/Users/admin/.paddleocr/whl/det/ch/ch_PP-OCRv4_det_infer', det_limit_side_len=960, det_limit_type='max', det_box_type='quad', det_db_thresh=0.3, det_db_box_thresh=0.6, det_db_unclip_ratio=1.5, max_batch_size=10, use_dilation=False, det_db_score_mode='fast', det_east_score_thresh=0.8, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_sast_score_thresh=0.5, det_sast_nms_thresh=0.2, det_pse_thresh=0, det_pse_box_thresh=0.85, det_pse_min_area=16, det_pse_scale=1, scales=[8, 16, 32], alpha=1.0, beta=1.0, fourier_degree=5, rec_algorithm='SVTR_LCNet', rec_model_dir='/Users/admin/.paddleocr/whl/rec/ch/ch_PP-OCRv4_rec_infer', rec_image_inverse=True, rec_image_shape='3, 48, 320', rec_batch_num=6, max_text_length=25, rec_char_dict_path='/Users/admin/miniconda3/envs/test/lib/python3.10/site-packages/paddleocr/ppocr/utils/ppocr_keys_v1.txt', use_space_char=True, vis_font_path='./doc/fonts/simfang.ttf', drop_score=0.5, e2e_algorithm='PGNet', e2e_model_dir=None, e2e_limit_side_len=768, e2e_limit_type='max', e2e_pgnet_score_thresh=0.5, e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_pgnet_valid_set='totaltext', e2e_pgnet_mode='fast', use_angle_cls=True, cls_model_dir='/Users/admin/.paddleocr/whl/cls/ch_ppocr_mobile_v2.0_cls_infer', cls_image_shape='3, 48, 192', label_list=['0', '180'], cls_batch_num=6, cls_thresh=0.9, enable_mkldnn=False, cpu_threads=10, use_pdserving=False, warmup=False, sr_model_dir=None, sr_image_shape='3, 32, 128', sr_batch_num=1, draw_img_save_dir='./inference_results', save_crop_res=False, crop_res_save_dir='./output', use_mp=False, total_process_num=1, process_id=0, benchmark=False, save_log_path='./log_output/', show_log=True, use_onnx=False, 
return_word_box=False, output='./output', table_max_len=488, table_algorithm='TableAttn', table_model_dir=None, merge_no_span_structure=True, table_char_dict_path=None, layout_model_dir=None, layout_dict_path=None, layout_score_threshold=0.5, layout_nms_threshold=0.5, kie_algorithm='LayoutXLM', ser_model_dir=None, re_model_dir=None, use_visual_backbone=True, ser_dict_path='../train_data/XFUND/class_list_xfun.txt', ocr_order_method=None, mode='structure', image_orientation=False, layout=True, table=True, ocr=True, recovery=False, use_pdf2docx_api=False, invert=False, binarize=False, alphacolor=(255, 255, 255), lang='ch', det=True, rec=True, type='ocr', savefile=False, ocr_version='PP-OCRv4', structure_version='PP-StructureV2')
[2024/08/07 14:23:57] ppocr DEBUG: dt_boxes num : 25, elapsed : 0.5215017795562744
[2024/08/07 14:23:57] ppocr DEBUG: cls num  : 25, elapsed : 0.18526315689086914
[2024/08/07 14:24:03] ppocr DEBUG: rec_res num  : 25, elapsed : 6.003919839859009
E0807 14:24:07.350140 1256617152 analysis_config.cc:653] Please compile with MKLDNN first to use MKLDNN
[2024-08-07 14:24:09,855] [    INFO] - We are using <class 'paddlenlp.transformers.ernie_layout.tokenizer.ErnieLayoutTokenizer'> to load '/Users/admin/.paddlenlp/taskflow/information_extraction/uie-x-base'.
Traceback (most recent call last):
  File "/Users/admin/Desktop/pythonProjects/test/test.py", line 27, in <module>
    ie_result = ie_task({"doc": you_img_path, "layout": ocr_layout})
  File "/Users/admin/miniconda3/envs/test/lib/python3.10/site-packages/paddlenlp/taskflow/taskflow.py", line 817, in __call__
    results = self.task_instance(inputs, **kwargs)
  File "/Users/admin/miniconda3/envs/test/lib/python3.10/site-packages/paddlenlp/taskflow/task.py", line 527, in __call__
    outputs = self._run_model(inputs, **kwargs)
  File "/Users/admin/miniconda3/envs/test/lib/python3.10/site-packages/paddlenlp/taskflow/information_extraction.py", line 1068, in _run_model
    results = self._multi_stage_predict(_inputs)
  File "/Users/admin/miniconda3/envs/test/lib/python3.10/site-packages/paddlenlp/taskflow/information_extraction.py", line 1166, in _multi_stage_predict
    result_list = self._single_stage_predict(examples)
  File "/Users/admin/miniconda3/envs/test/lib/python3.10/site-packages/paddlenlp/taskflow/information_extraction.py", line 979, in _single_stage_predict
    self.predictor.run()
ValueError: (InvalidArgument) Variable value (input) of OP(fluid.layers.embedding) expected >= 0 and < 1024, but got 1049. Please check input value.
  [Hint: Expected ids[i] < row_number, but received ids[i]:1049 >= row_number:1024.] (at /Users/paddle/xly/workspace/9a389d1e-5f81-4294-a204-ca0214ddf827/Paddle/paddle/phi/kernels/cpu/embedding_kernel.cc:61)
  [operator < lookup_table_v2 > error]
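
The last lines of the traceback can be read as a failed bounds check on an embedding lookup: the table has 1024 rows, and one of the integer ids passed to it was 1049. Below is a minimal, library-free sketch of that check. `check_embedding_ids` is a hypothetical helper written for illustration, not a Paddle function, and the log does not say which model input carried the value 1049.

```python
def check_embedding_ids(ids, num_rows):
    """Return (position, value) pairs that a table with num_rows rows would reject.

    Mirrors the condition in the error message: every id must satisfy
    0 <= id < num_rows.
    """
    return [(i, v) for i, v in enumerate(ids) if not 0 <= v < num_rows]


# Illustrative ids only; 1049 is the value reported in the traceback.
bad = check_embedding_ids([12, 1023, 1049], num_rows=1024)
print(bad)  # [(2, 1049)]
```

Running a check like this on each integer input before calling the predictor would show which one goes out of range.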

Code

from paddleocr import PaddleOCR
from paddlenlp import Taskflow

you_img_path = "./data/goods3.jpeg"
you_schema = [ "价格", "品牌", "型号", "内存容量", "成色" ]

# Select the OCR model via ocr_version
ocr = PaddleOCR(use_angle_cls=True, lang="ch", ocr_version="PP-OCRv4")

# Run OCR on the image
ocr_result = ocr.ocr(you_img_path, rec=True)

# Build the layout argument from the OCR results
ocr_layout = []
for res in ocr_result:
    for item in res:
        x1, y1 = item[0][0]
        x2, y2 = item[0][2]
        text = item[1][0]
        ocr_layout.append(([x1, y1, x2, y2], text))

# print(ocr_layout)

ie_task = Taskflow("information_extraction", schema=you_schema, model="uie-x-base", layout_analysis=True)

# Run UIE prediction
ie_result = ie_task({"doc": you_img_path, "layout": ocr_layout})

print(ie_result)
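
The layout loop in the script takes the first and third points of each detected quadrilateral as the box corners, which only gives a valid box when the text is axis-aligned. The sketch below builds the box from the minimum and maximum of all four points and lists any box with a coordinate outside a given limit. It assumes the OCR result has the same shape the script iterates over (a list of pages, each a list of `[quad, (text, ...)]` items). The helper names and the sample data are hypothetical, and whether box coordinates are what triggers the error above is an assumption, not something the log confirms.

```python
def quad_to_bbox(quad):
    """Convert four [x, y] points into [x_min, y_min, x_max, y_max]."""
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    return [min(xs), min(ys), max(xs), max(ys)]


def build_layout(ocr_result):
    """Build (bbox, text) tuples, skipping pages that are None or empty."""
    layout = []
    for page in ocr_result or []:
        if not page:  # assumption: a page without detections may be None
            continue
        for item in page:
            layout.append((quad_to_bbox(item[0]), item[1][0]))
    return layout


def boxes_outside(layout, limit):
    """Return entries with any coordinate outside the range [0, limit)."""
    return [(b, t) for b, t in layout if min(b) < 0 or max(b) >= limit]


# Hypothetical sample shaped like the result used in the script above.
sample = [[
    [[[10, 20], [110, 18], [112, 48], [12, 50]], ("iPhone 13", 0.98)],
    [[[900, 1000], [1049, 1000], [1049, 1040], [900, 1040]], ("128G", 0.95)],
]]
layout = build_layout(sample)
print(boxes_outside(layout, limit=1024))
```

If this reports boxes for the failing image, rescaling the image or the coordinates before building `ocr_layout` is one thing to try.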

Input image

[Image: goods3]

tianchiguaixia commented 1 month ago

Just switch to LayoutLMv3; it is stable and gives good results.