PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0

When training the KIE RE model on a custom dataset, I get ValueError: (InvalidArgument) x dim number should greater than 0, but received value is: 0 [Hint: Expected x_dim > 0, but received x_dim:0 <= 0:0.] (at ../paddle/phi/backends/gpu/gpu_launch_config.h:180) #13632

Open freezehe opened 3 months ago

freezehe commented 3 months ago

Search before asking

Bug

Hello, I am training on a custom dataset. First, here is one line of my annotation file:

zh_train_0.jpg [{"transcription":"委托方名称","points":[[225,1119],[528,1119],[528,1181],[225,1181]],"id":1,"label":"wtfmc_key","linking":[[1,2]]},{"transcription":"上海蝶叶电线电缆有限公司","points":[[1130,1119],[1732,1119],[1732,1181],[1130,1181]],"id":2,"label":"wtfmc_value","linking":[[1,2]]},{"transcription":"委托方地址","points":[[225,1254],[524,1254],[524,1316],[225,1316]],"id":3,"label":"wtfdz_key","linking":[[3,4]]},{"transcription":"嘉定区银龙路258弄14号12幢3层","points":[[1041,1287],[1839,1287],[1839,1338],[1041,1338]],"id":4,"label":"wtfdz_value","linking":[[3,4]]},{"transcription":"委托单编号","points":[[225,1386],[517,1386],[517,1448],[225,1448]],"id":5,"label":"wtdbh_key","linking":[[5,6]]},{"transcription":"2020-8047","points":[[1311,1422],[1540,1422],[1540,1473],[1311,1473]],"id":6,"label":"wtdbh_value","linking":[[5,6]]},{"transcription":"样品名称","points":[[222,1517],[461,1517],[461,1580],[222,1580]],"id":7,"label":"ypmc_key","linking":[[7,8]]},{"transcription":"电子天平","points":[[1325,1547],[1536,1547],[1536,1612],[1325,1612]],"id":8,"label":"ypmc_value","linking":[[7,8]]},{"transcription":"型号/规格","points":[[222,1649],[476,1649],[476,1711],[222,1711]],"id":9,"label":"xhgg_key","linking":[[9,10]]},{"transcription":"ES461","points":[[1355,1682],[1514,1682],[1514,1737],[1355,1737]],"id":10,"label":"xhgg_value","linking":[[9,10]]},{"transcription":"制造厂","points":[[225,1781],[395,1781],[395,1835],[225,1835]],"id":11,"label":"zzc_key","linking":[[11,12]]},{"transcription":"HC","points":[[1384,1814],[1469,1814],[1469,1872],[1384,1872]],"id":12,"label":"zzc_value","linking":[[11,12]]},{"transcription":"样品编号","points":[[225,1909],[461,1909],[461,1971],[225,1971]],"id":13,"label":"ypbh_key","linking":[[13,14]]},{"transcription":"/","points":[[1404,1950],[1435,1943],[1446,1987],[1416,1995]],"id":14,"label":"ypbh_value","linking":[[13,14]]},{"transcription":"委托日期","points":[[223,2032],[466,2041],[464,2107],[221,2098]],"id":15,"label":"wtrq_key","linking":[[15,16]]},{"transcription":"2020年08月24日","points":[[1204,2077],[1636,2077],[1636,2128],[1204,2128]],"id":16,"label":"wtrq_value","linking":[[15,16]]}]

The contents of class_list_xfun.txt are as follows:

WTFMC_KEY
WTFMC_VALUE
WTFDZ_KEY
WTFDZ_VALUE
WTDBH_KEY
WTDBH_VALUE
YPMC_KEY
YPMC_VALUE
XHGG_KEY
XHGG_VALUE
ZZC_KEY
ZZC_VALUE
YPBH_KEY
YPBH_VALUE
WTRQ_KEY
WTRQ_VALUE

I modified the config file /home/aistudio/PaddleOCR/configs/kie/vi_layoutxlm/re_vi_layoutxlm_xfund_zh.yml.
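A label line like the one above can be sanity-checked before training. The following is only a minimal sketch and not part of PaddleOCR: `check_label_line` is a hypothetical helper, the tab separator in `SAMPLE` is an assumption (the code accepts any whitespace), and comparing labels to the class list case-insensitively is also an assumption. It checks that every label appears in the class list and that every `linking` pair refers to ids present on the same image. Whether any of these problems is related to the ValueError is not established here.

```python
import json

# Two-box excerpt of the annotation line quoted in the issue.
# Assumption: image name and JSON payload are separated by whitespace (a tab here).
SAMPLE = (
    'zh_train_0.jpg\t'
    '[{"transcription":"委托方名称","points":[[225,1119],[528,1119],[528,1181],[225,1181]],'
    '"id":1,"label":"wtfmc_key","linking":[[1,2]]},'
    '{"transcription":"上海蝶叶电线电缆有限公司","points":[[1130,1119],[1732,1119],[1732,1181],[1130,1181]],'
    '"id":2,"label":"wtfmc_value","linking":[[1,2]]}]'
)
CLASSES = ["WTFMC_KEY", "WTFMC_VALUE"]


def check_label_line(line, class_names):
    """Return a list of problems found in one label-file line (hypothetical helper)."""
    image_name, payload = line.strip().split(None, 1)
    boxes = json.loads(payload)
    known = {name.upper() for name in class_names}  # case-insensitive comparison (assumption)
    ids = {box["id"] for box in boxes}
    problems = []
    for box in boxes:
        if box["label"].upper() not in known:
            problems.append(f"{image_name}: id {box['id']} has unknown label {box['label']!r}")
        for pair in box.get("linking", []):
            # Each linking entry should be a pair of ids that both exist on this image.
            if len(pair) != 2 or any(i not in ids for i in pair):
                problems.append(f"{image_name}: id {box['id']} has bad linking {pair}")
    if not any(box.get("linking") for box in boxes):
        problems.append(f"{image_name}: no linking pairs at all")
    return problems


if __name__ == "__main__":
    for problem in check_label_line(SAMPLE, CLASSES):
        print(problem)
```

Running the same check over every line of the train and eval label files would show whether any image ends up with no usable key/value pairs.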

What kind of bug is this?

Environment

I am training on Baidu AI Studio.

aiofiles==23.2.1 aiohttp==3.9.5 aiosignal==1.3.1 aistudio-sdk @ file:///home/aistudio/aistudio_sdk-0.2.4-py3-none-any.whl#sha256=d93411cc8764e465860cbf2f97f787dddd1548595d4776c97ddf0ea787dedd81 albucore==0.0.13 albumentations==1.4.10 altair==4.2.2 annotated-types==0.6.0 anyio==4.3.0 astor==0.8.1 asttokens==2.4.1 async-timeout==4.0.3 attrdict3==2.0.2 attrs==23.2.0 Babel==2.14.0 bce-python-sdk==0.9.6 beautifulsoup4==4.12.3 blinker==1.7.0 cachetools==5.3.3 certifi==2024.2.2 charset-normalizer==3.3.2 click==8.1.7 colorama==0.4.6 coloredlogs==15.0.1 colorlog==6.8.2 comm==0.2.2 contourpy==1.2.1 cycler==0.12.1 Cython==3.0.11 datasets==2.19.0 debugpy==1.8.1 decorator==5.1.1 dill==0.3.4 easydict==1.13 entrypoints==0.4 exceptiongroup==1.2.1 executing==2.0.1 fastapi==0.110.2 ffmpy==0.3.2 filelock==3.13.4 fire==0.6.0 Flask==3.0.3 Flask-Babel==2.0.0 flatbuffers==24.3.25 fonttools==4.51.0 frozenlist==1.4.1 fsspec==2024.3.1 future==1.0.0 gitdb==4.0.11 GitPython==3.1.43 gradio==3.40.0 gradio_client==0.15.1 gunicorn==22.0.0 h11==0.14.0 httpcore==1.0.5 httpx==0.27.0 huggingface-hub==0.22.2 humanfriendly==10.0 idna==3.7 imageio==2.34.2 imgaug==0.4.0 importlib_metadata==7.1.0 importlib_resources==6.4.0 ipykernel==6.29.4 ipython==8.23.0 itsdangerous==2.2.0 jedi==0.19.1 jieba==0.42.1 Jinja2==3.1.3 joblib==1.4.0 jsonschema==4.21.1 jsonschema-specifications==2023.12.1 jupyter_client==8.6.1 jupyter_core==5.7.2 kiwisolver==1.4.5 lazy_loader==0.4 linkify-it-py==2.0.3 lmdb==1.5.1 lxml==5.2.2 markdown-it-py==2.2.0 MarkupSafe==2.1.5 matplotlib==3.8.4 matplotlib-inline==0.1.7 mdit-py-plugins==0.3.3 mdurl==0.1.1 mpmath==1.3.0 multidict==6.0.5 multiprocess==0.70.12.2 nest-asyncio==1.6.0 networkx==3.3 numpy==1.26.4 onnx==1.16.0 onnxruntime==1.17.3 opencv-contrib-python==4.10.0.84 opencv-python==4.9.0.80 opencv-python-headless==4.10.0.84 opt-einsum==3.3.0 orjson==3.10.1 packaging==24.0 paddle2onnx==1.2.1 paddlefsl==1.1.0 paddlehub==2.4.0 paddlenlp==2.5.2 paddleocr==2.8.1
paddlepaddle-gpu @ file:///tmp/paddlepaddle_gpu-2.5.2-cp310-cp310-linux_x86_64.whl#sha256=2b4a84c853c7c88ddf4984c667bfcb824cc8a28a674448099452f50c686cc1bb pandas==2.2.2 parso==0.8.4 pexpect==4.9.0 pickleshare==0.7.5 pillow==10.3.0 platformdirs==4.2.0 prettytable==3.10.0 prompt-toolkit==3.0.43 protobuf==3.20.3 psutil==5.9.8 ptyprocess==0.7.0 pure-eval==0.2.2 pyarrow==16.0.0 pyarrow-hotfix==0.6 pybind11==2.12.0 pyclipper==1.3.0.post5 pycryptodome==3.20.0 pydantic==2.7.0 pydantic_core==2.18.1 pydeck==0.9.1 pydub==0.25.1 Pygments==2.17.2 Pympler==1.0.1 pypandoc==1.13 pyparsing==3.1.2 python-dateutil==2.9.0.post0 python-docx==1.1.2 python-multipart==0.0.9 pytz==2024.1 PyYAML==6.0.1 pyzmq==26.0.2 rapidfuzz==3.9.6 rarfile==4.2 referencing==0.34.0 requests==2.31.0 rich==13.7.1 rpds-py==0.18.0 ruff==0.4.1 safetensors==0.4.3 scikit-image==0.24.0 scikit-learn==1.4.2 scipy==1.13.0 semantic-version==2.10.0 semver==3.0.2 sentencepiece==0.2.0 seqeval==1.2.2 shapely==2.0.5 shellingham==1.5.4 six==1.16.0 smmap==5.0.1 sniffio==1.3.1 soupsieve==2.5 stack-data==0.6.3 starlette==0.37.2 streamlit==1.13.0 streamlit-image-comparison==0.0.4 sympy==1.12 termcolor==2.4.0 threadpoolctl==3.4.0 tifffile==2024.7.24 toml==0.10.2 tomli==2.0.1 tomlkit==0.12.0 tool-helpers==0.1.1 toolz==0.12.1 tornado==6.4 tqdm==4.66.2 traitlets==5.14.3 typer==0.12.3 typing_extensions==4.11.0 tzdata==2024.1 tzlocal==5.2 uc-micro-py==1.0.3 urllib3==2.2.1 uvicorn==0.29.0 validators==0.28.3 visualdl==2.4.2 watchdog==4.0.1 wcwidth==0.2.13 websockets==11.0.3 Werkzeug==3.0.2 xxhash==3.4.1 yacs==0.1.8 yarl==1.9.4 zipp==3.19.2

Minimal Reproducible Example

RE config file:

Global:
  use_gpu: True
  epoch_num: &epoch_num 130
  log_smooth_window: 10
  print_batch_step: 10
  save_model_dir: ./output/ccic/re_vi_layoutxlm_xfund_zh
  save_epoch_step: 2000
  # evaluation is run every 10 iterations after the 0th iteration
  eval_batch_step: [ 0, 19 ]
  cal_metric_during_train: False
  save_inference_dir:
  use_visualdl: False
  seed: 2022
  infer_img: ppstructure/docs/kie/input/zh_val_21.jpg
  save_res_path: ./output/ccic/re/xfund_zh/with_gt
  kie_rec_model_dir:
  kie_det_model_dir:

Architecture:
  model_type: kie
  algorithm: &algorithm "LayoutXLM"
  Transform:
  Backbone:
    name: LayoutXLMForRe
    pretrained: True
    mode: vi
    checkpoints:

Loss:
  name: LossFromOutput
  key: loss
  reduction: mean

Optimizer:
  name: AdamW
  beta1: 0.9
  beta2: 0.999
  clip_norm: 10
  lr:
    learning_rate: 0.00005
    warmup_epoch: 10
  regularizer:
    name: L2
    factor: 0.00000

PostProcess:
  name: VQAReTokenLayoutLMPostProcess

Metric:
  name: VQAReTokenMetric
  main_indicator: hmean

Train:
  dataset:
    name: SimpleDataSet
    data_dir: train_data/0810_8020/zh_train/image
    label_file_list:

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: train_data/0810_8020/zh_val/image
    label_file_list:

Additional

No response

Are you willing to submit a PR?

github-actions[bot] commented 3 days ago

This issue is stale because it has been open for 90 days with no activity.