Search before asking

[X] I have searched the PaddleOCR Docs and found no similar bug report.
[X] I have searched the PaddleOCR Issues and found no similar bug report.
[X] I have searched the PaddleOCR Discussions and found no similar bug report.

Bug

Hello, I am training on a custom dataset. Here is one line from my annotation file:

zh_train_0.jpg [{"transcription":"委托方名称","points":[[225,1119],[528,1119],[528,1181],[225,1181]],"id":1,"label":"wtfmc_key","linking":[[1,2]]},{"transcription":"上海蝶叶电线电缆有限公司","points":[[1130,1119],[1732,1119],[1732,1181],[1130,1181]],"id":2,"label":"wtfmc_value","linking":[[1,2]]},{"transcription":"委托方地址","points":[[225,1254],[524,1254],[524,1316],[225,1316]],"id":3,"label":"wtfdz_key","linking":[[3,4]]},{"transcription":"嘉定区银龙路258弄14号12幢3层","points":[[1041,1287],[1839,1287],[1839,1338],[1041,1338]],"id":4,"label":"wtfdz_value","linking":[[3,4]]},{"transcription":"委托单编号","points":[[225,1386],[517,1386],[517,1448],[225,1448]],"id":5,"label":"wtdbh_key","linking":[[5,6]]},{"transcription":"2020-8047","points":[[1311,1422],[1540,1422],[1540,1473],[1311,1473]],"id":6,"label":"wtdbh_value","linking":[[5,6]]},{"transcription":"样品名称","points":[[222,1517],[461,1517],[461,1580],[222,1580]],"id":7,"label":"ypmc_key","linking":[[7,8]]},{"transcription":"电子天平","points":[[1325,1547],[1536,1547],[1536,1612],[1325,1612]],"id":8,"label":"ypmc_value","linking":[[7,8]]},{"transcription":"型号/规格","points":[[222,1649],[476,1649],[476,1711],[222,1711]],"id":9,"label":"xhgg_key","linking":[[9,10]]},{"transcription":"ES461","points":[[1355,1682],[1514,1682],[1514,1737],[1355,1737]],"id":10,"label":"xhgg_value","linking":[[9,10]]},{"transcription":"制造厂","points":[[225,1781],[395,1781],[395,1835],[225,1835]],"id":11,"label":"zzc_key","linking":[[11,12]]},{"transcription":"HC","points":[[1384,1814],[1469,1814],[1469,1872],[1384,1872]],"id":12,"label":"zzc_value","linking":[[11,12]]},{"transcription":"样品编号","points":[[225,1909],[461,1909],[461,1971],[225,1971]],"id":13,"label":"ypbh_key","linking":[[13,14]]},{"transcription":"/","points":[[1404,1950],[1435,1943],[1446,1987],[1416,1995]],"id":14,"label":"ypbh_value","linking":[[13,14]]},{"transcription":"委托日期","points":[[223,2032],[466,2041],[464,2107],[221,2098]],"id":15,"label":"wtrq_key","linking":[[15,16]]},{"transcription":"2020年08月24日","points":[[1204,2077],[1636,2077],[1636,2128],[1204,2128]],"id":16,"label":"wtrq_value","linking":[[15,16]]}]

The contents of class_list_xfun.txt are:

WTFMC_KEY
WTFMC_VALUE
WTFDZ_KEY
WTFDZ_VALUE
WTDBH_KEY
WTDBH_VALUE
YPMC_KEY
YPMC_VALUE
XHGG_KEY
XHGG_VALUE
ZZC_KEY
ZZC_VALUE
YPBH_KEY
YPBH_VALUE
WTRQ_KEY
WTRQ_VALUE

I modified the config file /home/aistudio/PaddleOCR/configs/kie/vi_layoutxlm/re_vi_layoutxlm_xfund_zh.yml as follows:
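Before training, it may be worth checking that each annotation line is internally consistent with the class list. The following is a minimal sketch, not part of PaddleOCR: `check_annotation` is a hypothetical helper, and it assumes labels are compared case-insensitively (the annotation uses lowercase labels such as `wtfmc_key`, while class_list_xfun.txt uses uppercase). It only reports labels that are missing from the class list and `linking` pairs that point to a nonexistent `id`.

```python
import json


def check_annotation(annotation_json, class_names):
    """Return a list of problems found in one annotation (hypothetical helper)."""
    entities = json.loads(annotation_json)
    # Assumption: labels are matched case-insensitively against the class list.
    known = {name.upper() for name in class_names}
    ids = {entity["id"] for entity in entities}
    problems = []
    for entity in entities:
        if entity["label"].upper() not in known:
            problems.append(
                f"id {entity['id']}: label {entity['label']!r} not in class list"
            )
        for head, tail in entity.get("linking", []):
            if head not in ids or tail not in ids:
                problems.append(
                    f"id {entity['id']}: linking [{head}, {tail}] has a missing id"
                )
    return problems


# First two entities of the annotation line shown above.
sample = json.dumps([
    {"transcription": "委托方名称",
     "points": [[225, 1119], [528, 1119], [528, 1181], [225, 1181]],
     "id": 1, "label": "wtfmc_key", "linking": [[1, 2]]},
    {"transcription": "上海蝶叶电线电缆有限公司",
     "points": [[1130, 1119], [1732, 1119], [1732, 1181], [1130, 1181]],
     "id": 2, "label": "wtfmc_value", "linking": [[1, 2]]},
])
classes = ["WTFMC_KEY", "WTFMC_VALUE"]

print(check_annotation(sample, classes))  # [] means no problems were found
```

An empty result only means the labels and ids line up; it does not by itself show that the training pipeline accepts these label names.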

VQAReTokenChunk:
  max_seq_len: *max_seq_len
  entities_labels: {"WTFMC_KEY": 1, "WTFMC_VALUE": 2, "WTFDZ_KEY":3, "WTFDZ_VALUE":4, "WTDBH_KEY":5,"WTDBH_VALUE": 6, "YPMC_KEY": 7, "YPMC_VALUE": 8, "XHGG_KEY":9, "XHGG_VALUE":10, "ZZC_KEY":11,"ZZC_VALUE": 12, "YPBH_KEY": 13, "YPBH_VALUE": 14, "WTRQ_KEY":15, "WTRQ_VALUE":16}

That is, I added the entities_labels attribute. Training then fails with the following output:

[2024/08/10 14:00:47] ppocr INFO: During the training process, after the 0th iteration, an evaluation is run every 19 iterations
Traceback (most recent call last):
  File "/home/aistudio/PaddleOCR/tools/train.py", line 255, in <module>
    main(config, device, logger, vdl_writer, seed)
  File "/home/aistudio/PaddleOCR/tools/train.py", line 208, in main
    program.train(
  File "/home/aistudio/PaddleOCR/tools/program.py", line 342, in train
    preds = model(batch)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1254, in __call__
    return self.forward(*inputs, **kwargs)
  File "/home/aistudio/PaddleOCR/ppocr/modeling/architectures/base_model.py", line 85, in forward
    x = self.backbone(x)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1254, in __call__
    return self.forward(*inputs, **kwargs)
  File "/home/aistudio/PaddleOCR/ppocr/modeling/backbones/vqa_layoutlm.py", line 248, in forward
    x = self.model(
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1254, in __call__
    return self.forward(*inputs, **kwargs)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/layoutxlm/modeling.py", line 1412, in forward
    loss, pred_relations = self.extractor(sequence_output, entities, relations)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/nn/layer/layers.py", line 1254, in __call__
    return self.forward(*inputs, **kwargs)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/layoutxlm/modeling.py", line 1304, in forward
    relations, entities = self.build_relation(relations, entities)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddlenlp/transformers/layoutxlm/modeling.py", line 1258, in build_relation
    positive_relations = paddle.stack([relation_head, relation_tail], axis=1)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/tensor/manipulation.py", line 1842, in stack
    return _C_ops.stack(x, axis)
ValueError: (InvalidArgument) x dim number should greater than 0, but received value is: 0 [Hint: Expected x_dim > 0, but received x_dim:0 <= 0:0.] (at ../paddle/phi/backends/gpu/gpu_launch_config.h:180)
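The traceback ends in `paddle.stack([relation_head, relation_tail], axis=1)` inside `build_relation`, with a complaint about a zero-sized input. One thing that might be worth ruling out (an assumption, not a confirmed diagnosis) is that some sample reaches the model with no usable relations. The sketch below works on the raw label data only and does not reproduce what the PaddleOCR transforms do; `unique_relations` and `samples_without_relations` are hypothetical helpers, and the second file name is made up for illustration.

```python
import json


def unique_relations(annotation_json):
    """Collect distinct (head, tail) linking pairs whose ids both exist."""
    entities = json.loads(annotation_json)
    ids = {entity["id"] for entity in entities}
    pairs = set()
    for entity in entities:
        for head, tail in entity.get("linking", []):
            if head in ids and tail in ids:
                pairs.add((head, tail))
    return pairs


def samples_without_relations(samples):
    """samples: iterable of (image_name, annotation_json) tuples."""
    return [name for name, ann in samples if not unique_relations(ann)]


samples = [
    # Shortened version of the annotation shown above.
    ("zh_train_0.jpg", json.dumps([
        {"id": 1, "label": "wtfmc_key", "linking": [[1, 2]]},
        {"id": 2, "label": "wtfmc_value", "linking": [[1, 2]]},
    ])),
    # Hypothetical sample with no links at all.
    ("hypothetical_empty.jpg", json.dumps([
        {"id": 1, "label": "wtfmc_key", "linking": []},
    ])),
]

print(samples_without_relations(samples))  # ['hypothetical_empty.jpg']
```

If every sample has at least one raw linking pair, the relations may still be dropped later in the pipeline, so an empty result here narrows the search rather than ending it.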

What is causing this error?

Environment

I am training on Baidu AI Studio. Installed packages:

aiofiles==23.2.1
aiohttp==3.9.5
aiosignal==1.3.1
aistudio-sdk @ file:///home/aistudio/aistudio_sdk-0.2.4-py3-none-any.whl#sha256=d93411cc8764e465860cbf2f97f787dddd1548595d4776c97ddf0ea787dedd81
albucore==0.0.13
albumentations==1.4.10
altair==4.2.2
annotated-types==0.6.0
anyio==4.3.0
astor==0.8.1
asttokens==2.4.1
async-timeout==4.0.3
attrdict3==2.0.2
attrs==23.2.0
Babel==2.14.0
bce-python-sdk==0.9.6
beautifulsoup4==4.12.3
blinker==1.7.0
cachetools==5.3.3
certifi==2024.2.2
charset-normalizer==3.3.2
click==8.1.7
colorama==0.4.6
coloredlogs==15.0.1
colorlog==6.8.2
comm==0.2.2
contourpy==1.2.1
cycler==0.12.1
Cython==3.0.11
datasets==2.19.0
debugpy==1.8.1
decorator==5.1.1
dill==0.3.4
easydict==1.13
entrypoints==0.4
exceptiongroup==1.2.1
executing==2.0.1
fastapi==0.110.2
ffmpy==0.3.2
filelock==3.13.4
fire==0.6.0
Flask==3.0.3
Flask-Babel==2.0.0
flatbuffers==24.3.25
fonttools==4.51.0
frozenlist==1.4.1
fsspec==2024.3.1
future==1.0.0
gitdb==4.0.11
GitPython==3.1.43
gradio==3.40.0
gradio_client==0.15.1
gunicorn==22.0.0
h11==0.14.0
httpcore==1.0.5
httpx==0.27.0
huggingface-hub==0.22.2
humanfriendly==10.0
idna==3.7
imageio==2.34.2
imgaug==0.4.0
importlib_metadata==7.1.0
importlib_resources==6.4.0
ipykernel==6.29.4
ipython==8.23.0
itsdangerous==2.2.0
jedi==0.19.1
jieba==0.42.1
Jinja2==3.1.3
joblib==1.4.0
jsonschema==4.21.1
jsonschema-specifications==2023.12.1
jupyter_client==8.6.1
jupyter_core==5.7.2
kiwisolver==1.4.5
lazy_loader==0.4
linkify-it-py==2.0.3
lmdb==1.5.1
lxml==5.2.2
markdown-it-py==2.2.0
MarkupSafe==2.1.5
matplotlib==3.8.4
matplotlib-inline==0.1.7
mdit-py-plugins==0.3.3
mdurl==0.1.1
mpmath==1.3.0
multidict==6.0.5
multiprocess==0.70.12.2
nest-asyncio==1.6.0
networkx==3.3
numpy==1.26.4
onnx==1.16.0
onnxruntime==1.17.3
opencv-contrib-python==4.10.0.84
opencv-python==4.9.0.80
opencv-python-headless==4.10.0.84
opt-einsum==3.3.0
orjson==3.10.1
packaging==24.0
paddle2onnx==1.2.1
paddlefsl==1.1.0
paddlehub==2.4.0
paddlenlp==2.5.2
paddleocr==2.8.1
paddlepaddle-gpu @ file:///tmp/paddlepaddle_gpu-2.5.2-cp310-cp310-linux_x86_64.whl#sha256=2b4a84c853c7c88ddf4984c667bfcb824cc8a28a674448099452f50c686cc1bb
pandas==2.2.2
parso==0.8.4
pexpect==4.9.0
pickleshare==0.7.5
pillow==10.3.0
platformdirs==4.2.0
prettytable==3.10.0
prompt-toolkit==3.0.43
protobuf==3.20.3
psutil==5.9.8
ptyprocess==0.7.0
pure-eval==0.2.2
pyarrow==16.0.0
pyarrow-hotfix==0.6
pybind11==2.12.0
pyclipper==1.3.0.post5
pycryptodome==3.20.0
pydantic==2.7.0
pydantic_core==2.18.1
pydeck==0.9.1
pydub==0.25.1
Pygments==2.17.2
Pympler==1.0.1
pypandoc==1.13
pyparsing==3.1.2
python-dateutil==2.9.0.post0
python-docx==1.1.2
python-multipart==0.0.9
pytz==2024.1
PyYAML==6.0.1
pyzmq==26.0.2
rapidfuzz==3.9.6
rarfile==4.2
referencing==0.34.0
requests==2.31.0
rich==13.7.1
rpds-py==0.18.0
ruff==0.4.1
safetensors==0.4.3
scikit-image==0.24.0
scikit-learn==1.4.2
scipy==1.13.0
semantic-version==2.10.0
semver==3.0.2
sentencepiece==0.2.0
seqeval==1.2.2
shapely==2.0.5
shellingham==1.5.4
six==1.16.0
smmap==5.0.1
sniffio==1.3.1
soupsieve==2.5
stack-data==0.6.3
starlette==0.37.2
streamlit==1.13.0
streamlit-image-comparison==0.0.4
sympy==1.12
termcolor==2.4.0
threadpoolctl==3.4.0
tifffile==2024.7.24
toml==0.10.2
tomli==2.0.1
tomlkit==0.12.0
tool-helpers==0.1.1
toolz==0.12.1
tornado==6.4
tqdm==4.66.2
traitlets==5.14.3
typer==0.12.3
typing_extensions==4.11.0
tzdata==2024.1
tzlocal==5.2
uc-micro-py==1.0.3
urllib3==2.2.1
uvicorn==0.29.0
validators==0.28.3
visualdl==2.4.2
watchdog==4.0.1
wcwidth==0.2.13
websockets==11.0.3
Werkzeug==3.0.2
xxhash==3.4.1
yacs==0.1.8
yarl==1.9.4
zipp==3.19.2

Minimal Reproducible Example

The RE config file:

```yaml
Global:
  use_gpu: True
  epoch_num: &epoch_num 130
  log_smooth_window: 10
  print_batch_step: 10
  save_model_dir: ./output/ccic/re_vi_layoutxlm_xfund_zh
  save_epoch_step: 2000
  # evaluation is run every 10 iterations after the 0th iteration
  eval_batch_step: [ 0, 19 ]
  cal_metric_during_train: False
  save_inference_dir:
  use_visualdl: False
  seed: 2022
  infer_img: ppstructure/docs/kie/input/zh_val_21.jpg
  save_res_path: ./output/ccic/re/xfund_zh/with_gt
  kie_rec_model_dir:
  kie_det_model_dir:

Architecture:
  model_type: kie
  algorithm: &algorithm "LayoutXLM"
  Transform:
  Backbone:
    name: LayoutXLMForRe
    pretrained: True
    mode: vi
    checkpoints:

Loss:
  name: LossFromOutput
  key: loss
  reduction: mean

Optimizer:
  name: AdamW
  beta1: 0.9
  beta2: 0.999
  clip_norm: 10
  lr:
    learning_rate: 0.00005
    warmup_epoch: 10
  regularizer:
    name: L2
    factor: 0.00000

PostProcess:
  name: VQAReTokenLayoutLMPostProcess

Metric:
  name: VQAReTokenMetric
  main_indicator: hmean

Train:
  dataset:
    name: SimpleDataSet
    data_dir: train_data/0810_8020/zh_train/image
    label_file_list:
      - train_data/0810_8020/zh_train/train.json
    ratio_list: [ 1.0 ]
    transforms:
      - DecodeImage: # load image
          img_mode: RGB
          channel_first: False
      - VQATokenLabelEncode: # Class handling label
          contains_re: True
          algorithm: *algorithm
          class_path: &class_path /home/aistudio/PaddleOCR/train_data/0810_8020/class_list_xfun.txt
          # class_path: /home/aistudio/PaddleOCR/train_data/0810_8020/class_list_xfun.txt
          use_textline_bbox_info: &use_textline_bbox_info True
          order_method: &order_method "tb-yx"
      - VQATokenPad:
          max_seq_len: &max_seq_len 512
          return_attention_mask: True
      - VQAReTokenRelation:
      - VQAReTokenChunk:
          max_seq_len: *max_seq_len
          entities_labels: {"WTFMC_KEY": 1, "WTFMC_VALUE": 2, "WTFDZ_KEY":3, "WTFDZ_VALUE":4, "WTDBH_KEY":5,"WTDBH_VALUE": 6, "YPMC_KEY": 7, "YPMC_VALUE": 8, "XHGG_KEY":9, "XHGG_VALUE":10, "ZZC_KEY":11,"ZZC_VALUE": 12, "YPBH_KEY": 13, "YPBH_VALUE": 14, "WTRQ_KEY":15, "WTRQ_VALUE":16}
      - TensorizeEntitiesRelations:
      - Resize:
          size: [224,224]
      - NormalizeImage:
          scale: 1
          mean: [ 123.675, 116.28, 103.53 ]
          std: [ 58.395, 57.12, 57.375 ]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: [ 'input_ids', 'bbox','attention_mask', 'token_type_ids', 'entities', 'relations'] # dataloader will return list in this order
  loader:
    shuffle: True
    drop_last: False
    batch_size_per_card: 2
    num_workers: 4

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: train_data/0810_8020/zh_val/image
    label_file_list:
      - train_data/0810_8020/zh_val/val.json
    transforms:
      - DecodeImage: # load image
          img_mode: RGB
          channel_first: False
      - VQATokenLabelEncode: # Class handling label
          contains_re: True
          algorithm: *algorithm
          class_path: *class_path
          use_textline_bbox_info: *use_textline_bbox_info
          order_method: *order_method
      - VQATokenPad:
          max_seq_len: *max_seq_len
          return_attention_mask: True
      - VQAReTokenRelation:
      - VQAReTokenChunk:
          max_seq_len: *max_seq_len
      - TensorizeEntitiesRelations:
      - Resize:
          size: [224,224]
      - NormalizeImage:
          scale: 1
          mean: [ 123.675, 116.28, 103.53 ]
          std: [ 58.395, 57.12, 57.375 ]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: [ 'input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'entities', 'relations'] # dataloader will return list in this order
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 8
    num_workers: 8
```
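In the config above, the Train pipeline passes `entities_labels` to `VQAReTokenChunk` while the Eval pipeline does not. Whether that matters for this error is not established here, but such mismatches are easy to miss. The following is a small sketch for spotting them, assuming the `transforms` lists have already been loaded into Python lists of single-key dicts (for example with a YAML parser); `transform_args` and `arg_differences` are hypothetical helpers, and the dictionaries are trimmed to the relevant entries.

```python
def transform_args(transforms, name):
    """Return the argument dict of the named transform, or None if absent."""
    for item in transforms:
        if name in item:
            # A transform written as "- Name:" with no arguments loads as None.
            return item[name] or {}
    return None


def arg_differences(train_transforms, eval_transforms, name):
    """List argument names present in only one of the two pipelines."""
    train_args = transform_args(train_transforms, name) or {}
    eval_args = transform_args(eval_transforms, name) or {}
    return sorted(set(train_args) ^ set(eval_args))


# Trimmed stand-ins for the Train and Eval transform lists.
train_transforms = [
    {"VQAReTokenRelation": None},
    {"VQAReTokenChunk": {
        "max_seq_len": 512,
        "entities_labels": {"WTFMC_KEY": 1, "WTFMC_VALUE": 2},
    }},
]
eval_transforms = [
    {"VQAReTokenRelation": None},
    {"VQAReTokenChunk": {"max_seq_len": 512}},
]

print(arg_differences(train_transforms, eval_transforms, "VQAReTokenChunk"))
# ['entities_labels']
```

The same comparison can be repeated for each transform name that appears in both pipelines.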

Additional

No response

Are you willing to submit a PR?

[ ] Yes I'd like to help by submitting a PR!

PaddlePaddle / PaddleOCR

When training the KIE RE model on a custom dataset, I get ValueError: (InvalidArgument) x dim number should greater than 0, but received value is: 0 [Hint: Expected x_dim > 0, but received x_dim:0 <= 0:0.] (at ../paddle/phi/backends/gpu/gpu_launch_config.h:180) #13632