KIE : Relation Extraction module in KIE Error : Floating point exception(segmentation dumped)

ChidanandKumarVimaan commented 1 year ago

Please provide the following information to quickly locate the problem

System Environment: Ubuntu 20.04
Version / Related components: paddlenlp: 2.4.5, paddlepaddle-gpu: 2.3.2.post101, PaddleOCR: latest
Command Code: python tools/train.py -c configs/kie/vi_layoutxlm/re_vi_layoutxlm_xfund_zh.yml
Complete Error Message: Floating point exception(segmentation dumped)

This is multi-task fine-tuning on the XFUND dataset: we mix all of the languages and train the Relation Extraction (RE) module in the KIE part. The reason for using multi-task fine-tuning is that the RE F1-score improves greatly when training on multiple languages, as reported on page 8 of https://arxiv.org/pdf/2104.08836.pdf.

The code crashes at https://github.com/PaddlePaddle/PaddleNLP/blob/2583b5ab68393545db68fd9631429de206bab270/paddlenlp/transformers/layoutxlm/modeling.py#L1248. There, all_possible_relations1 and all_possible_relations2 are both empty (shape 0), so paddle.meshgrid is asked to build a grid of size (0, 0) and raises an exception, resulting in "Floating point exception(segmentation dumped)".
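To illustrate the failure condition described above, here is a minimal pure-Python sketch of the pairing step with a guard for the empty case. It is not PaddleNLP's code: the function name `candidate_pairs` and the label ids 1 and 2 for the two sides of a relation are assumptions made for illustration, and `itertools.product` stands in for the meshgrid call.

```python
from itertools import product
from typing import List, Sequence, Tuple


def candidate_pairs(entity_labels: Sequence[int],
                    head_label: int = 1,
                    tail_label: int = 2) -> List[Tuple[int, int]]:
    """Return every (head_index, tail_index) pair, or [] if either side is empty.

    head_label and tail_label are hypothetical label ids chosen for this sketch.
    """
    heads = [i for i, lab in enumerate(entity_labels) if lab == head_label]
    tails = [i for i, lab in enumerate(entity_labels) if lab == tail_label]
    # Guard: with an empty side there is nothing to pair, so return early
    # instead of handing empty inputs to a meshgrid-style operation.
    if not heads or not tails:
        return []
    return list(product(heads, tails))


if __name__ == "__main__":
    print(candidate_pairs([1, 2, 2]))  # one head, two tails
    print(candidate_pairs([0, 0]))     # no heads and no tails
```

A sample whose entities contain neither label produces no candidate pairs at all, which is the situation that the shapes reported above suggest.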

an1018 commented 1 year ago

Is there also the same error when only using the XFUND_zh dataset to train the RE task?

ChidanandKumarVimaan commented 1 year ago

No, these issues are not encountered when using only the XFUND_zh dataset to train the RE task. The problem appears when mixing multilingual data (all 7 languages).

Each language was converted to the format given in the repo using https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/ppstructure/kie/tools/trans_xfun_data.py.

After mixing all the languages, exceptions are thrown when training on the multilingual data.

ChidanandKumarVimaan commented 1 year ago

@an1018, kindly help if you know of any solution, as I have tried my best.

an1018 commented 1 year ago

You can try mixing two languages (zh and anthor) and see if the same error occurs.

ChidanandKumarVimaan commented 1 year ago

@an1018, kindly specify what "anthor" is. I couldn't get it.

Also, the paper "LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding" reports on page 8 (https://arxiv.org/pdf/2104.08836.pdf) that the RE F1-score improves greatly with multi-task fine-tuning on multiple languages.

I want to replicate those results. Kindly help.

an1018 commented 1 year ago

"Anthor" means any other language. For example, we can train with the zh and de datasets; using two languages helps us further locate the problem.

ChidanandKumarVimaan commented 1 year ago

Sure, I will do the experiment and report back.

chowkamlee81 commented 1 year ago

@an1018 Crashes occur only with the pt and it languages. Both languages produce the same traceback, shown below.

Traceback (most recent call last):
  File "tools/train.py", line 208, in <module>
    main(config, device, logger, vdl_writer)
  File "tools/train.py", line 183, in main
    amp_level, amp_custom_black_list)
  File "/home/Desktop/spaceXWkspc/PaddleOCR-release-2.6/tools/program.py", line 290, in train
    preds = model(batch)
  File "/home/Desktop/anaconda3/envs/spaceX/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "/home/Desktop/anaconda3/envs/spaceX/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "/home/Desktop/spaceXWkspc/PaddleOCR-release-2.6/ppocr/modeling/architectures/base_model.py", line 86, in forward
    x = self.backbone(x)
  File "/home/Desktop/anaconda3/envs/spaceX/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "/home/Desktop/anaconda3/envs/spaceX/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "/home/Desktop/spaceXWkspc/PaddleOCR-release-2.6/ppocr/modeling/backbones/vqa_layoutlm.py", line 237, in forward
    relations=relations)
  File "/home/Desktop/anaconda3/envs/spaceX/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "/home/Desktop/anaconda3/envs/spaceX/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "/home/Desktop/anaconda3/envs/spaceX/lib/python3.7/site-packages/paddlenlp/transformers/layoutxlm/modeling.py", line 1559, in forward
    relations)
  File "/home/Desktop/anaconda3/envs/spaceX/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "/home/Desktop/anaconda3/envs/spaceX/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "/home/Desktop/anaconda3/envs/spaceX/lib/python3.7/site-packages/paddlenlp/transformers/layoutxlm/modeling.py", line 1425, in forward
    relations, entities = self.build_relation(relations, entities)
  File "/home/Desktop/anaconda3/envs/spaceX/lib/python3.7/site-packages/paddlenlp/transformers/layoutxlm/modeling.py", line 1360, in build_relation
    all_possible_relations1, all_possible_relations2),
  File "/home/Desktop/anaconda3/envs/spaceX/lib/python3.7/site-packages/paddle/tensor/creation.py", line 781, in meshgrid
    out = _C_ops.meshgrid(list(args), num)
ValueError: (InvalidArgument) All Inputs of Meshgrid OP are Empty! (at /paddle/paddle/fluid/operators/meshgrid_op.cc:50) [operator < meshgrid > error]

chowkamlee81 commented 1 year ago

@an1018 Kindly suggest

an1018 commented 1 year ago

Sorry for the late reply. You can check the format of the pt/it images and annotations; are there any bad cases?
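One way to look for such bad cases is to scan the converted label files for samples that cannot yield any relation pair. The sketch below makes several assumptions that should be checked against the actual data: each line is an image path, a tab, and a JSON list of annotations; each annotation has a `label` field; and the two sides of a relation are labeled `question` and `answer`. The function name `find_bad_samples` is hypothetical.

```python
import json
from typing import Iterable, List, Tuple


def find_bad_samples(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (image_path, reason) for samples that look unusable for RE.

    Assumes the line format "<image_path>\t<JSON list of annotations>".
    """
    bad = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        image_path, _, payload = line.partition("\t")
        try:
            annotations = json.loads(payload)
        except json.JSONDecodeError:
            bad.append((image_path, "annotation is not valid JSON"))
            continue
        if not isinstance(annotations, list):
            annotations = []
        labels = [a.get("label") for a in annotations if isinstance(a, dict)]
        # Without at least one entity on each side, no pair can be formed.
        if "question" not in labels or "answer" not in labels:
            bad.append((image_path, "no question/answer pair possible"))
    return bad
```

Running it over the pt and it label files, for example with `find_bad_samples(open(path, encoding="utf-8"))`, would list the samples worth inspecting by hand. The comparison is deliberately case-sensitive, so labels written in a different case are reported too.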

ynjang commented 6 months ago

I resolved the problem by using lowercase for the labels.
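If the fix above refers to the `label` field of each annotation, a conversion along the following lines could be applied to every line of a label file. This is only a sketch under that assumption, and it also assumes the line format of an image path, a tab, and a JSON list of annotation objects; `lowercase_labels` is a hypothetical helper, not part of PaddleOCR.

```python
import json


def lowercase_labels(line: str) -> str:
    """Return the same label line with every annotation's `label` lowercased."""
    image_path, sep, payload = line.rstrip("\n").partition("\t")
    annotations = json.loads(payload)
    for ann in annotations:
        label = ann.get("label")
        if isinstance(label, str):
            ann["label"] = label.lower()
    # ensure_ascii=False keeps non-ASCII transcriptions readable in the output.
    return image_path + sep + json.dumps(annotations, ensure_ascii=False)
```

Writing the converted lines to a new file, rather than overwriting the original, keeps the source annotations available for comparison.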

PaddlePaddle / PaddleOCR

KIE : Relation Extraction module in KIE Error : Floating point exception(segmentation dumped) #8714