PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0

Error when changing the max_len=1024 parameter value during table analysis training #3700

Closed 965662766 closed 2 years ago

965662766 commented 3 years ago


D:\WordInstaller\Anaconda3\envs\PaddleOCR\python.exe "D:\WordInstaller\PyCharm 2020.1.3\plugins\python\helpers\pydev\pydevd.py" --multiproc --qt-support=auto --client 127.0.0.1 --port 5141 --file E:/Project/simple/PaddleOCR/tools/train2.py --config=configs/table/table_mv3_3.yml
pydev debugger: process 67916 is connecting

Connected to pydev debugger (build 201.8538.36)
[2021/08/16 19:24:01] root INFO: Architecture :
[2021/08/16 19:24:01] root INFO: Backbone :
[2021/08/16 19:24:01] root INFO: disable_se : True
[2021/08/16 19:24:01] root INFO: model_name : small
[2021/08/16 19:24:01] root INFO: name : MobileNetV3
[2021/08/16 19:24:01] root INFO: scale : 1.0
[2021/08/16 19:24:01] root INFO: Head :
[2021/08/16 19:24:01] root INFO: hidden_size : 256
[2021/08/16 19:24:01] root INFO: l2_decay : 1e-05
[2021/08/16 19:24:01] root INFO: loc_type : 2
[2021/08/16 19:24:01] root INFO: name : TableAttentionHead
[2021/08/16 19:24:01] root INFO: algorithm : TableAttn
[2021/08/16 19:24:01] root INFO: model_type : table
[2021/08/16 19:24:01] root INFO: Eval :
[2021/08/16 19:24:01] root INFO: dataset :
[2021/08/16 19:24:01] root INFO: data_dir : G:/dataset/pubtabnet/pubtabnet/train/
[2021/08/16 19:24:01] root INFO: label_file_path : G:/dataset/pubtabnet/pubtabnet/PubTabNet_val_2.0.0.jsonl
[2021/08/16 19:24:01] root INFO: name : PubTabDataSet
[2021/08/16 19:24:01] root INFO: transforms :
[2021/08/16 19:24:01] root INFO: DecodeImage :
[2021/08/16 19:24:01] root INFO: channel_first : False
[2021/08/16 19:24:01] root INFO: img_mode : BGR
[2021/08/16 19:24:01] root INFO: ResizeTableImage :
[2021/08/16 19:24:01] root INFO: max_len : 1024
[2021/08/16 19:24:01] root INFO: TableLabelEncode : None
[2021/08/16 19:24:01] root INFO: NormalizeImage :
[2021/08/16 19:24:01] root INFO: mean : [0.485, 0.456, 0.406]
[2021/08/16 19:24:01] root INFO: order : hwc
[2021/08/16 19:24:01] root INFO: scale : 1./255.
[2021/08/16 19:24:01] root INFO: std : [0.229, 0.224, 0.225]
[2021/08/16 19:24:01] root INFO: PaddingTableImage : None
[2021/08/16 19:24:01] root INFO: ToCHWImage : None
[2021/08/16 19:24:01] root INFO: KeepKeys :
[2021/08/16 19:24:01] root INFO: keep_keys : ['image', 'structure', 'bbox_list', 'sp_tokens', 'bbox_list_mask']
[2021/08/16 19:24:01] root INFO: loader :
[2021/08/16 19:24:01] root INFO: batch_size_per_card : 1
[2021/08/16 19:24:01] root INFO: drop_last : False
[2021/08/16 19:24:01] root INFO: num_workers : 1
[2021/08/16 19:24:01] root INFO: shuffle : False
[2021/08/16 19:24:01] root INFO: Global :
[2021/08/16 19:24:01] root INFO: cal_metric_during_train : True
[2021/08/16 19:24:01] root INFO: character_dict_path : ppocr/utils/dict/table_structure_dict.txt
[2021/08/16 19:24:01] root INFO: character_type : en
[2021/08/16 19:24:01] root INFO: checkpoints : None
[2021/08/16 19:24:01] root INFO: debug : False
[2021/08/16 19:24:01] root INFO: distributed : False
[2021/08/16 19:24:01] root INFO: epoch_num : 50
[2021/08/16 19:24:01] root INFO: eval_batch_step : [0, 400]
[2021/08/16 19:24:01] root INFO: infer_img : doc/imgs_words/ch/word_1.jpg
[2021/08/16 19:24:01] root INFO: infer_mode : False
[2021/08/16 19:24:01] root INFO: log_smooth_window : 20
[2021/08/16 19:24:01] root INFO: max_cell_num : 500
[2021/08/16 19:24:01] root INFO: max_elem_length : 500
[2021/08/16 19:24:01] root INFO: max_text_length : 100
[2021/08/16 19:24:01] root INFO: pretrained_model : None
[2021/08/16 19:24:01] root INFO: print_batch_step : 5
[2021/08/16 19:24:01] root INFO: process_cut_num : 0
[2021/08/16 19:24:01] root INFO: process_total_num : 0
[2021/08/16 19:24:01] root INFO: save_epoch_step : 5
[2021/08/16 19:24:01] root INFO: save_inference_dir : None
[2021/08/16 19:24:01] root INFO: save_model_dir : ./output/table_mv3/
[2021/08/16 19:24:01] root INFO: use_gpu : False
[2021/08/16 19:24:01] root INFO: use_visualdl : False
[2021/08/16 19:24:01] root INFO: Loss :
[2021/08/16 19:24:01] root INFO: loc_weight : 10000.0
[2021/08/16 19:24:01] root INFO: name : TableAttentionLoss
[2021/08/16 19:24:01] root INFO: structure_weight : 100.0
[2021/08/16 19:24:01] root INFO: Metric :
[2021/08/16 19:24:01] root INFO: main_indicator : acc
[2021/08/16 19:24:01] root INFO: name : TableMetric
[2021/08/16 19:24:01] root INFO: Optimizer :
[2021/08/16 19:24:01] root INFO: beta1 : 0.9
[2021/08/16 19:24:01] root INFO: beta2 : 0.999
[2021/08/16 19:24:01] root INFO: clip_norm : 5.0
[2021/08/16 19:24:01] root INFO: lr :
[2021/08/16 19:24:01] root INFO: learning_rate : 0.0001
[2021/08/16 19:24:01] root INFO: name : Adam
[2021/08/16 19:24:01] root INFO: regularizer :
[2021/08/16 19:24:01] root INFO: factor : 0.0
[2021/08/16 19:24:01] root INFO: name : L2
[2021/08/16 19:24:01] root INFO: PostProcess :
[2021/08/16 19:24:01] root INFO: name : TableLabelDecode
[2021/08/16 19:24:01] root INFO: Train :
[2021/08/16 19:24:01] root INFO: dataset :
[2021/08/16 19:24:01] root INFO: data_dir : G:/dataset/pubtabnet/pubtabnet/train/
[2021/08/16 19:24:01] root INFO: label_file_path : G:/dataset/pubtabnet/pubtabnet/PubTabNet_train_2.0.0.jsonl
[2021/08/16 19:24:01] root INFO: name : PubTabDataSet
[2021/08/16 19:24:01] root INFO: transforms :
[2021/08/16 19:24:01] root INFO: DecodeImage :
[2021/08/16 19:24:01] root INFO: channel_first : False
[2021/08/16 19:24:01] root INFO: img_mode : BGR
[2021/08/16 19:24:01] root INFO: ResizeTableImage :
[2021/08/16 19:24:01] root INFO: max_len : 1024
[2021/08/16 19:24:01] root INFO: TableLabelEncode : None
[2021/08/16 19:24:01] root INFO: NormalizeImage :
[2021/08/16 19:24:01] root INFO: mean : [0.485, 0.456, 0.406]
[2021/08/16 19:24:01] root INFO: order : hwc
[2021/08/16 19:24:01] root INFO: scale : 1./255.
[2021/08/16 19:24:01] root INFO: std : [0.229, 0.224, 0.225]
[2021/08/16 19:24:01] root INFO: PaddingTableImage : None
[2021/08/16 19:24:01] root INFO: ToCHWImage : None
[2021/08/16 19:24:01] root INFO: KeepKeys :
[2021/08/16 19:24:01] root INFO: keep_keys : ['image', 'structure', 'bbox_list', 'sp_tokens', 'bbox_list_mask']
[2021/08/16 19:24:01] root INFO: loader :
[2021/08/16 19:24:01] root INFO: batch_size_per_card : 2
[2021/08/16 19:24:01] root INFO: drop_last : True
[2021/08/16 19:24:01] root INFO: num_workers : 1
[2021/08/16 19:24:01] root INFO: shuffle : True
[2021/08/16 19:24:01] root INFO: train with paddle 2.1.2 and device CPUPlace
[2021/08/16 19:24:01] root INFO: Initialize indexs of datasets:G:/dataset/pubtabnet/pubtabnet/PubTabNet_train_2.0.0.jsonl
[2021/08/16 19:24:07] root INFO: Initialize indexs of datasets:G:/dataset/pubtabnet/pubtabnet/PubTabNet_val_2.0.0.jsonl
[2021/08/16 19:24:08] root INFO: train dataloader has 250388 iters
[2021/08/16 19:24:08] root INFO: valid dataloader has 9115 iters
[2021/08/16 19:24:08] root INFO: During the training process, after the 0th iteration, an evaluation is run every 400 iterations
[2021/08/16 19:24:08] root INFO: Initialize indexs of datasets:G:/dataset/pubtabnet/pubtabnet/PubTabNet_train_2.0.0.jsonl
Traceback (most recent call last):
  File "D:\WordInstaller\Anaconda3\envs\PaddleOCR\lib\contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "D:\WordInstaller\Anaconda3\envs\PaddleOCR\lib\site-packages\paddle\fluid\dygraph\base.py", line 79, in param_guard
    yield
  File "D:\WordInstaller\Anaconda3\envs\PaddleOCR\lib\site-packages\paddle\fluid\dygraph\layers.py", line 902, in __call__
    outputs = self.forward(*inputs, **kwargs)
  File "E:\Project\simple\PaddleOCR\ppocr\modeling\architectures\base_model.py", line 81, in forward
    x = self.head(x, targets=data)
  File "D:\WordInstaller\Anaconda3\envs\PaddleOCR\lib\site-packages\paddle\fluid\dygraph\layers.py", line 902, in __call__
    outputs = self.forward(*inputs, **kwargs)
  File "E:\Project\simple\PaddleOCR\ppocr\modeling\heads\table_att_head.py", line 85, in forward
    loc_fea = self.loc_fea_trans(loc_fea)
  File "D:\WordInstaller\Anaconda3\envs\PaddleOCR\lib\site-packages\paddle\fluid\dygraph\layers.py", line 902, in __call__
    outputs = self.forward(*inputs, **kwargs)
  File "D:\WordInstaller\Anaconda3\envs\PaddleOCR\lib\site-packages\paddle\nn\layer\common.py", line 129, in forward
    x=input, weight=self.weight, bias=self.bias, name=self.name)
  File "D:\WordInstaller\Anaconda3\envs\PaddleOCR\lib\site-packages\paddle\nn\functional\common.py", line 1451, in linear
    'transpose_Y', False, "alpha", 1)
ValueError: (InvalidArgument) The fisrt matrix width should be same as second matrix height,but received fisrt matrix width 1024, second matrix height 256
  [Hint: Expected dima.width == dimb.height, but received dima.width:1024 != dimb.height:256.] (at C:\home\workspace\Paddle_release\paddle/fluid/operators/math/blas_impl.h:1201)
  [operator < matmul > error]

Process finished with exit code 1
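
The ValueError above is a plain matrix-shape mismatch: the input to the linear layer has a last dimension of 1024, while the layer's weight expects 256. The sketch below illustrates that check in pure Python. The function name `linear_output_shape`, the batch size 2, the channel count 96, and the output width 501 are assumptions chosen for illustration, not values read from the PaddleOCR source.

```python
def linear_output_shape(input_shape, weight_shape):
    """Mimic the shape rule of a linear layer (x @ W).

    input_shape:  tuple such as (batch, channels, width)
    weight_shape: tuple (in_features, out_features)
    This is an illustrative helper, not a Paddle API.
    """
    in_width = input_shape[-1]
    in_features, out_features = weight_shape
    if in_width != in_features:
        raise ValueError(
            "first matrix width %d != second matrix height %d"
            % (in_width, in_features)
        )
    return tuple(input_shape[:-1]) + (out_features,)


# Assumed shapes: the flattened feature map has 32 * 32 = 1024 positions,
# while the layer was built with a hard-coded input width of 256.
feature_shape = (2, 96, 1024)

try:
    linear_output_shape(feature_shape, (256, 501))
except ValueError as exc:
    print("mismatch:", exc)

# With a matching input width the same call succeeds.
print(linear_output_shape(feature_shape, (1024, 501)))
```

The first call fails for the same reason as the log; the second shows that only the weight's first dimension has to change.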

cv-small-snails commented 3 years ago

You need to modify the source code for this; there is a hard-coded value in it.

965662766 commented 3 years ago

Hi, I have debugged the source code, but I am new to this area and do not know how to change it. Could you give me some pointers?

WenmuZhou commented 3 years ago

To change this size for training, you need to replace the 256 here with 1024/32*1024/32 for the value you set (that is, 32*32 = 1024). https://github.com/PaddlePaddle/PaddleOCR/blob/release%2F2.2/ppocr/modeling/heads/table_att_head.py#L49
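
The arithmetic in the reply above can be sketched as a small helper. It assumes the image is padded to a square of side max_len and that the backbone downsamples by a total factor of 32, as the reply's formula implies. The function name `loc_fea_in_features` is hypothetical, and rounding up for sizes that are not multiples of 32 is an assumption.

```python
import math


def loc_fea_in_features(max_len, total_stride=32):
    """Estimate the input width of the hard-coded linear layer.

    max_len:      the ResizeTableImage max_len from the config
    total_stride: assumed overall downsampling factor of the backbone
    Rounding up for non-multiples of total_stride is an assumption.
    """
    side = math.ceil(max_len / total_stride)  # feature-map side length
    return side * side                        # flattened H * W


# max_len=1024 gives 32 * 32 = 1024, the width reported in the error.
print(loc_fea_in_features(1024))
# A side of 16 gives 16 * 16 = 256, the value currently hard-coded.
print(loc_fea_in_features(512))
```

The value it returns for your max_len is what would replace the 256 at the linked line.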

paddle-bot-old[bot] commented 2 years ago

Since you haven't replied for more than 3 months, we have closed this issue/pr. If the problem is not solved or there is a follow-up one, please reopen it at any time and we will continue to follow up. It is recommended to pull and try the latest code first.