harshsummit commented 1 year ago

Please provide the following information to quickly locate the problem

System Environment: Ubuntu 20.04
Version: Paddle: 2.4.1, PaddleOCR: 2.6 release
Related components:
Command Code: python tools/train.py -c configs/kie/vi_layoutxlm/re_layoutxlm_funsd.yml
Complete Error Message: TypeError: list indices must be integers or slices, not tuple

The error occurs during postprocessing, in the following piece of code in the vqa_token_re_layoutlm_postprocess.py file.

harshsummit commented 1 year ago

The yml config used is:

```yaml
Global:
  use_gpu: False
  epoch_num: &epoch_num 200
  log_smooth_window: 10
  print_batch_step: 10
  save_model_dir: ./output/re_layoutxlm_funsd
  save_epoch_step: 2000
  # evaluation is run every 10 iterations after the 0th iteration
  eval_batch_step: [ 0, 19 ]
  cal_metric_during_train: False
  save_inference_dir:
  use_visualdl: False
  seed: 2022
  infer_img: train_data/FUNSD/testing_data/images/83624198.png
  save_res_path: ./output/re_layoutxlm_funsd/res/

Architecture:
  model_type: kie
  algorithm: &algorithm "LayoutXLM"
  Transform:
  Backbone:
    name: LayoutXLMForRe
    pretrained: True
    checkpoints:

Loss:
  name: LossFromOutput
  key: loss
  reduction: mean

Optimizer:
  name: AdamW
  beta1: 0.9
  beta2: 0.999
  clip_norm: 10
  lr:
    learning_rate: 0.00005
    warmup_epoch: 10
  regularizer:
    name: L2
    factor: 0.00000

PostProcess:
  name: VQAReTokenLayoutLMPostProcess

Metric:
  name: VQAReTokenMetric
  main_indicator: hmean

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/FUNSD/training_data/images/
    label_file_list:
      - ./train_data/FUNSD/train.json
    ratio_list: [ 1.0 ]
    transforms:
      - DecodeImage: # load image
          img_mode: RGB
          channel_first: False
      - VQATokenLabelEncode: # Class handling label
          contains_re: True
          algorithm: *algorithm
          class_path: &class_path ./train_data/FUNSD/class_list.txt
          use_textline_bbox_info: &use_textline_bbox_info True
      - VQATokenPad:
          max_seq_len: &max_seq_len 512
          return_attention_mask: True
      - VQAReTokenRelation:
      - VQAReTokenChunk:
          max_seq_len: *max_seq_len
      - Resize:
          size: [224,224]
      - NormalizeImage:
          scale: 1
          mean: [ 123.675, 116.28, 103.53 ]
          std: [ 58.395, 57.12, 57.375 ]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          # dataloader will return list in this order
          keep_keys: [ 'input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'image', 'entities', 'relations']
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 2
    num_workers: 4
    collate_fn: ListCollator

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/FUNSD/testing_data/images/
    label_file_list:
      - ./train_data/FUNSD/test.json
    transforms:
      - DecodeImage: # load image
          img_mode: RGB
          channel_first: False
      - VQATokenLabelEncode: # Class handling label
          contains_re: True
          algorithm: *algorithm
          class_path: ./train_data/FUNSD/class_list.txt
          use_textline_bbox_info: *use_textline_bbox_info
      - VQATokenPad:
          max_seq_len: *max_seq_len
          return_attention_mask: True
      - VQAReTokenRelation:
      - VQAReTokenChunk:
          max_seq_len: *max_seq_len
      - Resize:
          size: [224,224]
      - NormalizeImage:
          scale: 1
          mean: [ 123.675, 116.28, 103.53 ]
          std: [ 58.395, 57.12, 57.375 ]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          # dataloader will return list in this order
          keep_keys: [ 'input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'image', 'entities', 'relations']
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 8
    num_workers: 4
    collate_fn: ListCollator
```

andyjiang1116 commented 1 year ago

please try to update paddlenlp package

harshsummit commented 1 year ago

> please try to update paddlenlp package

Tried that too. The problem is at line 61 of the file, as you can see in the screenshot.

andyjiang1116 commented 1 year ago

which version did you use?

harshsummit commented 1 year ago

tried 2.3.0, 2.5.0 and 2.4.0 as well

andyjiang1116 commented 1 year ago

try 2.3.1?

harshsummit commented 1 year ago

It doesn't seem to be a paddlenlp error. If you look, the error comes from accessing the pred_relation variable, which is a numpy array: the code indexes it as pred_relation[0,0,0] + 1, and that does not give the required result for slicing in Python.
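A minimal sketch of the failure mode described here, using made-up nested values rather than real model output. A numpy array accepts a tuple index such as `[0, 0, 0]`, while a plain Python list does not; chained indexing works on both.

```python
# Made-up stand-in values; the real pred_relation comes from the model.
as_list = [[[2]], [[7]], [[9]]]

# Chained indexing is valid for nested lists (and for numpy arrays too).
chained_result = as_list[0][0][0] + 1

# Tuple indexing on a plain list raises the TypeError from the report.
try:
    as_list[0, 0, 0]
    list_error = None
except TypeError as exc:
    list_error = str(exc)

print(chained_result)
print(list_error)
```

If the object reaching this line is a list instead of an array, the message matches the one in the report.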

andyjiang1116 commented 1 year ago

I tried 2.4.4 and it works well. LayoutXLM depends on paddlenlp.

harshsummit commented 1 year ago

Tried it; the error is with the numpy version.

harshsummit commented 1 year ago

Array indexing works with numpy 1.21.5, but then the polygon dependency gives an error because it requires 1.23.5.
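Since several package versions are in play in this thread, a small sketch like the following can report what is actually installed. The distribution names in the loop are assumptions (for example, a GPU build of Paddle is usually published under a different name), and `installed_version` is a hypothetical helper, not part of PaddleOCR.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Assumed distribution names; adjust to match your environment.
for name in ("paddlepaddle", "paddlenlp", "numpy"):
    print(name, installed_version(name))
```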

harshsummit commented 1 year ago

I guess the file needs new code that works with newer numpy versions.

harshsummit commented 1 year ago

@andyjpaddle do you have any idea what this .numpy() function does?

I suspect it is not converting the array into a numpy array, and that's where the error occurs.

andyjiang1116 commented 1 year ago

> what does this .numpy() function do

This function converts the tensor into a numpy array, so it can be used in subsequent calculations.

harshsummit commented 1 year ago

> what does this .numpy() function do
>
> this function converts the tensor into a numpy array, so it can be used in subsequent calculations

@andyjpaddle I logged the types: preds is a dict, and preds_relation is a list whose entries are all empty -> [[] [] [] [] []]. So it is not recognized as a tensor and cannot be converted to numpy.
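A hedged sketch of a defensive check for the empty entries described above. `first_value` is a hypothetical helper, not PaddleOCR code; it reads the element at position `[0][0][0]` and returns `None` when any level is empty or not indexable, so the caller can skip that sample instead of crashing.

```python
def first_value(pred_relation):
    """Return pred_relation[0][0][0], or None if any level is empty or not indexable."""
    value = pred_relation
    for _ in range(3):
        try:
            value = value[0]
        except (IndexError, TypeError, KeyError):
            return None
    return value


# Same shape as the logged output: five empty entries.
logged = [[], [], [], [], []]
print([first_value(entry) for entry in logged])
```

This only avoids the exception. It does not explain why the entries are empty, which is the underlying question in this thread.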

andyjiang1116 commented 1 year ago

Did you test the official model? I tested it and it works well.

harshsummit commented 1 year ago

The official model is trained on the XFUND dataset. I was trying the FUNSD dataset, since there is no model available for it.

harshsummit commented 1 year ago

Do you have any pre trained FUNSD KIE model available?

WenmuZhou commented 1 year ago

We only released the model trained on XFUND. For FUNSD, did you convert the dataset to the XFUND format?

harshsummit commented 1 year ago

yes i did

WenmuZhou commented 1 year ago

You can try paddlenlp>=2.4.1. In versions after 2.4.1, we made relatively large changes to the code to support dynamic-to-static conversion.

harshsummit commented 1 year ago

Tried that too @WenmuZhou

harshsummit commented 1 year ago

@WenmuZhou @andyjpaddle after compiling the latest version of PaddleNLP, it gives Floating Point Exception (Core Dumped)

harshsummit commented 1 year ago

I really think there is a problem with training the KIE models on the FUNSD dataset. Please look into it; there are open issues about the FUNSD dataset that have been unresolved for the past year.

I don't think anybody has so far successfully trained a KIE model on the FUNSD dataset and then used it for inference.

harshsummit commented 1 year ago

@andyjpaddle @WenmuZhou any update???

NaumanHSA commented 1 year ago

@andyjpaddle @WenmuZhou Have you guys any update regarding training KIE on FUNSD dataset? I'm facing the "Core Dumped" exception. Also, can you guys tell me how much GPU memory is required for KIE training?

harshsummit commented 1 year ago

Same here @NaumanHSA, there is some problem with the code. I've been trying for 2 weeks to get this working.

NaumanHSA commented 1 year ago

@harshsummit I tried training the KIE model on the XFUND dataset, and it started normally. I think there's an issue with the FUNSD dataset, but I don't know how to track it down. About the memory question: with a desktop RTX 4070 Ti, CUDA 11.7, cuDNN 8.4, and Ubuntu 20.04, I was able to start KIE training on the XFUND dataset.

harshsummit commented 1 year ago

@NaumanHSA exactly the same here: training on XFUND works, but FUNSD throws the error.

NaumanHSA commented 1 year ago

@harshsummit I tried a custom dataset with very few images, and training started as normal. Then I verified the FUNSD dataset by reading the labels and rendering the boxes and text with opencv, to check for incorrect box annotations, but all of the images were labeled correctly.

In my second test, I found that some special characters in the dataset were encoded as utf8. I thought that might be the issue (I was not sure, but I tried removing all those characters from the dataset). I still got the same error. I think the problem is in the paddleNLP package.
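A rough sketch of the kind of special-character check described above, assuming the label text has already been read into Python strings. `non_ascii_chars` is a hypothetical helper name and the sample string is made up.

```python
def non_ascii_chars(text):
    """Return the distinct non-ASCII characters in text, sorted by code point."""
    return sorted({ch for ch in text if ord(ch) > 127})


# Made-up sample; in practice, pass each transcription from the label file.
sample = "café №5"
for ch in non_ascii_chars(sample):
    print(repr(ch), hex(ord(ch)))
```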

munish0838 commented 1 year ago

Any updates? @harshsummit @NaumanHSA

NaumanHSA commented 1 year ago

@munish0838 so far no luck with FUNSD dataset. I've labelled a custom dataset of 50 images. Now I'm facing issues with training. Its accuracy is always 0 lol.

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

PaddlePaddle / PaddleOCR

vqa_token_re_layoutlm_postprocess error #8995
