open-mmlab / mmocr

OpenMMLab Text Detection, Recognition and Understanding Toolbox
https://mmocr.readthedocs.io/en/dev-1.x/
Apache License 2.0

SVTR reproduction issues #1682

Open Topdu opened 1 year ago

Topdu commented 1 year ago

Model/Dataset/Scheduler description

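Many thanks to MMOCR for including SVTR! The current reproduction results fall noticeably short of the paper, mainly because of the following issues:

1. Dataset: SVTR uses the same dataset as ABINet.
2. Data augmentation: the augmentation used in MMOCR's SVTR reproduction differs substantially from the augmentation used in the original paper. This is the main cause of the large gap in results.
3. Learning rate and batch size: SVTR is trained on 4 GPUs by default, with a per-GPU batch size of 512, for a total batch size of 2048. The corresponding learning rate is 0.0005.
4. Optimizer weight decay: the original SVTR training code in PaddleOCR uses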

  no_weight_decay_name: norm pos_embed
  one_dim_param_no_weight_decay: true

The above are the weight decay parameter settings for the AdamW optimizer.
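For readers porting these two options to another framework, here is a minimal sketch of the grouping they appear to describe. It rests on two assumptions: that `no_weight_decay_name` excludes every parameter whose name contains one of the listed substrings, and that `one_dim_param_no_weight_decay` excludes every one-dimensional parameter (biases, norm scales). The function name, the example parameter names and shapes, and the `weight_decay` value are all hypothetical, not taken from PaddleOCR or MMOCR.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def split_weight_decay_groups(
    named_shapes: Iterable[Tuple[str, Sequence[int]]],
    weight_decay: float,
    no_decay_keywords: Sequence[str] = ("norm", "pos_embed"),
    one_dim_no_decay: bool = True,
) -> List[Dict]:
    """Split parameter names into a decayed group and a non-decayed group.

    Assumed semantics (not confirmed by the thread):
    - a parameter is exempt if its name contains any keyword, or
    - if one_dim_no_decay is set and the parameter has exactly one dimension.
    """
    decay: List[str] = []
    no_decay: List[str] = []
    for name, shape in named_shapes:
        exempt_by_name = any(key in name for key in no_decay_keywords)
        exempt_by_shape = one_dim_no_decay and len(shape) == 1
        if exempt_by_name or exempt_by_shape:
            no_decay.append(name)
        else:
            decay.append(name)
    return [
        {"params": decay, "weight_decay": weight_decay},
        {"params": no_decay, "weight_decay": 0.0},
    ]


# Hypothetical parameter names and shapes, for illustration only.
example_params = [
    ("blocks.0.attn.qkv.weight", (576, 192)),
    ("blocks.0.attn.qkv.bias", (576,)),
    ("blocks.0.norm1.weight", (192,)),
    ("pos_embed", (1, 200, 192)),
]

# weight_decay here is a placeholder value, not one stated in this issue.
groups = split_weight_decay_groups(example_params, weight_decay=0.05)
for group in groups:
    print(group["weight_decay"], group["params"])
```

In a PyTorch-style setup, the `(name, shape)` pairs would come from the model's named parameters, and the two groups would be passed to the optimizer as per-parameter options.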

For any other SVTR training details, feel free to open a new issue in PaddleOCR to discuss.

Open source status

Provide useful links for the implementation

No response

mm-assistant[bot] commented 1 year ago

We recommend using English or English & Chinese for issues so that we could have broader discussion.

gaotongxiao commented 1 year ago

Thanks for kindly sharing the insights!

In fact, we have trained SVTR from scratch using PaddleOCR's original implementation as well as the official configs (https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_en/algorithm_rec_svtr_en.md), but the results are similar to those in MMOCR. We believe that the data augmentation configs are different from the settings used in the paper. Does PaddleOCR have any plan to release the original training config to the community? It would help other researchers learn more from this great paper. Thanks!