PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://paddlepaddle.github.io/PaddleOCR/
Apache License 2.0

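ch_PP-OCRv3_rec_distillation.yml: how do I change this config file to use ResNet34? The original config uses MobileNet and trains fine, but I don't know how to switch it to ResNet34. My own modification raises an error #9899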

Closed nissansz closed 1 year ago

nissansz commented 1 year ago

Please provide the following information to quickly locate the problem

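How do I change the ch_PP-OCRv3_rec_distillation.yml config file to use ResNet34? The original config uses MobileNet and trains fine, but I don't know how to switch it to ResNet34. My own modification raises an error.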

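If I use the distillation yml, do I no longer need to set up CombinedLoss separately? CombinedLoss needs center loss and train_center.pkl, but the exported file is only 1 KB. Does that mean the export failed?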
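One way to tell whether a 1 KB export is actually empty is to load it and count its entries. The sketch below is a generic pickle inspection, not part of PaddleOCR: the helper name `describe_pickle` is hypothetical, and it makes no assumption about how train_center.pkl is laid out beyond it being a pickled Python object. Only unpickle files you created yourself, since unpickling untrusted data can execute code.

```python
import os
import pickle
import tempfile


def describe_pickle(path):
    """Report file size, top-level type and entry count of a pickle file."""
    size_in_bytes = os.path.getsize(path)
    with open(path, "rb") as f:
        obj = pickle.load(f)
    try:
        entries = len(obj)
    except TypeError:
        # The pickled object has no length (e.g. a number).
        entries = None
    return {"bytes": size_in_bytes, "type": type(obj).__name__, "entries": entries}


# Demo on a stand-in file holding an empty dict; point this at the real
# train_center.pkl to inspect an actual export.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_center.pkl")
    with open(demo_path, "wb") as f:
        pickle.dump({}, f)
    print(describe_pickle(demo_path))
```

An entry count of 0, or far fewer entries than the character dictionary has characters, would suggest the export did not collect any centers.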

andyjiang1116 commented 1 year ago

What is the error message?

nissansz commented 1 year ago

It looks like the loss and similar settings need to change too, not just the backbone.

nissansz commented 1 year ago

Traceback (most recent call last):
  File "../input/paddle251/PaddleOCR-release-2.5/tools/train.py", line 191, in <module>
    main(config, device, logger, vdl_writer)
  File "../input/paddle251/PaddleOCR-release-2.5/tools/train.py", line 166, in main
    eval_class, pre_best_model_dict, logger, vdl_writer, scaler)
  File "/kaggle/input/paddle251/PaddleOCR-release-2.5/tools/program.py", line 268, in train
    loss = loss_class(preds, batch)
  File "/opt/conda/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "/kaggle/input/paddle251/PaddleOCR-release-2.5/ppocr/losses/combined_loss.py", line 55, in forward
    loss = loss_func(input, batch, **kargs)
  File "/opt/conda/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "/kaggle/input/paddle251/PaddleOCR-release-2.5/ppocr/losses/distillation_loss.py", line 107, in forward
    out2[self.dis_head])
  File "/kaggle/input/paddle251/PaddleOCR-release-2.5/ppocr/losses/basic_loss.py", line 116, in forward
    self._kldiv(log_out1, out2) + self._kldiv(log_out2, out1)) / 2.0
  File "/kaggle/input/paddle251/PaddleOCR-release-2.5/ppocr/losses/basic_loss.py", line 102, in _kldiv
    loss = target * (paddle.log(target + eps) - x)
  File "/opt/conda/lib/python3.7/site-packages/paddle/fluid/dygraph/math_op_patch.py", line 299, in __impl__
    return math_op(self, other_var, 'axis', axis)
ValueError: (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [16, 80, 63466] and the shape of Y = [16, 40, 63466]. Received [80] in X is not equal to [40] in Y at i:1.
  [Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at /paddle/paddle/phi/kernels/funcs/common_shape.h:84)
  [operator < elementwise_sub > error]
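The hint in the traceback spells out the broadcasting rule that failed: on every axis the two sizes must be equal, or one of them must be at most 1. Here is a minimal sketch of that check in plain Python, applied to the two shapes from the error. The helper name `first_broadcast_mismatch` is hypothetical (it is not a Paddle API), and the sketch assumes both shapes have the same rank.

```python
def first_broadcast_mismatch(x_shape, y_shape):
    """Return the first axis where two same-rank shapes cannot broadcast, or None."""
    if len(x_shape) != len(y_shape):
        raise ValueError("this sketch only handles shapes of equal rank")
    for axis, (x_dim, y_dim) in enumerate(zip(x_shape, y_shape)):
        # Same rule as the hint: sizes match, or one side is <= 1.
        if not (x_dim == y_dim or x_dim <= 1 or y_dim <= 1):
            return axis
    return None


# Shapes X and Y copied from the error message.
x_shape = [16, 80, 63466]
y_shape = [16, 40, 63466]
print(first_broadcast_mismatch(x_shape, y_shape))  # prints 1
```

Axis 1 is the one reported as `i:1` in the error, so the two inputs to the KL-divergence term disagree only on that axis (80 versus 40).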

andyjiang1116 commented 1 year ago

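This happens because the feature dimension output by the backbone no longer matches the original: the original should be 40, while ResNet34 gives 80. You can add a pooling layer at the end of the backbone to keep them consistent.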
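In terms of the shapes in the error, this means halving axis 1 from 80 steps to 40. The idea can be sketched in plain Python by averaging adjacent pairs of steps. The function name `avg_pool_time` and the kernel and stride of 2 are assumptions for illustration; in PaddleOCR the equivalent would be a pooling layer added in the backbone source, which this sketch does not reproduce.

```python
def avg_pool_time(sequence, kernel=2, stride=2):
    """Average-pool a list of feature vectors along the time axis."""
    pooled = []
    for start in range(0, len(sequence) - kernel + 1, stride):
        window = sequence[start:start + kernel]
        # Average each feature channel across the time steps in the window.
        pooled.append([sum(channel) / kernel for channel in zip(*window)])
    return pooled


# 80 time steps with 4 feature channels each (toy values).
features = [[float(t)] * 4 for t in range(80)]
pooled = avg_pool_time(features)
print(len(features), "->", len(pooled))  # prints 80 -> 40
```

With kernel 2 and stride 2 the number of steps is halved while the channel count is unchanged, which is what the distillation loss needs for the two outputs to line up.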

nissansz commented 1 year ago

I don't know how to make that change. Could you make it for me and send a yml file that runs so I can try it?

andyjiang1116 commented 1 year ago

This requires changing the source code. What is your motivation for replacing the backbone?

nissansz commented 1 year ago

Because training with the rec_r34_vd_none_bilstm_ctc.yml config is much faster than with MobileNet, and distillation seems to improve accuracy, I want to try a rec_r34_Distillation setup.

Also, does VD improve R34 accuracy further? By 1-2%?

andyjiang1116 commented 1 year ago

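The V3 models are lightweight models, and changing the architecture directly is not guaranteed to improve results. If you need a large model, try the SVTR model: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_ch/algorithm_rec_svtr.md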

nissansz commented 1 year ago

Which yml config file can I use to train that? Can it also use mutual supervision, like distillation does, to improve accuracy?

nissansz commented 1 year ago

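If I use CombinedLoss on top of the original rec_r34_vd_none_bilstm_ctc.yml, would that also give the accuracy gain from mutual supervision? I generated train_center.pkl myself and added center loss, but it doesn't seem to work for training.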

andyjiang1116 commented 1 year ago

Distillation is mainly for improving the accuracy of small models. If all you want is high accuracy, try the large SVTR model.

nissansz commented 1 year ago

Can the ONNX model exported from SVTR be used the same way as the ResNet34 one? Is training much slower?

andyjiang1116 commented 1 year ago

You can give it a try. The large model is slower but more accurate. See the document linked above.

nissansz commented 1 year ago

OK, thanks. I'll try it. Will a later source update support ResNet34 distillation?

nissansz commented 1 year ago

Does rec_svtr_large_10local_11global_stn_ch.yml not need distillation? Or CombinedLoss?

andyjiang1116 commented 1 year ago

I recommend simply using the default official configs. Custom configs need to be modified and adapted on your side.

nissansz commented 1 year ago

With the official config rec_r34_vd_none_bilstm_ctc.yml and no source changes, is there any way to train with this kind of supervision?

nissansz commented 1 year ago

I can't train with rec_svtr_large_10local_11global_stn_ch.yml. I changed the config file to use Simple_Dataset.py, as I had done before, and it raised an error.

[2023/05/11 09:22:25] ppocr INFO: train with paddle 2.3.2 and device Place(gpu:0)
Traceback (most recent call last):
  File "../input/paddle251/PaddleOCR-release-2.5/tools/train.py", line 191, in <module>
    main(config, device, logger, vdl_writer)
  File "../input/paddle251/PaddleOCR-release-2.5/tools/train.py", line 52, in main
    train_dataloader = build_dataloader(config, 'Train', device, logger)
  File "/kaggle/input/paddle251/PaddleOCR-release-2.5/ppocr/data/__init__.py", line 61, in build_dataloader
    'DataSet only support {}'.format(support_dict))
AssertionError: DataSet only support ['SimpleDataSet', 'LMDBDataSet', 'PGDataSet', 'PubTabDataSet']
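The assertion means the dataset name in the config is not one of the four names this release accepts. A small pre-flight check along these lines can catch that before training starts. It is only a sketch: the supported list is copied from the error message above, the `Train` / `dataset` / `name` nesting is assumed to match the yml layout, the function `check_dataset_name` is hypothetical, and the value `Simple_Dataset` is a made-up example of a misspelled name.

```python
SUPPORTED_DATASETS = ["SimpleDataSet", "LMDBDataSet", "PGDataSet", "PubTabDataSet"]


def check_dataset_name(config, mode="Train"):
    """Return the dataset name if supported, else raise with a spelling hint."""
    name = config[mode]["dataset"]["name"]
    if name in SUPPORTED_DATASETS:
        return name

    def normalize(text):
        # Ignore underscores and letter case when looking for near misses.
        return text.replace("_", "").lower()

    hints = [s for s in SUPPORTED_DATASETS if normalize(s) == normalize(name)]
    message = "DataSet only support {}, got {!r}".format(SUPPORTED_DATASETS, name)
    if hints:
        message += "; did you mean {!r}?".format(hints[0])
    raise ValueError(message)


# Hypothetical config fragment with a misspelled dataset name.
config = {"Train": {"dataset": {"name": "Simple_Dataset"}}}
try:
    check_dataset_name(config)
except ValueError as err:
    print(err)
```

The name is matched exactly, including capitalization, so a value that differs only by an underscore or letter case is still rejected.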

nissansz commented 1 year ago

Could you modify the source and send me a ResNet34 distillation config file together with the .py files that support it?

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.