GXYM / TextBPN-Plus-Plus

Arbitrary Shape Text Detection via Boundary Transformer. Paper: https://arxiv.org/abs/2205.05320, accepted by IEEE Transactions on Multimedia (T-MM 2023).

CTW1500 reproduction problem #9

Open · AlphaNext opened this issue 1 year ago

AlphaNext commented 1 year ago

Thanks for your work and the code. Here is my reproduction process:

  • Model reproduced: Res50-1s
  • Training data: the 1000 CTW1500 training images, with no additional data
  • Training command: python3 -u train_textBPN_author.py --exp_name Ctw1500 --net resnet50 --scale 1 --max_epoch 660 --batch_size 12 --gpu 0 --input_size 640 --optim Adam --lr 0.001 --num_workers 8 --save_dir model/ctwr50_author
  • Best result from batch-testing checkpoints on the test set: AP: 0.6850, recall: 0.8106, precision: 0.8360, F-measure: 0.8231
  • Result reported in the paper: recall: 81.12, precision: 88.08, f-measure: 84.46

The reproduced result falls well short of the paper's. What might cause this, and what should I watch out for during reproduction? PS: the training environment can largely be ruled out, since testing the model provided on the homepage gives results very close to those reported in the paper.

Looking forward to your reply. Best regards.

GXYM commented 1 year ago

What test parameters did you use? The old version of CTW1500 has a lot of annotation noise. If training has not converged well (the final loss should be within 0.7, or even 0.6), try reducing the learning rate (e.g. to 0.0001) during training. At test time, the test size (640-1024) and the test parameters cls_threshold and dis_threshold need to be tuned appropriately. Your results show quite low detection precision, which is most likely caused by too many text instances sticking together, or by too much noise among the detections.
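
For illustration, a threshold sweep of the kind suggested here could be scripted as in the sketch below. It only reuses the eval_textBPN.py flags already quoted in this thread; the particular grid values and the choice of checkpoint 660 are assumptions, not recommended settings:

# Hypothetical sweep over the two test thresholds for a single checkpoint;
# the value grids below are illustrative, adjust them to your own runs.
for cls in 0.75 0.80 0.85; do
  for dis in 0.30 0.35 0.375 0.40; do
    echo "cls_threshold=$cls dis_threshold=$dis"
    python3 eval_textBPN.py --net resnet50 --scale 1 --exp_name Ctw1500 \
      --checkepoch 660 --test_size 640 1024 \
      --dis_threshold "$dis" --cls_threshold "$cls" \
      --gpu 0 --save_dir model/ctwr50_author
  done
done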

AlphaNext commented 1 year ago

  • Test script: python3 eval_textBPN.py --net resnet50 --scale 1 --exp_name Ctw1500 --checkepoch $i --test_size 640 1024 --dis_threshold 0.375 --cls_threshold 0.8 --gpu 0 --save_dir model/ctwr50_author
  • The three Dataset links posted on the homepage should be the new version of the data, right?
  • I tried adjusting cls_threshold and dis_threshold; the impact was small, and the results are still low.
  • When you say reduce the learning rate during training, do you mean setting it to 0.0001 from the start, or at some point partway through? If the latter, around which epoch?
  • Training log from the final epochs:
Epoch: 657 : LR = [0.00025418658283290005]
(0 / 84)  total_loss: 0.7954  cls_loss: 0.0888  distance loss: 0.0523  dir_loss: 0.1689  norm_loss: 0.0652  angle_loss: 0.1037  point_loss: 0.4392  energy_loss: 0.0463
(10 / 84)  total_loss: 0.7417  cls_loss: 0.0962  distance loss: 0.0635  dir_loss: 0.1864  norm_loss: 0.0697  angle_loss: 0.1168  point_loss: 0.3344  energy_loss: 0.0612
(20 / 84)  total_loss: 0.8262  cls_loss: 0.1208  distance loss: 0.1462  dir_loss: 0.2341  norm_loss: 0.0908  angle_loss: 0.1434  point_loss: 0.2591  energy_loss: 0.0660
(30 / 84)  total_loss: 1.1567  cls_loss: 0.1796  distance loss: 0.1425  dir_loss: 0.4487  norm_loss: 0.1507  angle_loss: 0.2980  point_loss: 0.3081  energy_loss: 0.0777
(40 / 84)  total_loss: 0.7023  cls_loss: 0.1005  distance loss: 0.0703  dir_loss: 0.2052  norm_loss: 0.0827  angle_loss: 0.1226  point_loss: 0.2428  energy_loss: 0.0835
(50 / 84)  total_loss: 0.8097  cls_loss: 0.1137  distance loss: 0.0639  dir_loss: 0.3087  norm_loss: 0.1542  angle_loss: 0.1545  point_loss: 0.2315  energy_loss: 0.0919
(60 / 84)  total_loss: 0.7268  cls_loss: 0.1478  distance loss: 0.0826  dir_loss: 0.1872  norm_loss: 0.0609  angle_loss: 0.1264  point_loss: 0.2714  energy_loss: 0.0377
(70 / 84)  total_loss: 0.8770  cls_loss: 0.1274  distance loss: 0.0813  dir_loss: 0.1966  norm_loss: 0.0506  angle_loss: 0.1461  point_loss: 0.4129  energy_loss: 0.0588
(80 / 84)  total_loss: 0.8095  cls_loss: 0.1638  distance loss: 0.0978  dir_loss: 0.2501  norm_loss: 0.0893  angle_loss: 0.1608  point_loss: 0.2512  energy_loss: 0.0466
Training Loss: 0.8728443916354861
Epoch: 658 : LR = [0.00025418658283290005]
(0 / 84)  total_loss: 0.8516  cls_loss: 0.1082  distance loss: 0.0855  dir_loss: 0.2301  norm_loss: 0.0997  angle_loss: 0.1304  point_loss: 0.3661  energy_loss: 0.0618
(10 / 84)  total_loss: 0.8733  cls_loss: 0.1215  distance loss: 0.0623  dir_loss: 0.2647  norm_loss: 0.1153  angle_loss: 0.1494  point_loss: 0.3486  energy_loss: 0.0762
(20 / 84)  total_loss: 0.6975  cls_loss: 0.0897  distance loss: 0.0683  dir_loss: 0.1933  norm_loss: 0.0847  angle_loss: 0.1087  point_loss: 0.2797  energy_loss: 0.0664
(30 / 84)  total_loss: 0.8202  cls_loss: 0.1280  distance loss: 0.1359  dir_loss: 0.2398  norm_loss: 0.0931  angle_loss: 0.1467  point_loss: 0.2452  energy_loss: 0.0713
(40 / 84)  total_loss: 0.9599  cls_loss: 0.1528  distance loss: 0.1663  dir_loss: 0.2443  norm_loss: 0.0711  angle_loss: 0.1733  point_loss: 0.3423  energy_loss: 0.0541
(50 / 84)  total_loss: 0.7622  cls_loss: 0.1147  distance loss: 0.0933  dir_loss: 0.2908  norm_loss: 0.1337  angle_loss: 0.1571  point_loss: 0.1853  energy_loss: 0.0780
(60 / 84)  total_loss: 0.7921  cls_loss: 0.1299  distance loss: 0.0622  dir_loss: 0.2284  norm_loss: 0.0830  angle_loss: 0.1454  point_loss: 0.3113  energy_loss: 0.0602
(70 / 84)  total_loss: 0.9626  cls_loss: 0.1319  distance loss: 0.0923  dir_loss: 0.2871  norm_loss: 0.1208  angle_loss: 0.1663  point_loss: 0.3565  energy_loss: 0.0949
(80 / 84)  total_loss: 0.9376  cls_loss: 0.1177  distance loss: 0.0967  dir_loss: 0.2468  norm_loss: 0.0814  angle_loss: 0.1654  point_loss: 0.4058  energy_loss: 0.0706
Training Loss: 0.8835705071687698
Epoch: 659 : LR = [0.00025418658283290005]
(0 / 84)  total_loss: 1.0344  cls_loss: 0.1422  distance loss: 0.1038  dir_loss: 0.3013  norm_loss: 0.1033  angle_loss: 0.1980  point_loss: 0.4069  energy_loss: 0.0801
(10 / 84)  total_loss: 0.6888  cls_loss: 0.0997  distance loss: 0.0626  dir_loss: 0.1791  norm_loss: 0.0722  angle_loss: 0.1069  point_loss: 0.2888  energy_loss: 0.0585
(20 / 84)  total_loss: 0.8901  cls_loss: 0.1157  distance loss: 0.1085  dir_loss: 0.2672  norm_loss: 0.1219  angle_loss: 0.1453  point_loss: 0.3436  energy_loss: 0.0550
(30 / 84)  total_loss: 0.8983  cls_loss: 0.1169  distance loss: 0.0746  dir_loss: 0.2258  norm_loss: 0.1015  angle_loss: 0.1242  point_loss: 0.4155  energy_loss: 0.0655
(40 / 84)  total_loss: 0.8363  cls_loss: 0.1456  distance loss: 0.0747  dir_loss: 0.2174  norm_loss: 0.0606  angle_loss: 0.1568  point_loss: 0.3427  energy_loss: 0.0560
(50 / 84)  total_loss: 0.8382  cls_loss: 0.1067  distance loss: 0.0689  dir_loss: 0.2860  norm_loss: 0.1354  angle_loss: 0.1506  point_loss: 0.3112  energy_loss: 0.0653
(60 / 84)  total_loss: 0.8165  cls_loss: 0.1263  distance loss: 0.0786  dir_loss: 0.2594  norm_loss: 0.0942  angle_loss: 0.1652  point_loss: 0.2974  energy_loss: 0.0548
(70 / 84)  total_loss: 0.6886  cls_loss: 0.1307  distance loss: 0.0809  dir_loss: 0.1977  norm_loss: 0.0831  angle_loss: 0.1146  point_loss: 0.2313  energy_loss: 0.0480
(80 / 84)  total_loss: 0.8294  cls_loss: 0.1210  distance loss: 0.0735  dir_loss: 0.2284  norm_loss: 0.0984  angle_loss: 0.1300  point_loss: 0.3452  energy_loss: 0.0612
Training Loss: 0.8670404915298734
Epoch: 660 : LR = [0.00025418658283290005]
(0 / 84)  total_loss: 0.7123  cls_loss: 0.1101  distance loss: 0.1007  dir_loss: 0.2134  norm_loss: 0.0764  angle_loss: 0.1369  point_loss: 0.2376  energy_loss: 0.0505
(10 / 84)  total_loss: 0.8336  cls_loss: 0.1729  distance loss: 0.0886  dir_loss: 0.2166  norm_loss: 0.0935  angle_loss: 0.1231  point_loss: 0.2921  energy_loss: 0.0634
(20 / 84)  total_loss: 0.8247  cls_loss: 0.1354  distance loss: 0.0801  dir_loss: 0.2609  norm_loss: 0.1053  angle_loss: 0.1556  point_loss: 0.2494  energy_loss: 0.0990
(30 / 84)  total_loss: 1.2477  cls_loss: 0.3560  distance loss: 0.1647  dir_loss: 0.2676  norm_loss: 0.1071  angle_loss: 0.1605  point_loss: 0.4074  energy_loss: 0.0520
(40 / 84)  total_loss: 0.9471  cls_loss: 0.1442  distance loss: 0.0899  dir_loss: 0.2888  norm_loss: 0.1233  angle_loss: 0.1655  point_loss: 0.3522  energy_loss: 0.0720
(50 / 84)  total_loss: 0.8846  cls_loss: 0.1098  distance loss: 0.0621  dir_loss: 0.2655  norm_loss: 0.1158  angle_loss: 0.1497  point_loss: 0.3761  energy_loss: 0.0710
(60 / 84)  total_loss: 0.8210  cls_loss: 0.1140  distance loss: 0.1277  dir_loss: 0.2597  norm_loss: 0.1176  angle_loss: 0.1421  point_loss: 0.2609  energy_loss: 0.0587
(70 / 84)  total_loss: 0.7683  cls_loss: 0.1064  distance loss: 0.0778  dir_loss: 0.2260  norm_loss: 0.1112  angle_loss: 0.1148  point_loss: 0.3010  energy_loss: 0.0571
(80 / 84)  total_loss: 0.7275  cls_loss: 0.1326  distance loss: 0.0672  dir_loss: 0.2097  norm_loss: 0.0878  angle_loss: 0.1219  point_loss: 0.2733  energy_loss: 0.0446
Saving to model/ctwr50_author/Ctw1500/TextBPN_res50_660.pth.
Training Loss: 0.8563408780665624
End.

GXYM commented 1 year ago

The three Dataset links posted on the homepage should be the old version; for the new versions, please refer to the official repositories: https://github.com/cs-chan/Total-Text-Dataset and https://github.com/Yuliang-Liu/Curve-Text-Detector. Your training does not look well converged; you can try tuning the learning rate, for example setting it to 0.0001 from the very start on CTW-1500. The old CTW-1500 annotations are very noisy, and combined with random data augmentation the training may oscillate: the loss decreases, suddenly increases, then starts decreasing again. I suggest switching to the new version of the CTW-1500 dataset.
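
Concretely, starting with the lower learning rate only changes the --lr flag of the training command quoted above; the sketch below keeps every other flag from the reporter's original run:

# Same training command as before, but with the learning rate set to 0.0001
# from the very start, as suggested; all other flags are unchanged.
python3 -u train_textBPN_author.py --exp_name Ctw1500 --net resnet50 --scale 1 \
  --max_epoch 660 --batch_size 12 --gpu 0 --input_size 640 \
  --optim Adam --lr 0.0001 --num_workers 8 --save_dir model/ctwr50_author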

AlphaNext commented 1 year ago

I tried what you suggested, but I still cannot reproduce the results. Give me a bit of time and I will post the relevant logs.

GXYM commented 1 year ago

Did you use the new version of the dataset, or did you only change the learning rate? I will go over the original code and rerun this experiment with this version of the code for comparison, to check whether a bug crept in when this version of the code was merged, or whether some setting is responsible.

AlphaNext commented 1 year ago

Thanks for taking the time to check, much appreciated!

GXYM commented 1 year ago

We have preliminarily located the problem. What we can confirm is that the results on CTW1500 are reproducible; in fact, our reproduced results are better than the reported ones (because of random data augmentation, the best result of a run fluctuates slightly). To make sure the code behaves consistently across datasets, we will also verify the reproduction on Total-Text. After that we will update the code, summarize the code changes, and release the test logs. Please watch for the upcoming code updates.

The test log from directly running the training and test scripts is as follows:

640_1024_100 0.35/0.825 ALL :: AP=0.6632 - Precision=0.8515 - Recall=0.7722 - Fscore=0.8099
640_1024_105 0.35/0.825 ALL :: AP=0.6544 - Precision=0.8395 - Recall=0.7774 - Fscore=0.8072
640_1024_110 0.35/0.825 ALL :: AP=0.6883 - Precision=0.8492 - Recall=0.8018 - Fscore=0.8248
640_1024_115 0.35/0.825 ALL :: AP=0.6657 - Precision=0.8621 - Recall=0.7663 - Fscore=0.8114
640_1024_120 0.35/0.825 ALL :: AP=0.6580 - Precision=0.8451 - Recall=0.7699 - Fscore=0.8057
640_1024_125 0.35/0.825 ALL :: AP=0.6577 - Precision=0.8537 - Recall=0.7683 - Fscore=0.8087
640_1024_130 0.35/0.825 ALL :: AP=0.6809 - Precision=0.8626 - Recall=0.7839 - Fscore=0.8214
640_1024_135 0.35/0.825 ALL :: AP=0.6669 - Precision=0.8518 - Recall=0.7754 - Fscore=0.8118
640_1024_140 0.35/0.825 ALL :: AP=0.6896 - Precision=0.8660 - Recall=0.7920 - Fscore=0.8274
640_1024_145 0.35/0.825 ALL :: AP=0.6684 - Precision=0.8572 - Recall=0.7751 - Fscore=0.8141
640_1024_150 0.35/0.825 ALL :: AP=0.6791 - Precision=0.8726 - Recall=0.7705 - Fscore=0.8184
640_1024_155 0.35/0.825 ALL :: AP=0.6903 - Precision=0.8451 - Recall=0.8093 - Fscore=0.8268
640_1024_160 0.35/0.825 ALL :: AP=0.6880 - Precision=0.8580 - Recall=0.7937 - Fscore=0.8246
640_1024_165 0.35/0.825 ALL :: AP=0.6772 - Precision=0.8463 - Recall=0.7953 - Fscore=0.8200
640_1024_170 0.35/0.825 ALL :: AP=0.6857 - Precision=0.8497 - Recall=0.8018 - Fscore=0.8251
640_1024_175 0.35/0.825 ALL :: AP=0.6968 - Precision=0.8787 - Recall=0.7885 - Fscore=0.8311
640_1024_180 0.35/0.825 ALL :: AP=0.6832 - Precision=0.8473 - Recall=0.7995 - Fscore=0.8227
640_1024_185 0.35/0.825 ALL :: AP=0.6979 - Precision=0.8508 - Recall=0.8123 - Fscore=0.8311
640_1024_190 0.35/0.825 ALL :: AP=0.6934 - Precision=0.8686 - Recall=0.7927 - Fscore=0.8289
640_1024_195 0.35/0.825 ALL :: AP=0.6944 - Precision=0.8657 - Recall=0.7966 - Fscore=0.8297
640_1024_200 0.35/0.825 ALL :: AP=0.6974 - Precision=0.8494 - Recall=0.8165 - Fscore=0.8326
640_1024_205 0.35/0.825 ALL :: AP=0.6833 - Precision=0.8523 - Recall=0.7956 - Fscore=0.8230
640_1024_210 0.35/0.825 ALL :: AP=0.6821 - Precision=0.8675 - Recall=0.7790 - Fscore=0.8209
640_1024_215 0.35/0.825 ALL :: AP=0.7005 - Precision=0.8669 - Recall=0.7986 - Fscore=0.8314
640_1024_220 0.35/0.825 ALL :: AP=0.6879 - Precision=0.8670 - Recall=0.7859 - Fscore=0.8244
640_1024_225 0.35/0.825 ALL :: AP=0.6713 - Precision=0.8678 - Recall=0.7663 - Fscore=0.8139
640_1024_230 0.35/0.825 ALL :: AP=0.6994 - Precision=0.8444 - Recall=0.8175 - Fscore=0.8307
640_1024_235 0.35/0.825 ALL :: AP=0.7070 - Precision=0.8759 - Recall=0.8008 - Fscore=0.8367
640_1024_240 0.35/0.825 ALL :: AP=0.6847 - Precision=0.8722 - Recall=0.7787 - Fscore=0.8228
640_1024_245 0.35/0.825 ALL :: AP=0.6970 - Precision=0.8541 - Recall=0.8093 - Fscore=0.8311
640_1024_250 0.35/0.825 ALL :: AP=0.7015 - Precision=0.8534 - Recall=0.8119 - Fscore=0.8321
640_1024_255 0.35/0.825 ALL :: AP=0.6975 - Precision=0.8485 - Recall=0.8106 - Fscore=0.8291
640_1024_260 0.35/0.825 ALL :: AP=0.6754 - Precision=0.8572 - Recall=0.7787 - Fscore=0.8161
640_1024_265 0.35/0.825 ALL :: AP=0.6902 - Precision=0.8610 - Recall=0.7953 - Fscore=0.8268
640_1024_270 0.35/0.825 ALL :: AP=0.6844 - Precision=0.8678 - Recall=0.7829 - Fscore=0.8232
640_1024_275 0.35/0.825 ALL :: AP=0.7021 - Precision=0.8690 - Recall=0.8018 - Fscore=0.8340
640_1024_280 0.35/0.825 ALL :: AP=0.6991 - Precision=0.8442 - Recall=0.8198 - Fscore=0.8318
640_1024_285 0.35/0.825 ALL :: AP=0.7004 - Precision=0.8603 - Recall=0.8067 - Fscore=0.8326
640_1024_290 0.35/0.825 ALL :: AP=0.6953 - Precision=0.8449 - Recall=0.8129 - Fscore=0.8286
640_1024_295 0.35/0.825 ALL :: AP=0.7120 - Precision=0.8629 - Recall=0.8207 - Fscore=0.8413
640_1024_300 0.35/0.825 ALL :: AP=0.7091 - Precision=0.8422 - Recall=0.8315 - Fscore=0.8368
640_1024_305 0.35/0.825 ALL :: AP=0.7105 - Precision=0.8420 - Recall=0.8357 - Fscore=0.8389
640_1024_310 0.35/0.825 ALL :: AP=0.7002 - Precision=0.8469 - Recall=0.8220 - Fscore=0.8343
640_1024_315 0.35/0.825 ALL :: AP=0.6916 - Precision=0.8551 - Recall=0.8018 - Fscore=0.8276
640_1024_320 0.35/0.825 ALL :: AP=0.7021 - Precision=0.8529 - Recall=0.8142 - Fscore=0.8331
640_1024_325 0.35/0.825 ALL :: AP=0.7042 - Precision=0.8406 - Recall=0.8269 - Fscore=0.8337
640_1024_330 0.35/0.825 ALL :: AP=0.7052 - Precision=0.8578 - Recall=0.8139 - Fscore=0.8353
640_1024_335 0.35/0.825 ALL :: AP=0.7005 - Precision=0.8608 - Recall=0.8083 - Fscore=0.8338
640_1024_340 0.35/0.825 ALL :: AP=0.6979 - Precision=0.8683 - Recall=0.7976 - Fscore=0.8315
640_1024_345 0.35/0.825 ALL :: AP=0.6818 - Precision=0.8432 - Recall=0.7995 - Fscore=0.8208
640_1024_350 0.35/0.825 ALL :: AP=0.6933 - Precision=0.8465 - Recall=0.8090 - Fscore=0.8273
640_1024_355 0.35/0.825 ALL :: AP=0.6955 - Precision=0.8534 - Recall=0.8067 - Fscore=0.8294
640_1024_360 0.35/0.825 ALL :: AP=0.6930 - Precision=0.8762 - Recall=0.7819 - Fscore=0.8264
640_1024_365 0.35/0.825 ALL :: AP=0.6974 - Precision=0.8499 - Recall=0.8123 - Fscore=0.8307
640_1024_370 0.35/0.825 ALL :: AP=0.6947 - Precision=0.8456 - Recall=0.8139 - Fscore=0.8294
640_1024_375 0.35/0.825 ALL :: AP=0.6806 - Precision=0.8414 - Recall=0.7989 - Fscore=0.8196
640_1024_380 0.35/0.825 ALL :: AP=0.6957 - Precision=0.8538 - Recall=0.8070 - Fscore=0.8298
640_1024_385 0.35/0.825 ALL :: AP=0.6993 - Precision=0.8491 - Recall=0.8158 - Fscore=0.8321
640_1024_390 0.35/0.825 ALL :: AP=0.6997 - Precision=0.8530 - Recall=0.8113 - Fscore=0.8316
640_1024_395 0.35/0.825 ALL :: AP=0.6755 - Precision=0.8327 - Recall=0.8015 - Fscore=0.8168
640_1024_400 0.35/0.825 ALL :: AP=0.7079 - Precision=0.8689 - Recall=0.8057 - Fscore=0.8361
640_1024_405 0.35/0.825 ALL :: AP=0.6922 - Precision=0.8531 - Recall=0.8048 - Fscore=0.8282
640_1024_410 0.35/0.825 ALL :: AP=0.7096 - Precision=0.8579 - Recall=0.8204 - Fscore=0.8387
640_1024_415 0.35/0.825 ALL :: AP=0.7033 - Precision=0.8450 - Recall=0.8230 - Fscore=0.8339
640_1024_420 0.35/0.825 ALL :: AP=0.6921 - Precision=0.8343 - Recall=0.8220 - Fscore=0.8281
640_1024_425 0.35/0.825 ALL :: AP=0.6949 - Precision=0.8466 - Recall=0.8129 - Fscore=0.8294
640_1024_430 0.35/0.825 ALL :: AP=0.6924 - Precision=0.8455 - Recall=0.8100 - Fscore=0.8274
640_1024_435 0.35/0.825 ALL :: AP=0.6899 - Precision=0.8387 - Recall=0.8119 - Fscore=0.8251
640_1024_440 0.35/0.825 ALL :: AP=0.7132 - Precision=0.8779 - Recall=0.8061 - Fscore=0.8404
640_1024_445 0.35/0.825 ALL :: AP=0.6952 - Precision=0.8301 - Recall=0.8331 - Fscore=0.8316
640_1024_450 0.35/0.825 ALL :: AP=0.7022 - Precision=0.8593 - Recall=0.8083 - Fscore=0.8331
640_1024_455 0.35/0.825 ALL :: AP=0.6989 - Precision=0.8478 - Recall=0.8155 - Fscore=0.8314
640_1024_460 0.35/0.825 ALL :: AP=0.7011 - Precision=0.8488 - Recall=0.8181 - Fscore=0.8332
640_1024_465 0.35/0.825 ALL :: AP=0.7035 - Precision=0.8504 - Recall=0.8191 - Fscore=0.8345
640_1024_470 0.35/0.825 ALL :: AP=0.6974 - Precision=0.8583 - Recall=0.8057 - Fscore=0.8312
640_1024_475 0.35/0.825 ALL :: AP=0.7064 - Precision=0.8565 - Recall=0.8171 - Fscore=0.8364
640_1024_480 0.35/0.825 ALL :: AP=0.6938 - Precision=0.8571 - Recall=0.8015 - Fscore=0.8284
640_1024_485 0.35/0.825 ALL :: AP=0.6903 - Precision=0.8358 - Recall=0.8181 - Fscore=0.8269
640_1024_490 0.35/0.825 ALL :: AP=0.7216 - Precision=0.8653 - Recall=0.8269 - Fscore=0.8457
640_1024_495 0.35/0.825 ALL :: AP=0.6949 - Precision=0.8535 - Recall=0.8054 - Fscore=0.8288
640_1024_500 0.35/0.825 ALL :: AP=0.7022 - Precision=0.8380 - Recall=0.8295 - Fscore=0.8337
640_1024_505 0.35/0.825 ALL :: AP=0.7058 - Precision=0.8461 - Recall=0.8259 - Fscore=0.8359
640_1024_510 0.35/0.825 ALL :: AP=0.7020 - Precision=0.8359 - Recall=0.8302 - Fscore=0.8330
640_1024_515 0.35/0.825 ALL :: AP=0.7108 - Precision=0.8612 - Recall=0.8188 - Fscore=0.8394
640_1024_520 0.35/0.825 ALL :: AP=0.6952 - Precision=0.8337 - Recall=0.8250 - Fscore=0.8293
640_1024_525 0.35/0.825 ALL :: AP=0.6513 - Precision=0.7806 - Recall=0.8256 - Fscore=0.8025
640_1024_530 0.35/0.825 ALL :: AP=0.7150 - Precision=0.8620 - Recall=0.8227 - Fscore=0.8419
640_1024_535 0.35/0.825 ALL :: AP=0.7025 - Precision=0.8563 - Recall=0.8136 - Fscore=0.8344
640_1024_540 0.35/0.825 ALL :: AP=0.6968 - Precision=0.8436 - Recall=0.8207 - Fscore=0.8320
640_1024_545 0.35/0.825 ALL :: AP=0.7020 - Precision=0.8393 - Recall=0.8289 - Fscore=0.8340
640_1024_550 0.35/0.825 ALL :: AP=0.7073 - Precision=0.8600 - Recall=0.8152 - Fscore=0.8370
640_1024_555 0.35/0.825 ALL :: AP=0.7098 - Precision=0.8350 - Recall=0.8413 - Fscore=0.8381
640_1024_560 0.35/0.825 ALL :: AP=0.6873 - Precision=0.8209 - Recall=0.8279 - Fscore=0.8244
640_1024_565 0.35/0.825 ALL :: AP=0.7031 - Precision=0.8477 - Recall=0.8237 - Fscore=0.8355
640_1024_570 0.35/0.825 ALL :: AP=0.7034 - Precision=0.8444 - Recall=0.8240 - Fscore=0.8340
640_1024_575 0.35/0.825 ALL :: AP=0.6792 - Precision=0.8153 - Recall=0.8217 - Fscore=0.8185
640_1024_580 0.35/0.825 ALL :: AP=0.6882 - Precision=0.8304 - Recall=0.8217 - Fscore=0.8260
640_1024_585 0.35/0.825 ALL :: AP=0.6805 - Precision=0.8316 - Recall=0.8110 - Fscore=0.8211
640_1024_590 0.35/0.825 ALL :: AP=0.7039 - Precision=0.8354 - Recall=0.8321 - Fscore=0.8338
640_1024_595 0.35/0.825 ALL :: AP=0.7045 - Precision=0.8492 - Recall=0.8204 - Fscore=0.8345
640_1024_600 0.35/0.825 ALL :: AP=0.6956 - Precision=0.8333 - Recall=0.8259 - Fscore=0.8296
640_1024_605 0.35/0.825 ALL :: AP=0.7063 - Precision=0.8498 - Recall=0.8227 - Fscore=0.8360
640_1024_610 0.35/0.825 ALL :: AP=0.6922 - Precision=0.8296 - Recall=0.8250 - Fscore=0.8273
640_1024_615 0.35/0.825 ALL :: AP=0.6970 - Precision=0.8459 - Recall=0.8175 - Fscore=0.8314
640_1024_620 0.35/0.825 ALL :: AP=0.6971 - Precision=0.8452 - Recall=0.8152 - Fscore=0.8299
640_1024_625 0.35/0.825 ALL :: AP=0.7046 - Precision=0.8491 - Recall=0.8214 - Fscore=0.8350
640_1024_630 0.35/0.825 ALL :: AP=0.6983 - Precision=0.8487 - Recall=0.8139 - Fscore=0.8309
640_1024_635 0.35/0.825 ALL :: AP=0.6959 - Precision=0.8364 - Recall=0.8233 - Fscore=0.8298
640_1024_640 0.35/0.825 ALL :: AP=0.6992 - Precision=0.8524 - Recall=0.8129 - Fscore=0.8322
640_1024_645 0.35/0.825 ALL :: AP=0.6922 - Precision=0.8312 - Recall=0.8220 - Fscore=0.8266
640_1024_650 0.35/0.825 ALL :: AP=0.6848 - Precision=0.8387 - Recall=0.8087 - Fscore=0.8234
640_1024_655 0.35/0.825 ALL :: AP=0.6947 - Precision=0.8377 - Recall=0.8211 - Fscore=0.8293
640_1024_660 0.35/0.825 ALL :: AP=0.6893 - Precision=0.8338 - Recall=0.8175 - Fscore=0.8255
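
A log like this comes from evaluating one saved checkpoint every 5 epochs. As a rough sketch of how such a batch evaluation could be driven (assuming the 0.35/0.825 pair in each log line corresponds to dis_threshold/cls_threshold, which is inferred from the log format rather than documented):

# Hypothetical batch-evaluation loop: one eval_textBPN.py run per saved
# checkpoint, epochs 100..660 in steps of 5 as in the log above.
for i in $(seq 100 5 660); do
  python3 eval_textBPN.py --net resnet50 --scale 1 --exp_name Ctw1500 \
    --checkepoch "$i" --test_size 640 1024 \
    --dis_threshold 0.35 --cls_threshold 0.825 \
    --gpu 0 --save_dir model/ctwr50_author
done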

(By the way, how do you insert images in the comment area?) We will finish checking and updating the code as soon as possible.

GXYM commented 1 year ago

There were some bugs in the old version of the training code. We have updated the code; the updated code trains stably and achieves better reproduction results than before. The details are described here, including the code changes and their impact, the reproduction details, and the test logs.

Recurrence details

Our replicated performance (without extra data for pre-training) using the updated code:

Datasets     pre-training   Model      recall   precision   F-measure   FPS
Total-Text   -              Res50-1s   84.22    91.88       87.88       13
CTW-1500     -              Res50-1s   82.69    86.53       84.57       14

AlphaNext commented 1 year ago

Thanks for documenting the reproduction; I will pull the latest code and try again. Much appreciated! PS: you can insert images by dragging them directly into the comment input box.

AlphaNext commented 1 year ago

Thanks for the update and the reproduction. I directly tested the reproduced models provided at the links; the results differ from the published ones by 0.34% and 0.5% respectively. Is this gap normal? The specific procedure is as follows:

python3 eval_textBPN.py --net resnet50 --scale 1 --exp_name Ctw1500 --checkepoch 490 --test_size 640 1024 --dis_threshold 0.375 --cls_threshold 0.8 --gpu 0 --save_dir NoPre-tain
# Results:
AP: 0.7162, recall: 0.8338, pred: 0.8510, FM: 0.8423

python3 eval_textBPN.py --net resnet50 --scale 1 --exp_name Totaltext --checkepoch 565 --test_size 640 1024 --dis_threshold 0.325 --cls_threshold 0.85 --gpu 0 --save_dir NoPre-tain
# Results:
Config: tr: 0.7 - tp: 0.6
ALL :: Precision=0.9092 - Recall=0.8411 - Fscore=0.8738

To rule out interference from the test-set labels, I also tested the following models provided by the author; the difference on CTW1500 is very small, and Total-Text differs by 0.37%:

python3 eval_textBPN.py --net resnet50 --scale 1 --exp_name Ctw1500 --checkepoch 155 --test_size 640 1024 --dis_threshold 0.375 --cls_threshold 0.8 --gpu 0 --save_dir TextBPN-PP
# Results:
Published: recall: 83.77 | precision: 87.30 | FM: 85.50
My test: AP: 0.7383, recall: 0.8374, pred: 0.8723, FM: 0.8545

* [pretrain-- Totaltext-res50-1s](https://drive.google.com/file/d/11AtAA429JCha8AZLrp3xVURYZOcCC2s1/view?usp=sharing)

python3 -u eval_textBPN.py --net resnet50 --scale 1 --exp_name Totaltext --checkepoch 285 --test_size 640 1024 --dis_threshold 0.325 --cls_threshold 0.85 --gpu 0 --save_dir TextBPN-PP
# Results:
Published: recall: 85.34 | precision: 91.81 | FM: 88.46
My test:
Config: tr: 0.7 - tp: 0.6
ALL :: Precision=0.9157 - Recall=0.8486 - Fscore=0.8809

GXYM commented 1 year ago

This deviation is basically normal, though the test parameters you used seem to differ from the ones we provide. I am not sure whether the gap comes from the different parameters or from the GPU; in our tests, different GPUs can show tiny performance differences due to differences in numerical precision. In any case, judging from your results, the gap from our reported numbers is small, so there is no need to worry too much; our method remains effective.

aamna2401 commented 1 year ago

I am struggling to reproduce the results, and I still can't see how you did it.

GXYM commented 1 year ago

What problems have you encountered?