Sanster / tf_ctpn

Tensorflow CTPN

ICDAR #1

Open Sanster opened 6 years ago

Sanster commented 6 years ago

ICDAR13

| Net | Dataset | Recall | Precision | Hmean |
| --- | --- | --- | --- | --- |
| Origin CTPN | ICDAR13 + ? | 73.72% | 92.77% | 82.15% |
| vgg16 | 6000, H flipped | 63.69% | 71.46% | 67.35% |
| vgg16 new_bilstm | cleaned, latin 2765 + icdar13(229) | 64.64% | 83.32% | 72.81% |
| resnet101 | 6000 | 66.47% | 73.03% | 69.59% |
| vgg16 | cleaned, latin 2765 + icdar13(229) | 63.25% | 79.55% | 70.47% |
| res101 | cleaned, latin 2765 + icdar13(229) | 63.87% | 76.58% | 69.65% |
| res101 new_bilstm | cleaned, latin 2765 + icdar13(229) | 66.68% | 77.53% | 71.70% |

| Net | Dataset | Recall | Precision | Hmean |
| --- | --- | --- | --- | --- |
| Origin CTPN | ICDAR13 + ? | 73.72% | 92.77% | 82.15% |
| vgg16 | without commits https://github.com/Sanster/tf_ctpn/commit/dc533e030e5431212c1d4dbca0bcd7e594a8a368 and https://github.com/Sanster/tf_ctpn/commit/7ae3d50d72bbdccb16f00987a5edb97659d6fbf2, data provided by @eragonruan | 63.69% | 71.46% | 67.35% |
| vgg16 | with commit https://github.com/Sanster/tf_ctpn/commit/dc533e030e5431212c1d4dbca0bcd7e594a8a368, without commit https://github.com/Sanster/tf_ctpn/commit/7ae3d50d72bbdccb16f00987a5edb97659d6fbf2, data provided by @eragonruan | 69.70% | 70.10% | 69.90% |
| vgg16 | MLT17 latin/chn new ground truth + icdar13 training data | 74.26% | 82.46% | 78.15% |
interxuxing commented 6 years ago

Thank you for your comprehensive experimental results. Why are the results from res101 worse than those from vgg?

Sanster commented 6 years ago

I used the same hyperparameters for training vgg16 and res101; maybe they are not suitable for res101, or maybe res101 needs more training steps.

Nic-Ma commented 5 years ago

Hi @Sanster, the training parameters you list here are train steps: 80k, lr: 0.00001, but your code uses lr 0.001 and 40k steps. Which values actually give the better results? Thanks.

Sanster commented 5 years ago

@toxic2m The numbers above were trained with train steps: 80k, lr: 0.00001.

Nic-Ma commented 5 years ago

@Sanster Thanks! I train with Caffe; with lr=0.001, steps=40K, iter_size=10 I can get a usable model, while lr=0.00001 does not converge. Also, which tool did you use to draw the ground truth for your training set? I'd like to visualize the label boxes in the current dataset. Thanks.
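
A note on comparing these settings: Caffe's iter_size accumulates gradients over several mini-batches before each weight update, so lr=0.001 with iter_size=10 is not directly comparable to a single-batch lr=0.00001. Below is a rough sketch of the idea in TF2-style Python; `model`, `loss_fn` and the batching are placeholders, not code from this repo.

```python
import tensorflow as tf

# Sketch of Caffe-style iter_size (gradient accumulation), illustrative only.
# With iter_size=10 each weight update sees gradients averaged over 10
# mini-batches, which changes the effective step size compared with
# single-batch updates at a smaller learning rate.
ITER_SIZE = 10
optimizer = tf.keras.optimizers.SGD(learning_rate=0.001, momentum=0.9)

def accumulated_step(model, loss_fn, batches):
    # `batches` is an iterable of ITER_SIZE (images, targets) mini-batches.
    accum = [tf.zeros_like(v) for v in model.trainable_variables]
    for images, targets in batches:
        with tf.GradientTape() as tape:
            loss = loss_fn(model(images, training=True), targets)
        grads = tape.gradient(loss, model.trainable_variables)
        accum = [a + g for a, g in zip(accum, grads)]
    # Average the accumulated gradients, then apply a single update.
    optimizer.apply_gradients(
        [(a / ITER_SIZE, v) for a, v in zip(accum, model.trainable_variables)]
    )
```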

Sanster commented 5 years ago

@toxic2m I wrote my own tool for that: https://github.com/Sanster/AnyLabel

Nic-Ma commented 5 years ago

@Sanster Thanks! I'll figure out how to get your React project running. By the way, I'm Nic from NVIDIA Shanghai; could we discuss the CTPN training issues in more detail over email or WeChat? My email is nma@nvidia.com. Thanks.

Nic-Ma commented 5 years ago

Hi @Sanster, I compared the model trained on your dataset with the original CTPN author's model, and mine is much worse at detecting small text. Could it be that your dataset contains too little small text? Have you updated the dataset recently? Thanks.

Sanster commented 5 years ago

@toxic2m Small text is indeed fairly scarce: part of it gets filtered out when splitting the minAreaRect, because my own use case doesn't involve much small text. Also, I'm not sure whether you did side refinement; I remember the authors' paper mentions that side refinement improves small-text detection.

Sorry for never replying to your email, by the way, I rarely use it.
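
The minAreaRect splitting mentioned above breaks each ground-truth text region into fixed-width 16 px boxes and drops regions that are too small. The sketch below is a rough, hypothetical illustration: the threshold and helper name are made up, and the repo's actual preprocessing may differ.

```python
import cv2
import numpy as np

ANCHOR_WIDTH = 16      # CTPN uses fixed-width (16 px) proposals
MIN_TEXT_HEIGHT = 8    # hypothetical threshold; the repo's value may differ

def split_gt_box(quad_pts):
    """Split one quadrilateral GT region into 16 px-wide boxes, dropping small text.

    quad_pts: (4, 2) array of corner points. Rotation is ignored here for
    simplicity; real preprocessing keeps more of the geometry.
    """
    (cx, cy), (w, h), _angle = cv2.minAreaRect(quad_pts.astype(np.float32))
    rect_w, rect_h = max(w, h), min(w, h)
    if rect_h < MIN_TEXT_HEIGHT:
        return []  # this is where small text gets filtered out
    x_min, x_max = cx - rect_w / 2.0, cx + rect_w / 2.0
    y_min, y_max = cy - rect_h / 2.0, cy + rect_h / 2.0
    boxes = []
    x = x_min
    while x < x_max:
        boxes.append((x, y_min, min(x + ANCHOR_WIDTH, x_max), y_max))
        x += ANCHOR_WIDTH
    return boxes

# A 100x6 px region is dropped entirely; a 100x20 px one yields 7 boxes.
print(len(split_gt_box(np.array([[0, 0], [100, 0], [100, 6], [0, 6]]))))    # 0
print(len(split_gt_box(np.array([[0, 0], [100, 0], [100, 20], [0, 20]]))))  # 7
```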

Nic-Ma commented 5 years ago

@Sanster

  1. Thanks a lot for sharing. I'd like to clean up the dataset myself, keep more of the small text, and fix some box positions. Can the dataset provided by the other TensorFlow CTPN author be edited in your React app? If so, I'll figure out how to get your React project running and filter the data manually.
  2. As for side refinement, I can't find a dataset with side-refinement annotations; doesn't the training set need labels for that?
  3. Would a quick phone or WeChat chat be convenient? 18221892546

Nic NVIDIA

Sanster commented 5 years ago

@toxic2m I've added you on WeChat.

  1. Yes, it can be loaded: everything is in VOC format, and you can delete boxes inside the app.
  2. For side refinement, my understanding is that no extra annotation is needed: during training you run text_connector to get the text bounding boxes, then compute the loss against the left-most and right-most small boxes of the ground truth.
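
For anyone implementing the loss described in point 2: the CTPN paper defines the side-refinement target as o* = (x*_side - c_x^a) / w^a, computed only for the left- and right-most proposals of a text line, with a smooth-L1 regression loss. Here is a minimal numeric sketch; the variable names are illustrative, not this repo's API.

```python
import numpy as np

ANCHOR_WIDTH = 16.0  # CTPN fixes the proposal width at 16 pixels

def side_refine_target(gt_x_side, anchor_cx, anchor_w=ANCHOR_WIDTH):
    # Side-refinement target from the CTPN paper:
    # o* = (x*_side - c_x^a) / w^a, used only for the left/right-most
    # proposals of a connected text line.
    return (gt_x_side - anchor_cx) / anchor_w

def smooth_l1(x):
    # Standard smooth-L1 regression loss.
    x = np.abs(x)
    return np.where(x < 1.0, 0.5 * x ** 2, x - 0.5)

# Example: a left-side anchor centred at x=104 whose text line's ground-truth
# left edge sits at x=98 has target offset (98 - 104) / 16 = -0.375.
target = side_refine_target(gt_x_side=98.0, anchor_cx=104.0)
predicted_offset = -0.30                     # hypothetical network output
loss = smooth_l1(predicted_offset - target)  # smooth_l1(0.075) ~= 0.0028
```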
edis219 commented 5 years ago

@Sanster Do you have any test results for MobileNet v2?

hrfng commented 5 years ago

Hi @Sanster, I evaluated the model you provide (vgg16_latin_chn_newdata.zip) on the ICDAR 2013 test set. The ICDAR 2013 metric I get is {"recall": 0.742648401826484, "precision": 0.7948872180451129, "hmean": 0.7678803853179051}, which is about 1.5 points below your result. Am I missing something?
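
For reference, the hmean in the ICDAR metrics is just the harmonic mean of precision and recall, which is consistent with the numbers quoted above:

```python
# hmean is the harmonic mean of precision and recall (ICDAR evaluation convention).
recall = 0.742648401826484
precision = 0.7948872180451129
hmean = 2 * precision * recall / (precision + recall)
print(hmean)  # ~0.76788, matching the reported 0.7678803853179051
```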

Test Environment:

yaya325 commented 5 years ago

Hello, are the test results you list for the original author's model numbers you measured yourself, or are they quoted directly from the author? I downloaded the author's model and tested it with Caffe, and the results are not as good as reported in the paper; the f-score is only around 0.75.