YCG09 / chinese_ocr

CTPN + DenseNet + CTC based end-to-end Chinese OCR implemented using tensorflow and keras
Apache License 2.0
2.75k stars 1.08k forks

vgg vs densenet #5

Closed pharrellyhy closed 6 years ago

pharrellyhy commented 6 years ago

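Hi, the earlier versions of both CTPN and CRNN were based on VGG. Have you measured DenseNet's inference time against VGG? Also, how was the accuracy you report calculated? Is it based on edit distance?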

YCG09 commented 6 years ago

Recognition is about 10 times faster than CRNN, and the reported accuracy is the validation accuracy (val acc).
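
For reference, the two accuracy notions raised in the question can be compared with a minimal sketch. The reply does not say how val acc is computed, so this only illustrates the difference between exact-match accuracy and an edit-distance-based score; the function names and sample strings are hypothetical and not taken from the repository.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions cost 1 each."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def exact_match_accuracy(preds, targets):
    """Fraction of text lines predicted with no error at all."""
    hits = sum(p == t for p, t in zip(preds, targets))
    return hits / len(targets)


def edit_distance_accuracy(preds, targets):
    """1 minus total edit distance divided by total target length."""
    total_dist = sum(edit_distance(p, t) for p, t in zip(preds, targets))
    total_len = sum(len(t) for t in targets)
    return 1.0 - total_dist / total_len


# Hypothetical predictions: the second line has one wrong character.
targets = ["中文识别", "文字检测"]
preds = ["中文识别", "文字检侧"]

print(exact_match_accuracy(preds, targets))
print(edit_distance_accuracy(preds, targets))
```

With one wrong character in one of two lines, the exact-match score drops much further than the edit-distance score, which is why it matters which of the two a reported accuracy refers to.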

pharrellyhy commented 6 years ago

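With my current CRNN-based recognizer, recognition takes about 600 ms when CTPN crops out 40 bounding boxes. I am trying to replace VGG with ResNet. Would the 10x speedup you mention bring this under 100 ms? When I timed VGG on its own it took only a few tens of ms, so I am not sure how much faster ResNet or DenseNet would be, although accuracy should improve. Did you fine-tune from a model pretrained on ImageNet, or did you train everything from scratch? Thanks!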

YCG09 commented 6 years ago

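On a Titan X, CRNN takes about 50-60 ms per bounding box in my tests, while DenseNet takes 6-8 ms. The main change relative to CRNN is that the two BLSTM layers are removed; RNNs are too slow. I did no fine-tuning; the model was trained entirely from scratch.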
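
The per-box timings above can be turned into rough totals for the 40-box case raised earlier. This is only back-of-the-envelope arithmetic under the assumption that boxes are recognized one at a time with no batching; the helper name is hypothetical, and batched inference or different hardware would change the totals.

```python
def sequential_total_ms(per_box_ms: float, num_boxes: int) -> float:
    """Total recognition time if boxes are processed one at a time."""
    return per_box_ms * num_boxes


# Per-box timings quoted in the thread (Titan X), in milliseconds.
CRNN_MS = (50.0, 60.0)
DENSENET_MS = (6.0, 8.0)
NUM_BOXES = 40  # box count from the earlier question

crnn_total = tuple(sequential_total_ms(t, NUM_BOXES) for t in CRNN_MS)
densenet_total = tuple(sequential_total_ms(t, NUM_BOXES) for t in DENSENET_MS)

# Speedup range: the low end pairs the fastest CRNN with the slowest DenseNet.
speedup_low = CRNN_MS[0] / DENSENET_MS[1]
speedup_high = CRNN_MS[1] / DENSENET_MS[0]

print(f"CRNN, {NUM_BOXES} boxes: {crnn_total[0]:.0f}-{crnn_total[1]:.0f} ms")
print(f"DenseNet, {NUM_BOXES} boxes: {densenet_total[0]:.0f}-{densenet_total[1]:.0f} ms")
print(f"Speedup: {speedup_low:.2f}x to {speedup_high:.2f}x")
```

Under these assumptions the quoted numbers give a speedup between 6.25x and 10x, and 40 boxes processed sequentially would take 240-320 ms with DenseNet.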

pharrellyhy commented 6 years ago

Yes, I also think BLSTM is too slow, though it may still help for English words or phrases. How many epochs did you train in total?

YCG09 commented 6 years ago

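Without an RNN, 3 epochs are usually enough. With an RNN, convergence is slower: with 2 BLSTM layers it takes about 5 epochs for val loss to reach its best value. Of course, this also depends on the learning rate.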

pharrellyhy commented 6 years ago

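I have been thinking about a question lately: could removing the LSTM cause problems? Since an LSTM can handle variable-length input, doesn't removing it mean the input size has to be fixed? I have never fully understood LSTMs, so I am not sure what the impact would be. Judging by the results alone, it seems OK.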

YCG09 commented 6 years ago

The tensor that the CNN outputs to the LSTM has a fixed size.

pharrellyhy commented 6 years ago

Right. I assume you also pad inputs of different lengths to the same size? I haven't read your code closely.
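
Padding inputs of different lengths, as asked about above, might look like the following minimal sketch. It assumes grayscale text-line crops that were already resized to a common height and are stored as nested lists of pixel values; the helper names and the white fill value 255 are hypothetical, and the thread does not confirm that the repository pads this way.

```python
from typing import List

Image = List[List[int]]  # rows of grayscale pixel values


def pad_to_width(image: Image, target_width: int, fill: int = 255) -> Image:
    """Right-pad every row with `fill` so the image is target_width wide."""
    padded = []
    for row in image:
        if len(row) > target_width:
            raise ValueError("image is wider than target_width")
        padded.append(row + [fill] * (target_width - len(row)))
    return padded


def pad_batch(images: List[Image], fill: int = 255) -> List[Image]:
    """Pad all images to the width of the widest one in the batch."""
    target_width = max(len(img[0]) for img in images)
    return [pad_to_width(img, target_width, fill) for img in images]


# Two tiny 2-row "images" with widths 3 and 5.
narrow = [[0, 0, 0], [0, 0, 0]]
wide = [[0] * 5, [0] * 5]
batch = pad_batch([narrow, wide])

print([len(img[0]) for img in batch])
```

After padding, every image in the batch has the same width, so the batch can be stacked into a single fixed-size tensor before being fed to the network.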
