MaybeShewill-CV / CRNN_Tensorflow

Convolutional Recurrent Neural Networks (CRNN) for Scene Text Recognition

Iteration speed is extremely slow #130

Closed lkj1114889770 closed 6 years ago

lkj1114889770 commented 6 years ago

Hi, when I run your model and train on my own dataset, with the batch size still at 32, the iteration speed is surprisingly slow. The GPU is a K40 with 12 GB, yet one epoch takes almost a minute. What could be the cause?

MaybeShewill-CV commented 6 years ago

@lkj1114889770 First check whether your GPU utilization is low.

lkj1114889770 commented 6 years ago

@MaybeShewill-CV I deleted the part of the config that sets the GPU memory usage, so it falls back to the default behavior of occupying all of the memory.
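(For reference, the removed configuration would normally look something like the TensorFlow 1.x sketch below; `sess_config` is an illustrative name and the 0.8 fraction is an arbitrary example, not a value from this repo.)

```python
import tensorflow as tf

# Sketch: cap TensorFlow's GPU memory allocation instead of the default
# behavior of claiming the entire card up front.
sess_config = tf.ConfigProto()
sess_config.gpu_options.per_process_gpu_memory_fraction = 0.8  # arbitrary cap
# Or let the allocation grow on demand:
# sess_config.gpu_options.allow_growth = True

sess = tf.Session(config=sess_config)
```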

MaybeShewill-CV commented 6 years ago

@lkj1114889770 I mean watch the GPU utilization while training is running ==!

lkj1114889770 commented 6 years ago

@MaybeShewill-CV I checked with nvidia-smi, and the memory is indeed fully occupied:

```
NVIDIA-SMI 367.57                 Driver Version: 367.57
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K40m          Off  | 0000:83:00.0     Off |                    0 |
| N/A   40C    P0    62W / 235W |  10964MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0     14556     C  python                                       10960MiB |
+-----------------------------------------------------------------------------+
```

I reduced the batch size to 16, but it is still very slow: about 15 s per iteration.

lkj1114889770 commented 6 years ago

@MaybeShewill-CV One more question: which dataset did you use for the Chinese recognition model? The usual scene-text datasets that contain Chinese feel nowhere near large enough. Does the Chinese version of the code differ much from the code on master, or is it only the input training dataset that is different?

MaybeShewill-CV commented 6 years ago

@lkj1114889770 I suggest modifying the data feed pipeline; most of the time is being spent on data I/O.
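(For reference, one common way to restructure the feed so I/O overlaps with compute is TensorFlow's `tf.data` API. The sketch below assumes TFRecord input; `parse_example_fn` is a hypothetical per-record parser, not a function from this repo.)

```python
import tensorflow as tf

def build_dataset(tfrecords_path, parse_example_fn, batch_size=32):
    """Sketch of an overlapped input pipeline: decode, shuffle, and batch
    records in background threads so the GPU is not blocked on data I/O."""
    dataset = tf.data.TFRecordDataset(tfrecords_path)
    # Decode several records in parallel instead of one at a time.
    dataset = dataset.map(parse_example_fn, num_parallel_calls=4)
    dataset = dataset.shuffle(buffer_size=1000)
    dataset = dataset.batch(batch_size)
    # Keep batches prepared ahead of the training step.
    dataset = dataset.prefetch(buffer_size=2)
    return dataset
```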

lkj1114889770 commented 6 years ago

@MaybeShewill-CV OK, I'll try that. And what about my other question: which dataset did you use to train the Chinese OCR? Does the Chinese version of the code differ much from master, or is it only the input training dataset that differs? Thanks.

jxlxt commented 6 years ago

@MaybeShewill-CV How exactly should the data feed pipeline be modified? Are there any concrete steps 😂? Also, when training Chinese OCR on 14,000+ images spanning 1,800+ classes, the cost stalls at around 30 and will not drop any further. Any suggestions? Thanks!

lkj1114889770 commented 6 years ago

@jxlxt Is yours as slow as mine? My training speed is far too slow right now.

jxlxt commented 6 years ago

@lkj1114889770 Yes, I'm also on a K40, and every epoch is very slow... English training used to be fast, but Chinese is very, very slow, 20-30 s per epoch. I had assumed the slowdown came from the much larger label set (from a few dozen classes straight to over a thousand), but today I see you have the same problem 😂. I also noticed the GPU util sitting at 0, but I have no idea how to solve it.

lkj1114889770 commented 6 years ago

@jxlxt Let's add each other on QQ and discuss; the digits in my ID are my QQ number.

lkj1114889770 commented 6 years ago

Found the root cause: ctc_beam_search is what consumes the time. Since it is only used to check accuracy, it does not need to run every epoch; running it only every 100 epochs speeds things up dramatically. For example, like this:

```python
for epoch in range(train_epochs):
    _, c = sess.run([optimizer, cost])

    if epoch % 100 == 0:
        # Run the expensive beam-search decode and accuracy check only
        # every 100 epochs, together with summaries and a checkpoint.
        _, c, seq_distance, preds, gt_labels, summary = sess.run(
            [optimizer, cost, sequence_dist, decoded, input_labels, merge_summary_op])
        accuracy = calculate_accuracy(decoder, preds, gt_labels)
        logger.info('**********************************************************************')
        logger.info('Epoch: {:d} cost= {:9f} seq distance= {:9f} train accuracy= {:9f}'.format(
            epoch, c, seq_distance, accuracy))
        logger.info('**********************************************************************')
        summary_writer.add_summary(summary=summary, global_step=epoch)
        saver.save(sess=sess, save_path=model_save_path, global_step=epoch)

    elif epoch % 10 == 0:
        logger.info('Epoch: {:d} cost= {:9f} '.format(epoch, c))
```

Also, since this is the lexicon-free setting, greedy search should be sufficient, and it is much cheaper to compute. @MaybeShewill-CV
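(For reference, the two decoders are drop-in alternatives in TensorFlow 1.x; in the sketch below, `net_out` and `seq_len` stand in for the network's time-major logits and sequence-length tensors, and the beam width is an arbitrary example.)

```python
# Beam search decoding: more accurate but noticeably slower. Suited to
# periodic evaluation rather than every training step.
decoded, log_prob = tf.nn.ctc_beam_search_decoder(
    net_out, seq_len, beam_width=10, merge_repeated=False)

# Greedy decoding: much faster, usually adequate in the lexicon-free setting.
decoded, log_prob = tf.nn.ctc_greedy_decoder(
    net_out, seq_len, merge_repeated=True)
```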

MaybeShewill-CV commented 6 years ago

@lkj1114889770 ctc beam search is only there to measure accuracy; there is no need to run it every epoch.

mdbenito commented 6 years ago

See also #16 and #71 for the same question