ilovin / lstm_ctc_ocr

Use CTC + tensorflow to OCR
https://ilovin.github.io/2017-04-06/tensorflow-lstm-ctc-ocr/
354 stars, 140 forks

Will this work with dual-line text? #15

Closed: gewenpulan closed this issue 6 years ago

gewenpulan commented 7 years ago

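Hello, I have just started looking into the OCR side of deep learning, and I see this project already works well for recognizing CAPTCHAs. How well does this model handle multi-line text? For example, if an image contains three lines of text to recognize, can I use this model as-is and label the data with the three lines in order, or does the model need to be modified before it can recognize them?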

Thummpy commented 7 years ago

For dual-line text, you could pre-process the image to split it into separate lines.
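One possible way to do that split is a horizontal projection: scan the rows and cut wherever a run of rows contains no ink. The sketch below is only an illustration, not code from this repository. It assumes the image is already binarized (nonzero means ink) and that the lines are separated by at least one fully blank row; `split_lines` and `min_height` are hypothetical names.

```python
def split_lines(image, min_height=1):
    """Return (top, bottom) row ranges, bottom exclusive, one per text line.

    `image` is a sequence of rows (nested lists or a 2-D array);
    any nonzero pixel is treated as ink. Assumes a binarized image
    with blank rows between lines.
    """
    bands = []
    start = None  # top row of the band currently being scanned, if any
    for y, row in enumerate(image):
        has_ink = any(row)
        if has_ink and start is None:
            start = y                      # a new text line begins
        elif not has_ink and start is not None:
            if y - start >= min_height:    # drop bands thinner than min_height
                bands.append((start, y))
            start = None
    # Close a band that runs to the bottom edge of the image.
    if start is not None and len(image) - start >= min_height:
        bands.append((start, len(image)))
    return bands


# Toy 7x6 "image" with two text lines separated by blank rows.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
]
bands = split_lines(image)
crops = [image[top:bottom] for top, bottom in bands]  # one crop per line
```

Each crop could then be resized to the input height the model expects and recognized as an ordinary single-line image. Skewed or touching lines would need something more robust than this.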

ilovin commented 7 years ago

It may not. I feed every column of the image to the LSTM, so if there are multiple lines, the network may get confused. However, "attention" may be a solution. PS: There is a paper called "STN" that uses a mesh grid to solve the problem.
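To make the column-by-column point concrete, here is a rough sketch of how an image can be turned into a sequence with one time step per pixel column. It is an assumption about the general idea, not the repository's actual input pipeline; `columns_as_timesteps` is a hypothetical helper.

```python
def columns_as_timesteps(image):
    """Turn a height x width image into a list of `width` time steps.

    Each time step is one pixel column, read top to bottom, so it holds
    `height` features. Illustrative only; not the repository's code.
    """
    return [list(column) for column in zip(*image)]


# Two stacked text lines in a tiny 3x3 image.
two_lines = [
    [1, 0, 1],  # upper text line
    [0, 0, 0],  # gap between the lines
    [0, 1, 1],  # lower text line
]
steps = columns_as_timesteps(two_lines)
```

Every time step now contains pixels from both the upper and the lower line, while CTC emits a single left-to-right label sequence, which is why stacked lines are likely to confuse the network.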

ilovin commented 6 years ago

If you have other questions, feel free to re-open.