watsonyanghx / CNN_LSTM_CTC_Tensorflow

CNN+LSTM+CTC based OCR implemented using tensorflow.
MIT License
362 stars, 210 forks

can this algorithm deal with dynamic length characters? #15

Open dighexode opened 6 years ago

dighexode commented 6 years ago

I ran this code successfully on both the training set and the validation set. Then I changed one of the validation images to add 2 characters: previously it was '7+0 9', and I changed it to '7+0 9+7'. But it was recognized as '7+(0 * 9)'. The font style of the added '+7' is the same as in the rest of the image (I copied it from the image itself), so this is not a font style issue. I have attached the image I made. Please take a look. Can you tell me why?

martinbacsi commented 6 years ago

Maybe the image is wider than the width the model was given when it was trained, so the added characters are cropped off.

prolaser commented 6 years ago

Hey guys

Does anyone have ideas on how we can apply this network to images of different sizes? I have a dataset in which the image width varies. If anyone has ideas, please share. Thanks

anubhavrohatgi commented 6 years ago

Probably we can find the maximum dimensions in your dataset and pad the smaller images with zeros so that every image matches the largest size.
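A minimal sketch of that padding idea, assuming grayscale images stored as 2-D NumPy arrays that already share the same height. The helper name `pad_to_max_width` and the choice to pad on the right are assumptions for illustration, not code from this repository:

```python
import numpy as np


def pad_to_max_width(images, pad_value=0):
    """Right-pad 2-D images of shape (H, W) to the widest W in the list.

    Hypothetical helper: returns the padded batch of shape (N, H, max_W)
    and the original widths, which are useful if the model later needs
    to know where the real content ends.
    """
    heights = {img.shape[0] for img in images}
    if len(heights) != 1:
        raise ValueError("all images must share the same height; resize first")
    height = heights.pop()

    widths = np.array([img.shape[1] for img in images], dtype=np.int32)
    max_w = int(widths.max())

    # Start from a canvas filled with the padding value, then copy each
    # image into the left part of its row.
    batch = np.full((len(images), height, max_w), pad_value, dtype=images[0].dtype)
    for i, img in enumerate(images):
        batch[i, :, : img.shape[1]] = img
    return batch, widths


if __name__ == "__main__":
    imgs = [
        np.ones((32, 100), dtype=np.uint8),
        np.ones((32, 160), dtype=np.uint8),
    ]
    batch, widths = pad_to_max_width(imgs)
    print(batch.shape, widths)
```

If the image background is white rather than black, passing `pad_value=255` may be a better fit than zeros, so the padding looks like empty background to the network.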

prolaser commented 6 years ago

Hi anubhavrohatgi

Thanks for the quick reply. By the way, I am trying to use the IAM offline handwriting dataset for this. You have made a good point here. There is a good chance that padding the smaller images would help in this case. If you have any other ideas, I would be happy to hear them. I will try the padding, and if it works I'll update you guys about it.

zzks commented 6 years ago

Because this model cannot deal with OOV (out-of-vocabulary) inputs, you need to add samples like '7+0 * 9+7' to fine-tune your model. Regarding dynamic-length images, I have not fixed that yet. Any help would be highly appreciated!
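One piece that dynamic-length support usually needs is a per-image sequence length for the CTC loss, so that time steps past the real content of a narrower image are ignored. A rough sketch, assuming the convolutional layers shrink the width by a fixed factor with 'same'-style padding. The function name and the `total_stride=4` default are hypothetical and would have to match the actual network:

```python
import numpy as np


def ctc_sequence_lengths(widths, total_stride=4):
    """Estimate the number of time steps per image after the CNN.

    Assumption: the CNN reduces the width by `total_stride` overall and
    rounds up (as 'same' padding does). Check this against the real
    architecture before relying on it.
    """
    if total_stride <= 0:
        raise ValueError("total_stride must be positive")
    widths = np.asarray(widths, dtype=np.int64)
    # Integer ceiling division: ceil(width / total_stride)
    return (widths + total_stride - 1) // total_stride


if __name__ == "__main__":
    print(ctc_sequence_lengths([100, 160, 101]))
```

The resulting lengths would be passed to the loss and decoder in place of a single fixed sequence length for the whole batch.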