Closed by xinli94 5 years ago
The pooling operations reduce the horizontal size of the features, so recognizing a single character requires an input at least 8 pixels wide (10 pixels for two characters). That is a hard limit on the input data.
You could pad with zeros, but the results may not be very good, since there will be strong filter responses at the edges.
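For anyone who wants to try the padding route anyway, here is a minimal sketch of widening a narrow image before it reaches the width filter. It is not code from the repository: `pad_to_min_width` and its parameters are hypothetical names, the default of 8 comes from the limit mentioned above, and the image is represented as a plain list of pixel rows to keep the example dependency-free.

```python
def pad_to_min_width(image, min_width=8, fill=0):
    """Return a copy of `image` (a list of equal-length pixel rows) padded
    on both sides with `fill` so every row is at least `min_width` wide.

    Hypothetical helper for illustration; not part of cnn_lstm_ctc_ocr.
    """
    if not image:
        return []
    width = len(image[0])
    if any(len(row) != width for row in image):
        raise ValueError("all rows must have the same width")
    # Number of missing columns; zero if the image is already wide enough.
    deficit = max(0, min_width - width)
    left = deficit // 2
    right = deficit - left  # any odd leftover column goes on the right
    return [[fill] * left + list(row) + [fill] * right for row in image]


# A toy 3-row, 5-pixel-wide "image" of a narrow character.
narrow = [[255, 0, 0, 0, 255] for _ in range(3)]
padded = pad_to_min_width(narrow)
print(len(padded[0]))  # 8
```

The `fill` argument is there because of the caveat above: zeros against a bright background create an artificial edge, so passing the image's own background intensity may be worth trying, with no guarantee that it improves recognition.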
Got it. Thanks so much!
Hi,
When I created tfrecords for my custom dataset, a lot of images got filtered out. Each input image contains only one character, so the processed image width is less than `min_width` (https://github.com/weinman/cnn_lstm_ctc_ocr/blob/master/src/mjsynth-tfrecord.py#L143). I am wondering what the correct way to deal with single-character inputs is. Do I need to set `min_width` to a smaller value (I already tried 3, and many images were still filtered out), or should I pad the input image with zeros?

Thanks,
Xin