Bartzi / stn-ocr

Code for the paper STN-OCR: A single Neural Network for Text Detection and Text Recognition
https://arxiv.org/abs/1707.08831
GNU General Public License v3.0

Question about the N grids in the paper #27

Closed: caoyangcr7 closed this issue 5 years ago

caoyangcr7 commented 5 years ago

Hello, I have a question about the N grids in the paper. The paper says:

The first is the localization network that takes the input image and predicts N transformation matrices, that are applied to N identical grids, forming N different sampling grids

How do we determine the value of N?

Bartzi commented 5 years ago

N is the number of words or characters that you want to recognize.
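To make the quoted passage concrete, here is a minimal NumPy sketch of how N affine transformation matrices applied to one shared base grid yield N different sampling grids. It is an illustration only, not the repository's code: the helper names `make_base_grid` and `make_sampling_grids`, the normalized [-1, 1] coordinate convention, and the example matrix values are all assumptions. Only `num_time_steps` is a name taken from this thread.

```python
import numpy as np


def make_base_grid(height, width):
    """Return a (3, height*width) array of homogeneous coordinates in [-1, 1]."""
    ys, xs = np.meshgrid(
        np.linspace(-1.0, 1.0, height),
        np.linspace(-1.0, 1.0, width),
        indexing="ij",
    )
    ones = np.ones_like(xs)
    # Rows are x, y, 1 so that a 2x3 affine matrix can be applied directly.
    return np.stack([xs.ravel(), ys.ravel(), ones.ravel()], axis=0)


def make_sampling_grids(thetas, height, width):
    """Apply N affine matrices of shape (N, 2, 3) to the same base grid.

    Returns an array of shape (N, 2, height, width) holding, for each of the
    N grids, the x and y source coordinates to sample from the input image.
    """
    thetas = np.asarray(thetas, dtype=np.float64)
    base = make_base_grid(height, width)   # (3, H*W), shared by all N grids
    grids = thetas @ base                  # broadcast matmul -> (N, 2, H*W)
    return grids.reshape(thetas.shape[0], 2, height, width)


# N is fixed up front (hypothetical value); one matrix per expected text region.
num_time_steps = 3
identity = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
thetas = np.stack([identity] * num_time_steps)
thetas[:, 0, 0] = 0.5                 # each grid covers half the image width
thetas[:, 0, 2] = [-0.5, 0.0, 0.5]    # left, centre, and right regions
grids = make_sampling_grids(thetas, height=4, width=8)
print(grids.shape)
```

In an actual spatial transformer the matrices would be predicted by the localization network rather than written by hand, and the resulting grids would be passed to a bilinear sampler to crop the N regions.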

caoyangcr7 commented 5 years ago

Thanks for your reply, I've got it. @Bartzi

caoyangcr7 commented 5 years ago

@Bartzi Sorry to bother you again, I have two more questions.

The first is still about N. Different training images may contain different numbers of words or characters, so does N change during training? In the source code, I found that N is set by the num_time_steps parameter. If N stays the same during training, what should we do when N is larger than the number of words or characters in an image?

The second question is about the recognition network. After the sampling step extracts N text regions from the original image, how do we find the corresponding label for each text region during training? For example, if we get two text regions, '16' and '18', and we have two labels, '16' and '18', how do we assign the label '16' to the text region '16' rather than to '18' during training?

I look forward to your reply. Thanks.
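The thread as captured here records no answer to the first question. As a hedged illustration only, one common convention when the number of time steps is fixed is to pad shorter label sequences with a reserved "blank" class up to `num_time_steps`. The sketch below assumes that convention; the helper `pad_labels`, the blank index 0, and the example label values are hypothetical and are not confirmed as the repository's behaviour.

```python
def pad_labels(labels, num_time_steps, blank=0):
    """Pad a label sequence with a blank class up to a fixed length.

    `blank=0` is an assumed convention, not taken from the repository.
    """
    labels = list(labels)
    if len(labels) > num_time_steps:
        raise ValueError(
            f"{len(labels)} labels do not fit into {num_time_steps} time steps"
        )
    return labels + [blank] * (num_time_steps - len(labels))


# Two text regions in the image, but the network always predicts four.
padded = pad_labels([16, 18], num_time_steps=4)
print(padded)
```

Under this assumption the extra time steps are trained to predict the blank class, so N can stay constant across images with different numbers of words or characters.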