Open ghost opened 4 years ago
Hi. In fact, it only depends on the LSTM output length T.
Thanks for your reply. If we don't change the network, does the LSTM output length T depend only on the width of the image?
I found the answer in your reply in https://github.com/Holmeyoung/crnn-pytorch/issues/17:
You need to calculate it: work out what the image width is after the conv and pooling layers. That width is the length T in the RNN.
So the output width of CRNN().cnn is the length T, and the text length should not exceed T? Is what I said here right? Thank you so much.
    self.cnn = cnn
    self.rnn = nn.Sequential(
        BidirectionalLSTM(512, nh, nh),
        BidirectionalLSTM(nh, nh, nclass))
Yeah
Got it, thank you so much.
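For anyone who wants to do the calculation mentioned above, here is a minimal sketch. It assumes the usual CRNN layer layout (3x3 convs with padding 1, two 2x2 pools with stride 2, two pools with stride 1 and padding 1 along the width, and a final 2x2 conv without padding). That layout is an assumption, not something confirmed in this thread, so check it against your own `cnn` definition. The names `conv_out`, `WIDTH_LAYERS` and `crnn_sequence_length` are made up for illustration.

```python
def conv_out(width, kernel, stride, padding):
    """Output width of one conv or pool layer (standard floor formula)."""
    return (width + 2 * padding - kernel) // stride + 1


# (kernel, stride, padding) along the width axis only.
# ASSUMED layer layout; adjust it to match your own network.
WIDTH_LAYERS = [
    (3, 1, 1),  # conv0
    (2, 2, 0),  # pool0: halves the width
    (3, 1, 1),  # conv1
    (2, 2, 0),  # pool1: halves the width again
    (3, 1, 1),  # conv2
    (3, 1, 1),  # conv3
    (2, 1, 1),  # pool2: stride 1 and padding 1 along the width
    (3, 1, 1),  # conv4
    (3, 1, 1),  # conv5
    (2, 1, 1),  # pool3: stride 1 and padding 1 along the width
    (2, 1, 0),  # conv6: 2x2 kernel, no padding
]


def crnn_sequence_length(image_width):
    """Width of the CNN output, i.e. the length T seen by the RNN."""
    w = image_width
    for kernel, stride, padding in WIDTH_LAYERS:
        w = conv_out(w, kernel, stride, padding)
    return w


if __name__ == "__main__":
    for width in (100, 160, 280):
        print(width, crnn_sequence_length(width))
```

Under these assumptions a 100-pixel-wide input gives T = 26, which matches the limit of 26 reported in this thread.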
@Holmeyoung Hi, the output width of CRNN().cnn is T, and the text length should not exceed T. My question is: if the text length is larger than T, will there be errors, or can we still train the model?
Thank you so much.
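A note on the question above, as a sketch rather than a statement about this repository's code: CTC can only align a label to T frames if there is one frame per character plus one blank frame between each pair of identical neighbouring characters. If the training loop uses `torch.nn.CTCLoss`, a sample that is too long does not raise an error as far as I know; its loss becomes infinite, unless `zero_infinity=True` is set, in which case it is zeroed out. The helpers `min_ctc_frames` and `fits` below are hypothetical names for checking this before training.

```python
def min_ctc_frames(label):
    """Smallest T that CTC can align `label` to.

    One frame per character, plus one blank frame between every pair
    of identical adjacent characters (e.g. the "ll" in "hello").
    """
    repeats = sum(1 for a, b in zip(label, label[1:]) if a == b)
    return len(label) + repeats


def fits(label, T):
    """True if `label` can be aligned to a sequence of T frames."""
    return min_ctc_frames(label) <= T


if __name__ == "__main__":
    T = 26  # sequence length reported in this thread for 100x32 inputs
    for text in ("hello", "a" * 26, "a" * 14, "abcdefghijklmnopqrstuvwxyz"):
        print(repr(text), min_ctc_frames(text), fits(text, T))
```

Filtering or resizing samples for which `fits` returns False is one way to keep such labels from silently contributing an infinite or zeroed loss.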
@Holmeyoung In https://github.com/Holmeyoung/crnn-pytorch/issues/17 you mentioned that your code only supports training with text length <= 26. I found that:
(1) When the images are resized to 100x32, the length of the raw character output is 26, so we cannot train with text length > 26.
(2) When keep_ratio = True, only the height of the image is resized to 32; the width is not fixed and varies between images. So the length of the raw character output is not fixed either and depends on the width of the image, and maybe we can train with any text length.
Conclusion: we can train with any text length when we set keep_ratio = True during training, as long as each image is wide enough that its own T is at least the length of its label.
Thank you so much.
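To make point (2) concrete, here is a rough sketch of how T scales with the original image size when only the height is fixed. It assumes the CNN reduces the width as T = W // 4 + 1, which reproduces the 100 -> 26 figure from point (1) but is not confirmed in this thread. It also assumes the resized width is simply the original width scaled by 32 / height; the repository's actual keep_ratio logic may differ, for example by resizing a whole batch to a common width. All function names are made up for illustration.

```python
def scaled_width(orig_w, orig_h, target_h=32):
    """Width after resizing to `target_h` while keeping the aspect ratio.

    ASSUMPTION: plain proportional scaling, rounded down.
    """
    return max(1, orig_w * target_h // orig_h)


def sequence_length(width):
    """ASSUMED width reduction of the CNN (valid for width >= 4)."""
    return width // 4 + 1


def frames_for_image(orig_w, orig_h, target_h=32):
    """Sequence length T that an image of this size would produce."""
    return sequence_length(scaled_width(orig_w, orig_h, target_h))


if __name__ == "__main__":
    for w, h in [(100, 32), (400, 64), (1200, 48)]:
        print((w, h), frames_for_image(w, h))
```

Under these assumptions a 400x64 image gives T = 51 and a 1200x48 image gives T = 201, so longer text lines get longer sequences, while a narrow image still limits how long its label can be.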