odgiv closed this 4 years ago
Yes, that's right.
A downsampling factor of 4 means the width of the feature map is reduced by a factor of four, because two 2x2 maxpools each halve it.
The 2 is the number of RNN output timesteps discarded at the start, since the first couple of RNN outputs tend to be garbage.
input = (batch, 128, 64, 1)
RNN output = (batch, 32, Class)  # 128 / 4 = 32
CTC input = (batch, 30, Class)  # 32 - 2 = 30
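If it helps, here is a rough sketch of that arithmetic in Python. The helper name ctc_input_length and the batch size of 16 are made up for illustration and are not taken from the repo; only the formula mirrors the input_length line in Image_Generator.py.

```python
import numpy as np

def ctc_input_length(img_w, downsample_factor=4, discarded_steps=2):
    """Number of RNN timesteps passed to the CTC loss for one image.

    Hypothetical helper: the name and default arguments are assumptions.
    """
    # Two 2x2 maxpools shrink the width by 2 * 2 = 4.
    rnn_steps = img_w // downsample_factor
    # The first couple of RNN outputs are dropped before CTC.
    return rnn_steps - discarded_steps

batch_size = 16  # assumed value, for illustration only
# One length per sample, shaped (batch_size, 1).
input_length = np.ones((batch_size, 1)) * ctc_input_length(128)
print(input_length.shape, input_length[0, 0])  # (16, 1) 30.0
```

With img_w = 128 this gives 30 for every sample in the batch, matching the CTC input shape above.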
This site is a good reference, and I personally recommend getting a solid understanding of CTC.
Super thank you
Hi, you have done a great job, by the way. I am trying to understand the implementation of the model, and I have a question about line 55 in Image_Generator.py.
input_length = np.ones((self.batch_size, 1)) * (self.img_w // self.downsample_factor - 2)
Am I right that img_w is downsampled by downsample_factor because of the size and number of maxpooling layers applied? What I also don't get is why you subtract 2 from it afterwards.