Closed ghhong1986 closed 6 years ago
The CNN is applied to the entire image, including the padding. The CTC layer is given the sequence length, so I am not entirely certain whether the gradient for the padded region is included in the loss function. Even if it is, it should not contribute much, because the padded region should not change the loss (due to the sequence length restriction).
I am not sure what you are asking. The CTC layer asks for the sequence length precisely so that it does not use the irrelevant logits that arise from padding.
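To make the role of the sequence length concrete, here is a minimal sketch of a CTC negative log-likelihood in plain NumPy. It is an illustration only, not the TensorFlow op the repository uses; the function name `ctc_neg_log_likelihood` and the toy shapes are assumptions made up for this example. The point it shows is that timesteps beyond `seq_len` are sliced away before the loss is computed, so whatever the CNN produces over the padding cannot change the result.

```python
import numpy as np


def log_softmax(x):
    # Numerically stable log-softmax over the class axis.
    x = x - x.max(axis=-1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=-1, keepdims=True))


def ctc_neg_log_likelihood(logits, labels, seq_len, blank=0):
    """Hypothetical helper: CTC loss for one example.

    logits:  array of shape (T, C), possibly containing padded timesteps
    labels:  non-empty list of class indices (no blanks)
    seq_len: number of valid timesteps, seq_len <= T
    """
    # Padded timesteps are dropped here and never enter the loss.
    log_probs = log_softmax(logits[:seq_len])

    # Extended label sequence: blank, l1, blank, l2, ..., blank.
    ext = [blank]
    for label in labels:
        ext += [label, blank]
    S = len(ext)

    # Forward variables in log space.
    alpha = np.full(S, -np.inf)
    alpha[0] = log_probs[0, ext[0]]
    alpha[1] = log_probs[0, ext[1]]
    for t in range(1, seq_len):
        prev = alpha
        alpha = np.full(S, -np.inf)
        for s in range(S):
            cands = [prev[s]]
            if s >= 1:
                cands.append(prev[s - 1])
            # Skipping a blank is allowed only between different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                cands.append(prev[s - 2])
            alpha[s] = np.logaddexp.reduce(cands) + log_probs[t, ext[s]]

    # Valid paths end in the last label or the trailing blank.
    return -np.logaddexp(alpha[-1], alpha[-2])


rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 5))   # 8 timesteps, only the first 5 are valid
loss_a = ctc_neg_log_likelihood(logits, [1, 2], seq_len=5)

changed = logits.copy()
changed[5:] = 100.0                # overwrite the padded timesteps
loss_b = ctc_neg_log_likelihood(changed, [1, 2], seq_len=5)

print(np.isclose(loss_a, loss_b))  # the padded logits do not affect the loss
```

Because the padded logits never enter the computation, their gradient with respect to this loss is zero as well.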
This calculation captures how the downsizing and padding change the width of the original, unpadded image, so that you know which timesteps are valid in the final 1D sequence. I make the calculation step by step so it is easier to identify (and verify) what the image width is after each layer's transformation.
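A step-by-step width calculation of this kind can be sketched as follows. The layer list below is hypothetical and does not reproduce the repository's actual architecture; the helper names `conv_out_width` and `sequence_length` are likewise made up for illustration. Only the two output-size formulas are standard: `floor((W - k) / s) + 1` for "valid" padding and `ceil(W / s)` for "same" padding.

```python
def conv_out_width(width, kernel, stride=1, padding="valid"):
    """Output width of a convolution or pooling layer along one axis."""
    if padding == "same":
        return -(-width // stride)          # ceil(width / stride)
    return (width - kernel) // stride + 1   # "valid": no implicit padding


# Hypothetical layer stack: (name, kernel width, stride, padding).
LAYERS = [
    ("conv1", 3, 1, "valid"),
    ("pool1", 2, 2, "valid"),
    ("conv2", 3, 1, "same"),
    ("pool2", 2, 1, "valid"),
]


def sequence_length(image_width, layers=LAYERS, verbose=False):
    """Number of valid timesteps produced from an unpadded image width."""
    width = image_width
    for name, kernel, stride, padding in layers:
        width = conv_out_width(width, kernel, stride, padding)
        if verbose:
            print(f"after {name}: width = {width}")
    return max(width, 0)


# Two images in the same padded batch get different valid lengths.
print(sequence_length(100, verbose=True))  # 100 -> 98 -> 49 -> 49 -> 48
print(sequence_length(37))                 # 37 -> 35 -> 17 -> 17 -> 16
```

The resulting value is what would be passed to the CTC layer as the sequence length for each image, computed from the original width rather than the padded batch width.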
Hi weinman, I have read the paper and the code and tried to understand them, but a few questions still confuse me. Please help me.
With bucket_by_sequence_length and the parameter dynamic_pad set to True, every batch has a fixed shape, but different batches may have different shapes, so how does the CNN in the model work?