Closed hxk11111 closed 5 years ago
Hi,
the blank is only used to mark the end of the string returned by the CTC decoder (when that string is shorter than the output of the RNN layers). So it would, for example, return "Hello-----", where only the part before the first blank is relevant.
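To make the padding convention concrete, here is a minimal sketch. It assumes the decoder returns a fixed-length list of label indices padded with the blank label. The function name `trim_at_first_blank`, the character set, and the blank index are hypothetical and chosen for illustration only; they are not taken from the project.

```python
def trim_at_first_blank(labels, blank_idx):
    """Keep only the labels before the first blank (assumed padding label)."""
    trimmed = []
    for label in labels:
        if label == blank_idx:
            break  # everything from the first blank onwards is padding
        trimmed.append(label)
    return trimmed


# Hypothetical character set; the blank is assumed to be the last index.
chars = "Helo"
blank_idx = len(chars)

# "Hello-----" expressed as label indices.
decoded = [0, 1, 2, 2, 3, 4, 4, 4, 4, 4]
text = "".join(chars[i] for i in trim_at_first_blank(decoded, blank_idx))
print(text)  # Hello
```

Note that this only makes sense on the decoder's output. Applied to a raw per-time-step path, where blanks separate characters, it would cut the text off too early.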
P.S.: the output of your greedy decoder should not contain blanks between characters. It seems that it only applies step (1) of greedy decoding: taking the character with the highest score at each position along the x-axis of the image (for more details, see "Best path decoding" in this article).
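For reference, best path (greedy) decoding has a second step after the per-time-step argmax: collapse repeated labels, then drop the blanks. Below is a minimal NumPy sketch, assuming a T x C score matrix with the blank as the last column. The function name, the character set, and the one-hot test matrix are made up for illustration and are not part of the project.

```python
import numpy as np


def best_path_decode(mat, chars, blank_idx=None):
    """Greedy CTC decoding of a T x C score matrix (time-steps x labels)."""
    if blank_idx is None:
        blank_idx = mat.shape[1] - 1  # assumption: blank is the last column

    # Step 1: most likely label at each time-step.
    best = np.argmax(mat, axis=1)

    # Step 2: collapse repeated labels, then remove blanks.
    text = []
    prev = None
    for idx in best:
        if idx != prev and idx != blank_idx:
            text.append(chars[idx])
        prev = idx
    return "".join(text)


# Build a one-hot matrix from the raw path in the question ("-" is the blank).
chars = "x0128"
raw_path = "--x-11-8-1-2-0-8--0-2-2---"
indices = [len(chars) if c == "-" else chars.index(c) for c in raw_path]
mat = np.eye(len(chars) + 1)[indices]

print(best_path_decode(mat, chars))  # x181208022
```

With both steps applied, the raw path "--x-11-8-1-2-0-8--0-2-2---" becomes "x181208022", without blanks between characters.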
Many Thanks
Hi @githubharald, thanks for your project. I have a question about the matrix fed into the TF session. I am training a CRNN+CTC model. For example, take an image that represents the text "x181208022". Before the CTC layer I have the RNN output; if I use greedy decoding on it, I get "--x-11-8-1-2-0-8--0-2-2---", where "-" represents the CTC blank. If I want to use your project, should I just feed the RNN output matrix into the word beam search part? I ask because I saw your testing code:
The for loop breaks when it meets a CTC blank. But in my case a CTC blank is not the end of a word, so breaking there would give the wrong result.