Closed miqbal23 closed 4 years ago
The neural network has output neurons for the characters "a", "b", … and, when CTC loss is used, an additional neuron for the blank. The order of the neurons matters: you can't train the network to predict "a" at neuron 0 and then suddenly interpret that neuron's output as the blank. That is what happened in your experiment, which is why the results don't make sense. Usually the deep learning framework defines the index of the blank neuron: TF 1 put it at the last index (which is why this repo uses that convention), while TF 2 changed the default to 0.
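To illustrate why the blank index must match the training convention, here is a minimal best-path (greedy) CTC decoding sketch, not the repo's actual beam search. The toy probability matrix, the character set, and the function name are made up for the example; the only point is that `blank_idx` tells the decoder which neuron to treat as blank when collapsing the path.

```python
import numpy as np

def ctc_greedy_decode(mat, blank_idx):
    """Best-path CTC decoding: argmax per time step,
    collapse repeated indices, then drop blanks."""
    best_path = np.argmax(mat, axis=1)
    decoded = []
    prev = None
    for idx in best_path:
        if idx != prev and idx != blank_idx:
            decoded.append(idx)
        prev = idx
    return decoded

# Toy matrix: 4 time steps, 3 neurons.
# Trained with "a" at 0, "b" at 1, blank at the LAST index (TF 1 / this repo).
mat = np.array([
    [0.8, 0.1, 0.1],  # "a"
    [0.8, 0.1, 0.1],  # repeated "a", collapsed
    [0.1, 0.1, 0.8],  # blank separates the characters
    [0.1, 0.8, 0.1],  # "b"
])

chars = "ab"
# Decoding with the matching convention yields "ab":
print("".join(chars[i] for i in ctc_greedy_decode(mat, blank_idx=2)))  # ab
```

If you instead pass `blank_idx=0` to a network trained with the blank at the last index, the decoder drops every "a" and keeps the blank neuron's outputs as if they were characters, so the decoded text (and therefore CER/WER) changes, exactly the mismatch described above.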
Hi, I have a question regarding your beam search implementation.
In your ctcBeamSearch method, you set `blankIdx` equal to the number of classes (in this case, the known letters and symbols), but some other beam-search implementations set it to zero. I tested this with your example, and the results indeed differ, both in the decoded text and in its distance from the ground truth (measured with CER and WER).
So in which cases is `blankIdx` not zero? Which value is correct for beam search decoding?