mpezeshki / CTC-Connectionist-Temporal-Classification

Theano implementation of CTC.
Apache License 2.0

What does the function class_batch_to_labeling_batch(y, y_hat, y_hat_mask=None) mean in ctc_cost.py? #6

Open star013 opened 9 years ago

star013 commented 9 years ago

Hello, I am doing some research on TIMIT and need to use CTC in my model. I read ctc_cost.py but I cannot understand the function class_batch_to_labeling_batch(y, y_hat, y_hat_mask=None). According to the comments, y_hat is a T x B x (C+1) tensor and y_hat_mask is a T x B matrix. Line 65 reads: y_hat = y_hat * y_hat_mask.dimshuffle(0, 'x', 1). This puzzles me, because y_hat_mask.dimshuffle(0, 'x', 1) is a T x 1 x B tensor and cannot be multiplied with y_hat, which is T x B x (C+1). In addition, when I tried to run this function in an IPython notebook, it reported an error. Could you please explain why it is y_hat = y_hat * y_hat_mask.dimshuffle(0, 'x', 1), and what res is in the function? Thanks.

mpezeshki commented 9 years ago

Hi, let's say we have 3 classes (a, b, and c) plus the blank, a batch size of 1, and 2 time steps. The predicted probabilities are then:

time |   a   |   b   |   c   |   blank    |
-------------------------------------------
  0  |  0.2  |  0.4  |  0.3  |    0.1     |
-------------------------------------------
  1  |  0.35 |  0.15 |  0.2  |    0.3     |
-------------------------------------------

Let's suppose the target label sequence is b, a, a, b, c. (Blanks will actually be added too, but let's ignore them at this point.) Then the output of the function must be:

time |   b   |   a   |   a   |   b    |   c     
----------------------------------------------
  0  |  0.4  |  0.2  |  0.2  |  0.4   |  0.3  
----------------------------------------------
  1  |  0.15 |  0.35 |  0.35 |  0.15  |  0.2
----------------------------------------------
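
The lookup in the two tables above can be reproduced with a small NumPy sketch. This is not the repository's Theano code: the variable name picked, the class order (a, b, c, blank) and the T x L x B output layout are assumptions chosen only to match the tables.

```python
import numpy as np

# Probabilities from the first table, shape (T=2, B=1, C+1=4).
# Assumed class order: a, b, c, blank.
y_hat = np.array([
    [[0.2, 0.4, 0.3, 0.1]],    # time 0
    [[0.35, 0.15, 0.2, 0.3]],  # time 1
])

# Target labels b, a, a, b, c as class indices, shape (L=5, B=1).
y = np.array([[1], [0], [0], [1], [2]])

B = y_hat.shape[1]

# For every time step t, label position l and batch item b,
# pick y_hat[t, b, y[l, b]]. The result has shape (T, L, B).
picked = y_hat[:, np.arange(B)[None, :], y]

# Batch item 0 gives the second table: one row per time step,
# one column per target label.
print(picked[:, :, 0])
```

Each column of the printed array is the probability column of the corresponding label, so repeated labels (a and b here) simply repeat their columns.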

So basically, for every time step it picks out the probability of each label in the target sequence, replicating the class probabilities to the length of that sequence. As for the mask, I don't remember why I did it that way, but Kyle now has a newer version of the code which you may find useful: https://gist.github.com/kastnerkyle/ca851e39229551208c0d#file-minibatch_ocr-py-L175
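
The shape concern raised in the question can be checked with NumPy, where indexing with None inserts a size-1 axis much like 'x' does in dimshuffle. The sketch below uses made-up sizes and all-ones data. It only illustrates the broadcasting rule: whether ctc_cost.py actually holds y_hat as T x (C+1) x B at that line is not confirmed in this thread and is treated here as an assumption.

```python
import numpy as np

T, B, C = 2, 3, 4                  # hypothetical sizes, chosen so B != C + 1
y_hat = np.ones((T, B, C + 1))     # T x B x (C+1), as the code comment states
mask = np.ones((T, B))             # T x B
mask[1, 2] = 0.0                   # pretend the last step of item 2 is padding

# Counterpart of dimshuffle(0, 'x', 1): shape T x 1 x B.
mask_t1b = mask[:, None, :]

try:
    y_hat * mask_t1b               # trailing axes are C+1 and B
    mismatch = False
except ValueError:
    mismatch = True                # they differ, so broadcasting fails

# The same T x 1 x B mask does broadcast against a T x (C+1) x B layout.
masked_tcb = y_hat.transpose(0, 2, 1) * mask_t1b

# For a T x B x (C+1) layout the new axis has to go last,
# the counterpart of dimshuffle(0, 1, 'x').
masked_tbc = y_hat * mask[:, :, None]
```

So a T x 1 x B mask is only consistent with a y_hat whose last axis is the batch axis. If y_hat really is T x B x (C+1) at that point, a T x B x 1 mask would be the matching shape.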