Question about eesen training code

srvk / eesen

The official repository of the Eesen project

http://arxiv.org/abs/1507.08240

Apache License 2.0

Question about eesen training code #29

Closed SFuji closed 8 years ago

SFuji commented 8 years ago

Hi,

I have a question about the Eesen code, specifically lines 66 to 75 of net/ctc_loss.cc. When the errors are back-propagated through the softmax layer, the formula I read from the code is ctc_error * yk - Row_mul(yk * ColSum(ctc_error * yk)). But the softmax derivative is yk * (1 - yk). So, as far as I can tell, the code differs from the plain softmax derivative by the ColSum and Row_mul terms. Why is that? Is there something I missed?

Looking forward to your reply!

yajiemiao commented 8 years ago

(attached image: formula)

Here is the derivation of the gradients, going from after the softmax to before the softmax.
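For readers who cannot see the image, the standard softmax chain rule can be sketched as below. The notation is an assumption and is not taken from the image: u_k is the input to the softmax, y_k its output, L the loss, and g_k = ∂L/∂y_k the error arriving at the softmax output (what the question calls ctc_error).

```latex
\begin{align}
y_k &= \frac{\exp(u_k)}{\sum_{m} \exp(u_m)}, \qquad
\frac{\partial y_k}{\partial u_j} = y_k\,(\delta_{kj} - y_j), \\
\frac{\partial L}{\partial u_j}
  &= \sum_{k} g_k \,\frac{\partial y_k}{\partial u_j}
   = \sum_{k} g_k\, y_k\,(\delta_{kj} - y_j)
   = g_j\, y_j \;-\; y_j \sum_{k} g_k\, y_k .
\end{align}
```

The expression y_k (1 - y_k) is only the diagonal entry (j = k) of the softmax Jacobian. The off-diagonal entries -y_k y_j are what produce the summed term, which would correspond to the ColSum and Row_mul steps described in the question.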

yajiemiao commented 8 years ago

Lines 66 to 75 of net/ctc_loss.cc follow Equation 8(c) in the attached image exactly.
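A small numerical check may make this concrete. The sketch below is not the Eesen code: it is a plain Python illustration for a single frame, with hypothetical function names, and it assumes that ctc_error is the gradient of the loss with respect to the softmax output. It compares the formula from the question, err * y - y * sum(err * y), against a product with the full softmax Jacobian, and against the diagonal-only form err * y * (1 - y).

```python
import math


def softmax(u):
    """Numerically stable softmax over one frame of activations."""
    m = max(u)
    exps = [math.exp(x - m) for x in u]
    s = sum(exps)
    return [x / s for x in exps]


def softmax_backward(err, y):
    """Formula from the question: err * y - y * sum(err * y).

    err[k] is assumed to be dL/dy[k]; the result is dL/du.
    """
    weighted = [e * p for e, p in zip(err, y)]
    total = sum(weighted)  # the per-frame sum over labels
    return [w - p * total for w, p in zip(weighted, y)]


def softmax_backward_jacobian(err, y):
    """Same gradient via the full Jacobian dy[k]/du[j] = y[k] * (delta_kj - y[j])."""
    n = len(y)
    grad = []
    for j in range(n):
        g = 0.0
        for k in range(n):
            delta = 1.0 if k == j else 0.0
            g += err[k] * y[k] * (delta - y[j])
        grad.append(g)
    return grad


def diagonal_only(err, y):
    """Keeps only the j == k Jacobian terms, i.e. err * y * (1 - y)."""
    return [e * p * (1.0 - p) for e, p in zip(err, y)]


# Arbitrary illustrative values for one frame with three labels.
u = [0.5, -1.0, 2.0]
err = [0.3, -0.7, 1.1]
y = softmax(u)

fast = softmax_backward(err, y)
full = softmax_backward_jacobian(err, y)
naive = diagonal_only(err, y)
```

Here `fast` and `full` agree to floating-point precision, while `naive` differs because it drops the cross terms between labels. For a whole utterance, the same computation would be applied to each frame independently.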

SFuji commented 8 years ago

I have got the point. Thanks.