Cescfangs opened 3 years ago
@Cescfangs yes
Thanks for the reply. I'm curious how much this mWER tuning helps: is it something like a 5% relative WER reduction?
Also, I'm a little confused about the "mbr" loss: the inputs are not used in the backward function, so how does the gradient flow to the model parameters?
https://github.com/hirofumi0810/neural_sp/blob/2b10b9cc4bdecb5180ecc45575c0ef410fb09aa3/neural_sp/models/seq2seq/decoders/las.py#L535-L548 I don't know much about MBR, but judging from these lines, it looks like an mWER loss and its gradient to me.
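On the gradient-flow question, here is a minimal sketch of how an mWER-style loss over an N-best list can have a closed-form gradient with respect to the hypothesis log-probabilities. This is not the neural_sp code; the function names and the toy numbers are made up for illustration, and it uses plain Python rather than autograd. The general idea, assuming the linked lines follow the usual pattern: a custom backward only needs to return the gradient with respect to its inputs (here the per-hypothesis log-probabilities), which can be computed ahead of time, and the framework's autograd then continues the chain rule from those inputs back to the model parameters.

```python
import math


def nbest_posteriors(log_probs):
    """Renormalize hypothesis log-probabilities over the N-best list (softmax)."""
    m = max(log_probs)  # subtract the max for numerical stability
    exps = [math.exp(lp - m) for lp in log_probs]
    z = sum(exps)
    return [e / z for e in exps]


def expected_wer_loss(log_probs, word_errors):
    """Expected number of word errors under the renormalized N-best posterior."""
    post = nbest_posteriors(log_probs)
    return sum(p * w for p, w in zip(post, word_errors))


def expected_wer_grad(log_probs, word_errors):
    """Analytic gradient of the loss w.r.t. each hypothesis log-probability.

    d loss / d log_probs[i] = post[i] * (word_errors[i] - expected_errors)
    """
    post = nbest_posteriors(log_probs)
    expected_errors = sum(p * w for p, w in zip(post, word_errors))
    return [p * (w - expected_errors) for p, w in zip(post, word_errors)]


def numerical_grad(log_probs, word_errors, eps=1e-6):
    """Central finite differences, used only to sanity-check the analytic form."""
    grads = []
    for i in range(len(log_probs)):
        plus = list(log_probs)
        minus = list(log_probs)
        plus[i] += eps
        minus[i] -= eps
        diff = expected_wer_loss(plus, word_errors) - expected_wer_loss(minus, word_errors)
        grads.append(diff / (2 * eps))
    return grads


# Toy 3-best list: hypothetical log-probabilities and word-error counts.
toy_log_probs = [-1.0, -2.0, -3.5]
toy_word_errors = [0.0, 2.0, 5.0]

analytic = expected_wer_grad(toy_log_probs, toy_word_errors)
numeric = numerical_grad(toy_log_probs, toy_word_errors)
```

Hypotheses with fewer errors than the expectation get a negative gradient and those with more get a positive one, so a gradient-descent step raises the log-probability of the better hypotheses. The gradient entries sum to zero because the posterior is renormalized over the N-best list.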