How would include-log-softmax affect chain model performance

housebaby commented 3 years ago

Hello @danpovey, it is observed from the chain model output that the "output probability" is rather flat compared to the sharp peaks of CTC. I wonder how include-log-softmax=true would affect model performance, and why it is not used in the default config. If we add softmax, is it possible to improve the decoding speed, since the output probabilities would be more discriminative and more unlikely paths would be pruned? Thank you.

danpovey commented 3 years ago

It is observed from the chain model output that the "output probability" is rather flat compared to the sharp peaks of CTC.

Yes, this is expected...

I wonder how include-log-softmax would affect model performance, and why it is not used in the default config.

This wouldn't make a difference as the objective function is invariant to it. Dan
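
The invariance mentioned above can be illustrated with a small sketch. It assumes a simplified sequence-level objective of the form log(sum over numerator paths) minus log(sum over denominator paths), with a toy denominator that allows every path and has no transition weights; the real chain denominator is a weighted graph, so this is only an illustration of the cancellation, not Kaldi's implementation. All names here (`sequence_objective`, `num_paths`, `den_paths`, the toy scores) are hypothetical. Log-softmax subtracts one constant per frame from every output, and every path, in the numerator and the denominator alike, picks up the same sum of those constants, so the difference does not change.

```python
import itertools
import math


def logsumexp(xs):
    """Numerically stable log(sum(exp(x))) over a list of floats."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def log_softmax(frame):
    """Subtract the per-frame log-normalizer from every output of one frame."""
    z = logsumexp(frame)
    return [x - z for x in frame]


def path_score(outputs, path):
    """Sum of the per-frame outputs along one path (one unit index per frame)."""
    return sum(outputs[t][s] for t, s in enumerate(path))


def sequence_objective(outputs, num_paths, den_paths):
    """Toy sequence objective: log-sum over numerator minus log-sum over denominator."""
    num = logsumexp([path_score(outputs, p) for p in num_paths])
    den = logsumexp([path_score(outputs, p) for p in den_paths])
    return num - den


# Toy network outputs: 3 frames, 3 output units (made-up numbers).
raw = [[2.0, -1.0, 0.5], [0.3, 1.7, -0.2], [-0.5, 0.1, 2.2]]

# Assumption: the denominator allows every path, the numerator only a few.
den_paths = list(itertools.product(range(3), repeat=3))
num_paths = [(0, 1, 2), (0, 0, 2), (0, 2, 2)]

normalized = [log_softmax(frame) for frame in raw]

obj_raw = sequence_objective(raw, num_paths, den_paths)
obj_norm = sequence_objective(normalized, num_paths, den_paths)

print("objective without log-softmax:", obj_raw)
print("objective with log-softmax:   ", obj_norm)
print("difference:", abs(obj_raw - obj_norm))
```

For the same reason, the score differences between outputs within a frame are unchanged by the normalization, so a beam that prunes relative to the best hypothesis on each frame should see the same gaps with or without it.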


housebaby commented 3 years ago

This wouldn't make a difference as the objective function is invariant to it. Dan

If include-log-softmax=true won't make any difference, then which part can be tuned to make the probabilities more discriminative in a chain model? Thank you very much.

danpovey commented 3 years ago

"Discriminative" is not well defined here; it depends on what your purpose is.

On Mon, Nov 2, 2020 at 12:12 PM housebaby notifications@github.com wrote:

This wouldn't make a difference as the objective function is invariant to it. Dan

If include-log-softmax=true won't make any difference, then which part can be tuned to make the probabilities more discriminative in a chain model? Thank you very much.


housebaby commented 3 years ago

Sorry for the ambiguity. I mean: is it possible to make the output probabilities of a chain model sharper and less flat (like CTC), so that more unlikely paths are pruned during decoding, thus making decoding faster? I wonder whether it is reasonable to think of it this way.


danpovey commented 3 years ago

It would be necessary to train with CTC as part of the objective. Not easy with current code, partly because of the interaction with context-dependency which makes it a little non-trivial to do it in a sensible way.

Dan


kkm000 commented 3 years ago

I am closing this issue. As it stands, it's not easily doable.

kaldi-asr / kaldi

How would include-log-softmax affect chain model performance #4316