mravanelli / pytorch-kaldi

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.

online speech recognition #56

Closed Johe-cqu closed 5 years ago

Johe-cqu commented 5 years ago

Hi, I have completed the training of the acoustic model. How can I use it for online speech recognition? I look forward to your kind advice.

Rpersie commented 5 years ago

The trained model is an offline model, so it cannot be used directly for online speech recognition. The model would need to be trained with online CMN or something similar, and then decoded with the Kaldi online decoder. There is some related discussion in the Kaldi issues: https://github.com/kaldi-asr/kaldi/issues/2801

mravanelli commented 5 years ago

The current version natively supports off-line speech recognition only. As outlined by @Rpersie, on-line ASR requires different normalization and decoding strategies. We plan to support on-line ASR in a future version of the toolkit.

Johe-cqu commented 5 years ago

Thank you! Looking forward to your update!

Johe-cqu commented 5 years ago

@Rpersie thank you!

Johe-cqu commented 5 years ago

@mravanelli @Rpersie By the way, why does on-line ASR require different normalization and decoding strategies? Is it because it needs faster decoding?

Rpersie commented 5 years ago

Offline CMVN uses the mean and variance computed over all the frames, but in online mode you do not have all the frames yet. If you only switch to online CMVN at test time, you introduce a mismatch between training and testing. As for decoding, in online mode the features arrive segment by segment, so the decoder has to remember the WFST token states from the previous segments and continue from them when the next segment is loaded. The usual offline decoder cannot do this, so an online decoder is needed.
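
To make the normalization mismatch concrete, here is a minimal pure-Python sketch. It is not the Kaldi or pytorch-kaldi implementation: `offline_cmvn` and `RunningCmvn` are hypothetical names, and the running-statistics update is just one simple way to approximate online CMVN. The offline version needs every frame before it can normalize anything, while the running version only uses the frames seen so far, so early frames come out differently.

```python
import math

EPS = 1e-8  # small constant to avoid division by zero (assumed value)


def offline_cmvn(frames):
    """Normalize with the mean/variance of ALL frames (needs the full signal)."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    var = [sum((f[d] - mean[d]) ** 2 for f in frames) / n for d in range(dim)]
    return [
        [(f[d] - mean[d]) / math.sqrt(var[d] + EPS) for d in range(dim)]
        for f in frames
    ]


class RunningCmvn:
    """Normalize each frame with statistics of the frames seen so far.

    Hypothetical helper for illustration; uses Welford's running update.
    """

    def __init__(self, dim):
        self.count = 0
        self.mean = [0.0] * dim
        self.m2 = [0.0] * dim  # running sum of squared deviations

    def push(self, frame):
        self.count += 1
        out = []
        for d, x in enumerate(frame):
            delta = x - self.mean[d]
            self.mean[d] += delta / self.count
            self.m2[d] += delta * (x - self.mean[d])
            var = self.m2[d] / self.count
            out.append((x - self.mean[d]) / math.sqrt(var + EPS))
        return out


# Toy 2-dimensional "features"; real systems would use MFCC/FBANK frames.
frames = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
offline = offline_cmvn(frames)
cmvn = RunningCmvn(dim=2)
online = [cmvn.push(f) for f in frames]
```

In this toy run only the last frame is normalized identically by both versions, because only then do the running statistics equal the global ones. A model trained on `offline`-style features and tested on `online`-style features sees a different input distribution, which is the mismatch described above.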

mravanelli commented 5 years ago

Yes, as outlined by @Rpersie, the systems are different. In online ASR we start decoding while the signal is still being recorded, so we don't have access to the future. We will deal with it in the future!
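
As a rough illustration of why the decoder has to keep its state between segments, here is a toy sketch. It is a plain Viterbi search over a small dense HMM, not a WFST decoder and not Kaldi's online decoder; `StreamingViterbi` and all the scores are made up for the example. The point is only that the per-state token scores and backpointers survive between calls, so decoding chunk by chunk gives the same path as decoding everything at once.

```python
class StreamingViterbi:
    """Toy Viterbi decoder that keeps its token scores between chunks."""

    def __init__(self, log_init, log_trans):
        self.log_init = log_init    # log_init[j]: score of starting in state j
        self.log_trans = log_trans  # log_trans[i][j]: score of moving i -> j
        self.scores = None          # best score per state after the last frame
        self.backptrs = []          # one backpointer list per frame after the first

    def accept_chunk(self, log_obs_chunk):
        """Consume a chunk of frames; each frame is a list of per-state log scores."""
        for log_obs in log_obs_chunk:
            n = len(log_obs)
            if self.scores is None:
                # Very first frame of the stream: start from the initial scores.
                self.scores = [self.log_init[j] + log_obs[j] for j in range(n)]
                continue
            new_scores, ptrs = [], []
            for j in range(n):
                # Best predecessor for state j, using the scores carried over
                # from the previous frame (possibly from a previous chunk).
                best_i = max(
                    range(n), key=lambda i: self.scores[i] + self.log_trans[i][j]
                )
                new_scores.append(
                    self.scores[best_i] + self.log_trans[best_i][j] + log_obs[j]
                )
                ptrs.append(best_i)
            self.scores = new_scores
            self.backptrs.append(ptrs)

    def best_path(self):
        """Trace back the best state sequence for all frames seen so far."""
        state = max(range(len(self.scores)), key=lambda i: self.scores[i])
        path = [state]
        for ptrs in reversed(self.backptrs):
            state = ptrs[state]
            path.append(state)
        return path[::-1]


# Made-up log scores for a 2-state model and 5 frames.
log_init = [-0.5, -0.9]
log_trans = [[-0.4, -1.2], [-0.9, -0.5]]
log_obs = [[-0.1, -1.6], [-0.2, -1.2], [-2.3, -0.4], [-1.6, -0.1], [-0.5, -0.9]]

decoder = StreamingViterbi(log_init, log_trans)
decoder.accept_chunk(log_obs[:2])  # first segment arrives
decoder.accept_chunk(log_obs[2:])  # next segment continues from the kept state
path = decoder.best_path()
```

A decoder that restarted from scratch on every segment would lose the scores from the earlier frames. A real online decoder also has to handle pruning, partial results, and endpointing, which this sketch ignores.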
