Hi. I'm running this project for Korean speech recognition, but the loss is not decreasing and I don't get good predictions. I've already used a preprocessing method that works well on DeepSpeech and LAS.

The model looks similar to DeepSpeech-style architectures, but it isn't the same: in the paper, the acoustic model uses an HMM pre-built with Kaldi processing and is trained with LF-MMI instead of CTC.
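For context, my current preprocessing is roughly equivalent to the sketch below (a minimal example assuming torchaudio; the file path, mel-bin counts, and frame parameters are placeholders and would need to match whatever conf files the Kaldi recipe actually used). My suspicion is that LF-MMI training needs features that exactly match the ones used when the Kaldi HMM and alignments were built, so a DeepSpeech/LAS-style pipeline may simply be mismatched:

```python
# Minimal sketch (torchaudio assumed); all parameters are placeholders and
# must match the conf/*.conf files used when the Kaldi HMM was built.
import torch
import torchaudio

wav, sr = torchaudio.load("example.wav")  # hypothetical path

# DeepSpeech/LAS-style log-mel features (what I currently feed the model):
mel = torchaudio.transforms.MelSpectrogram(
    sample_rate=sr, n_fft=400, hop_length=160, n_mels=80
)(wav)
las_feats = torch.log(mel + 1e-6).squeeze(0).transpose(0, 1)  # (frames, 80)

# Kaldi-compatible fbank features (what an LF-MMI recipe would expect,
# assuming it used 40-bin fbanks; check the recipe's fbank.conf):
kaldi_feats = torchaudio.compliance.kaldi.fbank(
    wav, sample_frequency=sr, num_mel_bins=40,
    frame_length=25.0, frame_shift=10.0, dither=0.0
)  # (frames, 40)

print(las_feats.shape, kaldi_feats.shape)
```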
https://www.danielpovey.com/files/2020_interspeech_multistream.pdf
The paper above, which this project references, uses single-stream layers before the multi-stream part. Does anyone know what the problem might be, or has anyone gotten good performance with this project?