pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.
How can word-level instead of phoneme-level speech recognition be done with the TIMIT dataset?
I have built and trained models, but I only have phoneme transcriptions. I want word transcriptions of the audio files. Could you help me?