Closed himanshk96 closed 5 years ago
Is your input file 16 kHz mono? You can use soxi to display the audio format.
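If soxi is not at hand, the same check can be sketched with Python's standard `wave` module. This is only a minimal sketch: the function names `describe_wav` and `is_16k_mono_16bit` are made up for illustration, and it assumes the file is an uncompressed PCM WAV (the `wave` module cannot read other encodings).

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels, bits_per_sample) for a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        # getsampwidth() is in bytes per sample, so multiply by 8 for bits.
        return wf.getframerate(), wf.getnchannels(), wf.getsampwidth() * 8


def is_16k_mono_16bit(path):
    """True if the file is 16 kHz, mono, 16-bit, which is what the models expect."""
    return describe_wav(path) == (16000, 1, 16)
```

Calling `describe_wav("my_recording.wav")` on a file that does not match shows which of the three properties is off.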
I converted my file to 16000 Hz. Thank you so much, it works like a charm. I didn't read anywhere that it requires a WAV file with a 16 kHz sample rate. Anyway, it's solved.
I am working in the academic domain. Are there any guides for adding vocabulary to a pretrained model instead of retraining the whole thing?
I don't know such guides, but this section might be helpful: http://kaldi-asr.org/doc/online_decoding.html#online_decoding_nnet2_vocab .
But you will probably need some files that led to the pretrained model and are not contained in the pretrained model. (?)
You can follow the model adaptation section in our README, which does allow for adaptation to a custom dict:
Thanks for the pointer.
@gooofy What happens if a new word in the adaptation lexicon is not in the language model (lm.arpa) that is reused? I am asking because I cannot get new words to be recognized. (Should I put this discussion into a new issue?)
I guess you will have to rebuild the language model first in that case - should be a fairly quick process. You can use either srilm or kenlm for that task.
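To see which new words actually need the language model rebuilt, one option is to compare them against the unigram section of the ARPA file. The following is a rough sketch, not part of the adaptation tooling: the helper names `arpa_unigrams` and `missing_from_lm` are hypothetical, and it assumes a plain-text (uncompressed) ARPA file such as lm.arpa.

```python
def arpa_unigrams(arpa_path):
    """Collect the words listed in the \\1-grams: section of an ARPA file."""
    words = set()
    in_unigrams = False
    with open(arpa_path, encoding="utf-8") as f:
        for raw in f:
            line = raw.strip()
            if line == "\\1-grams:":
                in_unigrams = True
                continue
            if not in_unigrams or not line:
                continue
            if line.startswith("\\"):
                # Reached the next section (\2-grams: or \end\).
                break
            # Each entry is: log10-probability, word, optional backoff weight.
            fields = line.split()
            if len(fields) >= 2:
                words.add(fields[1])
    return words


def missing_from_lm(new_words, arpa_path):
    """Return the words (in input order) that have no unigram in the LM."""
    vocab = arpa_unigrams(arpa_path)
    return [w for w in new_words if w not in vocab]
```

If `missing_from_lm(["kaldi", "hello"], "lm.arpa")` returns a non-empty list, those words are absent from the reused language model, which would fit the symptom of new lexicon entries never being recognized.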
I used kenlm and it worked. Thanks!
cool - thanks for the feedback! :)
Hello. Sorry to comment on a closed ticket, but I have the same problem, only with microphone input. Do I have to somehow change its input frequency?
The models expect 16-bit, 16 kHz mono audio input. If your recording setup produces anything other than that, you will have to either change your setup or use a converter.
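As one possible converter, a recording can be resampled with the sox command-line tool (the package that also ships soxi). The sketch below only wraps that call from Python. It assumes sox is installed and on the PATH, and the function names `sox_16k_mono_cmd` and `convert_to_16k_mono` are made up for this example.

```python
import subprocess


def sox_16k_mono_cmd(src, dst):
    """Build a sox command that writes dst as 16 kHz, mono, 16-bit audio."""
    # Output format options go between the input and the output file name:
    # -r sample rate, -c channel count, -b bits per sample.
    return ["sox", src, "-r", "16000", "-c", "1", "-b", "16", dst]


def convert_to_16k_mono(src, dst):
    """Run the conversion; raises CalledProcessError if sox fails."""
    subprocess.run(sox_16k_mono_cmd(src, dst), check=True)
```

For example, `convert_to_16k_mono("recording.wav", "recording_16k.wav")` produces a file in the format the models expect, provided sox can read the input.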
I have been following the GitHub Quickstart link, which converts 4 demo WAV files to text. It works fine, but when I use my own WAV file it throws the error below:
Traceback (most recent call last):
  File "kaldi_decode_wav.py", line 72, in <module>
    if decoder.decode_wav_file(wavfile):
  File "kaldiasr/nnet3.pyx", line 207, in kaldiasr.nnet3.KaldiNNet3OnlineDecoder.decode_wav_file (kaldiasr/nnet3.cpp:4726)
  File "kaldiasr/nnet3.pyx", line 170, in kaldiasr.nnet3.KaldiNNet3OnlineDecoder.decode (kaldiasr/nnet3.cpp:3968)
RuntimeError
The file I am using is a Vimeo video converted to WAV using youtube-dl. I got the WAV file using this command:
youtube-dl --extract-audio --audio-format wav https://vimeo.com/73643788
I give this file as input to kaldi_decode_wav.py.
Can anyone help me figure out what I am doing wrong?