daanzu / kaldi-active-grammar

Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time
GNU Affero General Public License v3.0

Speech2Text, can't seem to make it work. #74

Closed fastandcuriousX closed 1 year ago

fastandcuriousX commented 1 year ago

Hello everyone, I've recently been playing around with Kaldi.

I'm trying to output text from an audio file, and I'm using the following for my test.

I've downloaded:
Release: https://github.com/daanzu/kaldi-active-grammar/releases/tag/v1.8.0 (kaldi-dragonfly-winpython37.zip)
Audio: https://github.com/daanzu/kaldi-active-grammar/blob/master/examples/test.wav
Script: https://github.com/daanzu/kaldi-active-grammar/blob/master/examples/plain_dictation.py

It's all working fine, but once I replace the contents of the "kaldi_model" folder with those of kaldi_model_daanzu_*-biglm.zip, it stops working.

I've also tried downloading kaldi_active_grammar-3.0.0-py2.py3-none-win_amd64.whl and kaldi_model_daanzu_20211030-biglm.zip from the v3.0.0 release and extracting them into the same folder, but it produces the same error as before. Here's my folder structure:

Directory of D:\Kaldi

DIR - kaldi_active_grammar
DIR - kaldi_active_grammar-3.0.0.dist-info
DIR - kaldi_model
test.py (https://github.com/daanzu/kaldi-active-grammar/blob/master/examples/plain_dictation.py)
test.wav (https://github.com/daanzu/kaldi-active-grammar/blob/master/examples/test.wav)

Here's the error I'm facing:

Kaldi-Active-Grammar v3.0.0:
    If this free, open source engine is valuable to you, please consider donating
    https://github.com/daanzu/kaldi-active-grammar
    Disable message by calling kaldi_active_grammar.disable_donation_message()
-win]:fst::ActiveGrammarFstPreparer::Prepare():decoder\active-grammar-fst.cc:631) Added 2 new states while preparing for grammar FST.
-win]:dragonfly::AgfCompiler::CompileGrammar():dragonfly\compile-graph-agf.hh:320) Returning graph with 10 states
[KALDI severity=-1] Trying to survive fatal exception: bad allocation
Traceback (most recent call last):
  File "D:\Kaldi\test.py", line 5, in <module>
    recognizer = PlainDictationRecognizer() # Or supply non-default model_dir, tmp_dir, or fst_file
  File "D:\Kaldi\kaldi_active_grammar\plain_dictation.py", line 44, in __init__
    top_fst=top_fst_rule.fst_wrapper, dictation_fst_file=dictation_fst_file, **kwargs)
  File "D:\Kaldi\kaldi_active_grammar\wrapper.py", line 438, in __init__
    if not self._model: raise KaldiError("failed nnet3_agf__construct")
kaldi_active_grammar.KaldiError: failed nnet3_agf__construct

Would appreciate any guidance to make this work. Thanks :)

fastandcuriousX commented 1 year ago

I was tinkering with it and I'm not sure what I did, but everything seems to work fine now. Thank you :)
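
The thread does not record what actually fixed the error. For anyone who lands here with the same "bad allocation" message: that text usually means a memory allocation failed in the native code, so one plausible (unconfirmed) cause is that the much larger biglm model did not fit in the memory available to the process. The sketch below uses only the Python standard library to print two things worth checking first: whether the interpreter is 32-bit or 64-bit, and how large the model folder is on disk. The helper names (`interpreter_bits`, `directory_size_bytes`, `report`) and the default `kaldi_model` path are assumptions made for illustration; none of this is part of kaldi-active-grammar.

```python
import os
import struct


def interpreter_bits():
    """Return the pointer width of the running Python interpreter (32 or 64)."""
    return struct.calcsize("P") * 8


def directory_size_bytes(path):
    """Sum the sizes of all regular files under `path` (0 if it does not exist)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total


def report(model_dir="kaldi_model"):
    """Build a short, human-readable diagnostic for a model folder (hypothetical helper)."""
    bits = interpreter_bits()
    size_mb = directory_size_bytes(model_dir) / (1024 * 1024)
    lines = [
        "Python interpreter: {}-bit".format(bits),
        "Model folder {!r}: {:.1f} MiB on disk".format(model_dir, size_mb),
    ]
    if bits == 32:
        # Assumption: a 32-bit process has too little address space for a large model.
        lines.append("A 32-bit interpreter may be unable to load a large model.")
    return lines


if __name__ == "__main__":
    for line in report():
        print(line)
```

Run it from the folder that contains `kaldi_model` (D:\Kaldi in the layout above). If the interpreter is 64-bit and the machine has plenty of free RAM, the cause is probably something else, such as an incomplete extraction or a mix of files from different releases.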