alphacep / vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Apache License 2.0

No recognition from microphone #1332

Open lucifyq opened 1 year ago

lucifyq commented 1 year ago

The microphone is on and everything is being recorded, but I get this log:

LOG (VoskAPI:ReadDataFiles():model.cc:213) Decoding params beam=10 max-active=3000 lattice-beam=2
LOG (VoskAPI:ReadDataFiles():model.cc:216) Silence phones 1:2:3:4:5:6:7:8:9:10
LOG (VoskAPI:RemoveOrphanNodes():nnet-nnet.cc:948) Removed 1 orphan nodes.
LOG (VoskAPI:RemoveOrphanComponents():nnet-nnet.cc:847) Removing 2 orphan components.
LOG (VoskAPI:Collapse():nnet-utils.cc:1488) Added 1 components, removed 2
LOG (VoskAPI:ReadDataFiles():model.cc:248) Loading i-vector extractor from models/vosk-model-small-en-0.4/ivector/final.ie
LOG (VoskAPI:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG (VoskAPI:ComputeDerivedVars():ivector-extractor.cc:204) Done.
LOG (VoskAPI:ReadDataFiles():model.cc:282) Loading HCL and G from models/vosk-model-small-en-0.4/graph/HCLr.fst models/vosk-model-small-en-0.4/graph/Gr.fst
LOG (VoskAPI:ReadDataFiles():model.cc:308) Loading winfo models/vosk-model-small-en-0.4/graph/phones/word_boundary.int

nshmyrev commented 1 year ago

The messages look OK. You can dump the audio with the -f option and listen to it. Most likely the microphone is muted.

bfagundez commented 1 year ago

Having the same issue. I tried different devices, unmuted them, etc. It does not recognize a single thing.

LOG (VoskAPI:ReadDataFiles():model.cc:213) Decoding params beam=13 max-active=7000 lattice-beam=8
LOG (VoskAPI:ReadDataFiles():model.cc:216) Silence phones 1:2:3:4:5:6:7:8:9:10
LOG (VoskAPI:RemoveOrphanNodes():nnet-nnet.cc:948) Removed 0 orphan nodes.
LOG (VoskAPI:RemoveOrphanComponents():nnet-nnet.cc:847) Removing 0 orphan components.
LOG (VoskAPI:ReadDataFiles():model.cc:248) Loading i-vector extractor from /Users/geekymartian/.cache/vosk/vosk-model-en-us-0.42-gigaspeech/ivector/final.ie
LOG (VoskAPI:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG (VoskAPI:ComputeDerivedVars():ivector-extractor.cc:204) Done.
LOG (VoskAPI:ReadDataFiles():model.cc:279) Loading HCLG from /Users/geekymartian/.cache/vosk/vosk-model-en-us-0.42-gigaspeech/graph/HCLG.fst
LOG (VoskAPI:ReadDataFiles():model.cc:294) Loading words from /Users/geekymartian/.cache/vosk/vosk-model-en-us-0.42-gigaspeech/graph/words.txt
LOG (VoskAPI:ReadDataFiles():model.cc:303) Loading winfo /Users/geekymartian/.cache/vosk/vosk-model-en-us-0.42-gigaspeech/graph/phones/word_boundary.int
LOG (VoskAPI:ReadDataFiles():model.cc:310) Loading subtract G.fst model from /Users/geekymartian/.cache/vosk/vosk-model-en-us-0.42-gigaspeech/rescore/G.fst
LOG (VoskAPI:ReadDataFiles():model.cc:312) Loading CARPA model from /Users/geekymartian/.cache/vosk/vosk-model-en-us-0.42-gigaspeech/rescore/G.carpa
LOG (VoskAPI:ReadDataFiles():model.cc:318) Loading RNNLM model from /Users/geekymartian/.cache/vosk/vosk-model-en-us-0.42-gigaspeech/rnnlm/final.raw
################################################################################
Press Ctrl+C to stop the recording
################################################################################
{
  "partial" : ""
}

nshmyrev commented 1 year ago

You also need to dump the audio with the -f option.

bfagundez commented 1 year ago

If I dump to an mp4 file using -f hello.mp4, the resulting file is corrupt.

nshmyrev commented 1 year ago

It is raw format, not mp4. You need Audacity to listen to it. Or you can share it here.
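For anyone who would rather open the dump in an ordinary player, here is a minimal sketch that wraps the headerless PCM bytes in a WAV container using Python's standard `wave` module. The function name `raw_to_wav` is made up for illustration, and the defaults (16000 Hz, mono, 16-bit samples) are assumptions; the sample rate in particular must match whatever rate the recording was actually made at, otherwise playback will sound too fast or too slow.

```python
import wave


def raw_to_wav(raw_path, wav_path, sample_rate=16000, channels=1, sample_width=2):
    """Copy headerless PCM bytes into a WAV file and return the frame count.

    sample_rate, channels and sample_width are assumptions about the dump;
    adjust them to the settings that were used for the recording.
    """
    with open(raw_path, "rb") as src:
        pcm = src.read()

    with wave.open(wav_path, "wb") as dst:
        dst.setnchannels(channels)      # 1 = mono
        dst.setsampwidth(sample_width)  # bytes per sample, 2 = 16-bit
        dst.setframerate(sample_rate)   # frames per second
        dst.writeframes(pcm)            # the header is finalized on close

    return len(pcm) // (channels * sample_width)
```

Usage would look like `raw_to_wav("hello.raw", "hello.wav")`, where both file names are placeholders.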

bfagundez commented 1 year ago

Oh, my bad. Here it is in Audacity: [screenshot: 123 2023-04-11 16-26-04]

And here it is as mp4 (I renamed it because GitHub won't let me upload something without an extension): https://user-images.githubusercontent.com/559456/231310010-7d72eb7f-3f6b-4a35-b962-0ba5c7c97621.mp4

nshmyrev commented 1 year ago

So it is plain silence. You need to unmute the microphone or pick another logical device for recording (see -l option of test_microphone.py).
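A dump can also be checked for silence without listening to it. The sketch below computes the peak and RMS level of the raw bytes, assuming signed 16-bit little-endian mono samples. The helper names `pcm_levels` and `looks_silent` and the threshold of 100 are illustrative choices, not part of Vosk or test_microphone.py.

```python
import array
import math
import sys


def pcm_levels(pcm_bytes):
    """Return (peak, rms) for signed 16-bit little-endian PCM bytes (assumed format)."""
    samples = array.array("h")
    # Drop a trailing odd byte, if any, so the buffer holds whole samples.
    samples.frombytes(pcm_bytes[: len(pcm_bytes) // 2 * 2])
    if sys.byteorder == "big":
        samples.byteswap()  # the data is assumed little-endian
    if not samples:
        return 0, 0.0
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return peak, rms


def looks_silent(pcm_bytes, peak_threshold=100):
    """True if no sample reaches peak_threshold (an arbitrary, assumed cutoff)."""
    peak, _ = pcm_levels(pcm_bytes)
    return peak < peak_threshold
```

If the peak stays at or near zero while you are speaking, the capture device is delivering no signal: it is muted, the wrong device is selected, or the application has no access to it.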

bfagundez commented 1 year ago

I'm pretty sure the microphone is unmuted. I can make recordings with QuickTime without problems, and I can see feedback when I go to the sound settings: [screenshot: Sound 2023-04-12 11-05-02]

What library is used to capture the sound? Maybe I need to isolate that first.

nshmyrev commented 1 year ago

Run test_microphone.py -l and provide the list of devices it prints.

Also, while test_microphone.py is running, check the settings. Maybe the app has no permission to access the microphone.

bfagundez commented 1 year ago

OK, it worked in Terminal because it asked for permission to use the microphone. wezterm never asked for permission. Thanks for your help @nshmyrev, I believe this can be closed.

bfagundez commented 1 year ago

wezterm had an issue for this (now closed): https://github.com/wez/wezterm/issues/3359