trappedinspacetime opened this issue 1 year ago (status: Open)
It sounds like there are some potential bugs in command.cpp. I will go check it out.
@bobqianic Thank you for responding. Could it be related to the VAD module?
Any progress?
@trappedinspacetime
You can try adjusting the VAD-related parameters:
-vth N, --vad-thold N [0.60 ] voice activity detection threshold
-fth N, --freq-thold N [100.00 ] high-pass frequency cutoff
The default values are probably not suitable for your setup.
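To make it easier to reason about what -vth changes, here is a minimal sketch of an energy-ratio voice activity check. It is a simplified illustration and not the actual whisper.cpp code: the names mean_abs and speech_ended_sketch are hypothetical, the trailing window length is a parameter chosen by the caller, and the high-pass filter controlled by -fth is left out. The idea is that the end of an utterance is assumed when the last part of the buffer is quiet relative to the whole buffer.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a whisper.cpp function):
// mean absolute amplitude of samples in the range [begin, end).
static float mean_abs(const std::vector<float> & samples, size_t begin, size_t end) {
    if (end <= begin) {
        return 0.0f;
    }
    float sum = 0.0f;
    for (size_t i = begin; i < end; ++i) {
        sum += std::fabs(samples[i]);
    }
    return sum / static_cast<float>(end - begin);
}

// Simplified energy-ratio check (an assumption for illustration only).
// Returns true when the last `last_ms` milliseconds are quiet relative to
// the whole buffer, i.e. the speaker appears to have stopped talking.
bool speech_ended_sketch(const std::vector<float> & samples,
                         int sample_rate, int last_ms, float vad_thold) {
    const size_t n      = samples.size();
    const size_t n_last = static_cast<size_t>(sample_rate) * last_ms / 1000;

    if (n_last == 0 || n_last >= n) {
        return false; // not enough audio to compare the two windows
    }

    const float energy_all  = mean_abs(samples, 0, n);
    const float energy_last = mean_abs(samples, n - n_last, n);

    // The tail counts as "still speaking" only if it is louder than
    // vad_thold times the average level of the whole buffer.
    return !(energy_last > vad_thold * energy_all);
}
```

Under this sketch, a buffer that contains only zeros passes the check for every threshold, because the comparison becomes 0 > 0. If the real check behaves similarly, a microphone that delivers silence would trigger processing no matter which -vth value is used, so it may be worth confirming that audio is actually being captured before tuning the thresholds.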
@ggerganov thank you for responding.
I tried -vth values such as 0.4, 0.8, 0.9, 1.4, and 1.7, but nothing changed.
@trappedinspacetime I have the same problem. This is what I get:
process_general_transcription: Speech detected! Processing ...
process_general_transcription: Heard 'you', (t = 2132 ms, p = 72.20%)
process_general_transcription: WARNING: prompt not recognized, try again
Have you solved this problem?
Unfortunately, no. I hope somebody finds a solution.
This happened to me when I used my Mac with an external monitor. When the Mac is closed, the mic doesn't pick up any sound (I guess), and because of that (I guess) the program hears these strange "you" words. Try using a separate microphone (in my case AirPods worked well).
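One way to test the "mic is not picking up anything" guess is to look at the captured samples before they are transcribed. The following is a small diagnostic sketch, not part of whisper.cpp: peak_amplitude and looks_silent are hypothetical names, the samples are assumed to be floats in the range [-1, 1], and the 1e-4 cutoff is an arbitrary assumption that may need adjusting for a given device.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical diagnostic: largest absolute sample value in the buffer.
float peak_amplitude(const std::vector<float> & samples) {
    float peak = 0.0f;
    for (float s : samples) {
        peak = std::max(peak, std::fabs(s));
    }
    return peak;
}

// Returns true if the buffer is empty or never exceeds `eps` in magnitude.
// The default eps is an assumed value, not one taken from whisper.cpp.
bool looks_silent(const std::vector<float> & samples, float eps = 1e-4f) {
    return samples.empty() || peak_amplitude(samples) < eps;
}
```

If a check like this reports silence while you are speaking, the problem is more likely in audio capture (device selection, permissions, a closed laptop lid) than in the VAD thresholds.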
First of all, I want to thank Georgi Gerganov and everyone who has contributed to this project. I have a progressive neuromuscular disease and I can hardly use my hands. I bought a new Android phone to make my life easier. It has 4GB+4GBVRAM. I tried to use the "Hey Google" voice assistant together with "Google Voice Access". I am not a native English speaker, and "Hey Google" is missing some features in my language. It doesn't hang up when I accidentally call someone. It has many weak points and bugs, and it only works online, with internet access.
I tested whisper.cpp with "./command -m models/ggml-tiny.bin -t 8 -ac 768" on Ubuntu 22.04, and it works well. I also managed to build it on my Android phone. It launches without errors, but it repeatedly prints:
process_general_transcription: Say the following phrase: 'Ok Whisper, start listening for commands.'
without giving me a chance to say the phrase "Ok Whisper, start listening for commands." I plan to use it to end phone calls and for other tasks. Would you please guide me?