Closed helpinghandindia1 closed 1 year ago
Can you suggest another suitable model for Indian English (en-in)?
Unfortunately, we do not have a better model.
How and where can we add words to improve recognition of some specific words?
See https://alphacephei.com/vosk/lm
Is there a step-by-step way to use microphone.py with the Asterisk dialplan, so that users' speech can be recognized directly? At the moment we record a WAV file and pass it to test_ffmpeg.py. We tried recompiling vosk-asterisk two or three times, but res_speech_vosk.so was not produced, so we are not able to use it.
You have to describe the issues you have compiling vosk-asterisk. There is no workaround. There is a vosk-unimrcp server though; you can use it with asterisk-unimrcp.
Hello @nshmyrev
Please suggest a better way to capture the caller's speech into an audio file. At the moment we are using a WAV file. What is the best way to get clear audio for better recognition output?
exten => s,n,Record(/root/tempfiles/test2/${callerdatafile}:wav,0,2)
exten => s,n,System(/usr/bin/ffmpeg -i /root/tempfiles/test2/${callerdatafile}.wav -vn -ar 44100 -ac 2 -b:a 192k /root/tempfiles/test2/${callerdatafile2}.mp3)
exten => s,n,Set(tempvalue11=${SHELL(/usr/local/bin/python3 /opt/vosk-server/websocket/test_ffmpeg3.py /root/tempfiles/test2/${callerdatafile2}.mp3 | grep '"text" :' | cut -d'"' -f4)})
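The `grep '"text" :' | cut -d'"' -f4` step in the dialplan above depends on the exact spacing of the printed JSON. A minimal sketch of a more tolerant alternative follows, assuming the script's output consists only of JSON objects printed one after another (the thread does not confirm this). The helper name `extract_texts` is hypothetical.

```python
import json


def extract_texts(output: str) -> list[str]:
    """Collect non-empty "text" fields from concatenated JSON objects.

    Assumption: `output` contains only JSON objects separated by
    whitespace; any other content raises json.JSONDecodeError.
    """
    decoder = json.JSONDecoder()
    texts = []
    pos = 0
    while pos < len(output):
        # Skip whitespace between objects; raw_decode does not do this itself.
        while pos < len(output) and output[pos].isspace():
            pos += 1
        if pos >= len(output):
            break
        obj, pos = decoder.raw_decode(output, pos)
        if isinstance(obj, dict) and obj.get("text"):
            texts.append(obj["text"])
    return texts
```

For example, `" ".join(extract_texts(captured_output))` would give a single string to return to the dialplan, where `captured_output` stands for whatever the recognition script printed.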
It is better to send the WAV file for recognition, not MP3. The lossy MP3 codec degrades accuracy.
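Before sending a recording to the recognizer, it can help to confirm what the WAV file actually contains. The sketch below uses only Python's standard `wave` module. The function names are hypothetical, and the mono 16-bit PCM target is an assumption based on what the Vosk example scripts typically expect, not something stated in this thread.

```python
import wave


def describe_wav(path: str) -> dict:
    """Return the basic format parameters of a WAV file."""
    with wave.open(path, "rb") as wf:
        return {
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "sample_rate": wf.getframerate(),
            "compression": wf.getcomptype(),  # "NONE" means uncompressed PCM
        }


def is_mono_16bit_pcm(path: str) -> bool:
    """True if the file is uncompressed, single-channel, 16-bit audio."""
    info = describe_wav(path)
    return (
        info["channels"] == 1
        and info["sample_width_bytes"] == 2
        and info["compression"] == "NONE"
    )
```

If the check fails, the recording can be converted once to mono 16-bit PCM WAV rather than to MP3.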
Thanks @nshmyrev for your reply.
Yes, we are using the WAV file for conversion; we are testing MP3 only for the English (en) case.
It would be great if you could guide us on implementing microphone.py for direct speech from users, without any intermediate audio file.
Asterisk works with SIP clients, not a microphone. You probably need to configure a SIP gateway.
Feel free to reopen if you have other questions
Hi Team,
I am currently working on passing audio signals from the unimrcp demo_recog to a Python socket-based program. While I am able to pass the signal, I encounter an issue when saving the audio to a file and subsequently playing it back. Instead of hearing the intended human voice, the audio playback consists solely of a buzzing noise.
I have tried several different ffmpeg conversion commands, but none of them have resolved the issue. Below are the commands I have attempted:
ffmpeg -i input.wav -ar 8000 -f s16le -y output.ulaw
ffmpeg -i input.wav -ar 8000 -f mulaw -y -map_channel 0.0.0 output.ulaw
ffmpeg -i input.wav -ar 4000 -f mulaw -y -map_channel 0.0.0 output.ulaw
ffmpeg -i input.wav -ar 16000 -f s16le -y -ac 1 output.ulaw
ffmpeg -i input.wav -ar 8000 -f mulaw -y output.ulaw
The codecs specified in my mrcp.conf file are as follows:
codecs = PCMU PCMA L16/96/8000 telephone-event/101/8000
I would greatly appreciate any guidance or suggestions on how to resolve this issue and achieve clear audio playback.
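One possible cause of buzzing on playback is that the saved bytes are being interpreted with the wrong encoding. As a way to test that hypothesis, the sketch below assumes the bytes arriving at the socket are 8 kHz mono G.711 mu-law (PCMU, the first codec in the list above), which this post does not confirm. It decodes them to 16-bit linear PCM and writes a regular WAV file. The function names are hypothetical. If the stream is actually PCMA or L16, this decoder will not produce clean audio either.

```python
import struct
import wave


def ulaw_byte_to_pcm16(u: int) -> int:
    """Decode one G.711 mu-law byte to a signed 16-bit linear sample."""
    u = ~u & 0xFF                    # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -sample if sign else sample


def ulaw_to_wav(ulaw_bytes: bytes, wav_path: str, sample_rate: int = 8000) -> None:
    """Write mu-law payload bytes as a mono 16-bit PCM WAV file.

    Assumption: ulaw_bytes is raw PCMU audio with no RTP or other
    headers mixed in.
    """
    samples = [ulaw_byte_to_pcm16(b) for b in ulaw_bytes]
    pcm = struct.pack("<%dh" % len(samples), *samples)
    with wave.open(wav_path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(sample_rate)
        wf.writeframes(pcm)
```

If the resulting WAV plays back as speech, the original file was most likely mu-law data saved without a header or read as linear PCM. If it still buzzes, the next things to check are the negotiated codec and whether non-audio bytes are being written into the file.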
Hi, we are running the Vosk server in Docker with two models, kaldi-hi and kaldi-en-in. kaldi-hi works well, but en-in does not seem to recognize words and sentences accurately.
thanks