Closed chiragw15 closed 5 years ago
Hey @chiragw15. If we get this file, where should we store it? I think I can take up this issue. Are you currently working on it, or can I go ahead with it?
If we get this file where should we store it?
There is a folder in storage named snowboy; we have to store this file in that folder with the name susi.pmdl.
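A minimal sketch of that storage step, assuming the model bytes are already in memory. The class `ModelStore`, the method `saveModel`, and the `storageRoot` argument are hypothetical names for illustration; only the folder name snowboy and the file name susi.pmdl come from this thread. It uses `java.nio.file`, which on Android needs API 26 or later; a `File`/`FileOutputStream` version would do the same job.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class ModelStore {
    // Folder and file names as described in this thread.
    static final String MODEL_DIR = "snowboy";
    static final String MODEL_FILE = "susi.pmdl";

    // Writes the model bytes to <storageRoot>/snowboy/susi.pmdl,
    // creating the folder if needed and overwriting any older model.
    static Path saveModel(Path storageRoot, byte[] modelBytes) throws IOException {
        Path dir = storageRoot.resolve(MODEL_DIR);
        Files.createDirectories(dir);
        return Files.write(dir.resolve(MODEL_FILE), modelBytes);
    }
}
```

Overwriting on every save keeps a single model per device, which matches the idea of replacing the bundled model with the user's own.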
Please check this PR https://github.com/fossasia/susi_android/pull/889
In this PR, I have mentioned the problems that I am facing. It would be great if you could help me out.
@chiragw15, I was somehow able to extract an AMR file from the Google speech recognizer. I will make a PR as soon as I fix this issue. Could you please tell me all the parameters required by the API?
@chiragw15 Please tell me what parameters have to be used?
@hardik124 http://docs.kitt.ai/snowboy/#api-v1-train
@chiragw15 There is a secret user token required. What is that?
@hardik124
Use your account for now. Will change it later.
@chiragw15 sure. Is it fine if I make an activity or should I make a fragment?
No need to write the code again. I have already written most of the code for this issue in #889. Pull that PR to your local machine and work from there. To pull a PR locally, use git fetch upstream pull/889/head:BRANCHNAME
and then check out that branch using git checkout BRANCHNAME
Yeah, will do.
@chiragw15, ffmpeg is failing. Is there any other way you know of to convert AMR to WAV? Should I try building another ffmpeg from scratch (NDK)?
ffmpeg is failing.
Exactly. I was facing the same issue. This is the reason I was not able to proceed further. I didn't find another way to do that. Don't build ffmpeg from scratch. @chashmeetsingh Can you help us here? How did you implement this in iOS?
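I am not sure how that's supposed to be done on Android. For iOS, what I did was:
- Run the audio engine
- Save the audio buffer in a .wav file
- Converted that to base64 and used it in the API
Did you not verify if the word is in fact Susi?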
On 22-Sep-2017 7:16 PM, "Chashmeet Singh" notifications@github.com wrote:
I am not sure how that's supposed to be done on android. For iOS, what I did was:
- Run the audio engine
- Save the audio buffer in a .wav file
- Converted that to base64 and used it in the API
@chiragw15 ^
I did. That's done with the help of the audio recorder and the speech-to-text (STT) that runs alongside the audio engine. The audio engine and STT run simultaneously; whenever SUSI is spoken, the STT stops, and I take the buffer and convert it to base64.
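For the Android side, the "convert it to base64" step could look roughly like the sketch below. The class `WaveEncoder` and its methods are hypothetical; `java.util.Base64` is assumed to be available (Android API 26 or later, otherwise `android.util.Base64` would be the substitute).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

final class WaveEncoder {
    // Reads a recorded .wav file and returns its contents as a Base64 string.
    static String encodeWave(Path wavFile) throws IOException {
        return encode(Files.readAllBytes(wavFile));
    }

    // Encodes raw audio bytes with the standard Base64 alphabet (no line breaks).
    static String encode(byte[] audio) {
        return Base64.getEncoder().encodeToString(audio);
    }
}
```

A WAV file starts with the ASCII bytes `RIFF`, so a correctly encoded recording begins with `UklGR`, which is a quick sanity check before sending anything to the API.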
@chiragw15 @chashmeetsingh, I was able to convert using FFmpeg; it turns out there were problems with our Uri itself. Conversion takes a lot of time, so I think we should make a job scheduler for that and show a progress notification. I am attaching the WAV files converted using ffmpeg: recordings.zip
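The exact ffmpeg invocation used here is not shown in the thread. As a rough sketch, the argument list for an AMR to WAV conversion could be built like this. The class `AmrToWav` is hypothetical, and the target format (16 kHz, mono, 16-bit PCM) is an assumption about what the training API expects, so it should be checked against the snowboy docs.

```java
import java.util.Arrays;
import java.util.List;

final class AmrToWav {
    // Builds ffmpeg arguments that convert an AMR recording into a WAV file.
    // The 16 kHz / mono / 16-bit PCM target is an assumption, not confirmed in this thread.
    static List<String> ffmpegArgs(String amrPath, String wavPath) {
        return Arrays.asList(
            "-y",                    // overwrite the output file if it exists
            "-i", amrPath,           // input recording
            "-ar", "16000",          // resample to 16 kHz
            "-ac", "1",              // downmix to mono
            "-acodec", "pcm_s16le",  // 16-bit little-endian PCM
            wavPath);                // output file
    }
}
```

The list would be handed to whichever ffmpeg wrapper the app bundles, or prefixed with `ffmpeg` and passed to `ProcessBuilder` on a desktop machine for testing.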
@chiragw15 I tweaked ffmpeg to convert the files into the correct format. However, on making a request to the API, I am getting a 502 error. Can you look into it?
Is this still relevant?
@batbrain7 @arundhati24 @iamareebjamal Can you please tell me if this issue is still relevant? I know there are bugs with voice detection, but I think the Snowboy API has already been implemented. I want to work on this issue, so please tell me which major problems still need to be resolved.
As mentioned in this comment https://github.com/fossasia/susi_android/pull/710#issuecomment-312267662 , implement a feature to train hotword detection using snowboy training API.
Problem for now: The hotword detection uses a model file from the snowboy website. This model file comes in 2 types: susi.pmdl and susi.umdl. pmdl stands for personal model and umdl stands for universal model. The process of getting a personal model file (.pmdl) is simple: just say susi thrice on the snowboy website, download the personal model, and use it. But the problem with a personal model is that it is defined for a specific person. I am using my personal model (susi.pmdl) right now for hotword detection, so it works great for me and for people with a voice similar to mine, but not for everyone. To get a universal model file (.umdl), we need a minimum of 500 people to train the susi hotword on the snowboy website by saying 'susi' thrice. Once we have a universal model file, the hotword will work great for everyone. But it may take time, since right now I see only 10 people have trained the model and we need 490 more.
Alternative for now: Snowboy provides an API to train and get a pmdl file. As mentioned in the above comment, according to the snowboy docs
We can generate a pmdl model for everyone by asking them to say 'susi' thrice after installing the app. When the user runs the app after installation, they will be prompted to say susi thrice. These three recordings will then be sent to the http://docs.kitt.ai/snowboy/#api-v1-train API as POST parameters. The API will return a .pmdl file, which will then be used for hotword detection. This way everyone will use their own personal model and hotword detection will work smoothly for them. Once we have a universal model trained by 500 people, we can update the app to use the universal model, which will work for everyone, and there won't be a need for every new user to train a model with their voice.
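A rough sketch of how the request body for that training call might be assembled from the three recordings. The class `TrainRequest` is hypothetical, and the JSON field names (`name`, `token`, `voice_samples`, `wave`) are an assumption based on my reading of the snowboy docs; any further parameters the API requires are left out and should be taken from the linked documentation.

```java
import java.util.List;

final class TrainRequest {
    // Escapes backslashes and double quotes so a value can sit inside a JSON string.
    // Control characters are not handled; Base64 text and simple names do not need it.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds the JSON body from the user's token, the hotword name,
    // and exactly three Base64-encoded WAV recordings.
    // Field names are assumptions; verify them against the API docs.
    static String buildBody(String token, String hotword, List<String> base64Waves) {
        if (base64Waves.size() != 3) {
            throw new IllegalArgumentException("expected exactly three recordings");
        }
        StringBuilder sb = new StringBuilder();
        sb.append("{\"name\":").append(quote(hotword));
        sb.append(",\"token\":").append(quote(token));
        sb.append(",\"voice_samples\":[");
        for (int i = 0; i < base64Waves.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append("{\"wave\":").append(quote(base64Waves.get(i))).append('}');
        }
        return sb.append("]}").toString();
    }
}
```

The resulting string would be sent as the body of a POST request with a JSON content type, and the bytes of a successful response saved as the user's susi.pmdl.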