This is a known issue. My current theory is that, because of the high volume of mirrors, we are collectively using up Electron's Google speech key.
I am currently investigating solutions (and am open to suggestions).
Any updated theory on this? I'm facing a very random issue where my code was working earlier in the day and then suddenly stopped: no voice detection... nada.
Here's another issue I opened on annyang that details all my attempts: https://github.com/TalAter/annyang/issues/188
You are correct in your assumption that the issue is with key utilization (there have been many discussions about this on the gitter chat, which I suggest you check out).
I'm looking at alternatives (Bluemix, Microsoft, etc.) as well as investigating offline keyword spotting to reduce quota usage.
Another update: I've got keyword spotting functioning in the evancohen/keyword-spotting branch. There are a number of issues with this implementation, namely poor performance and some compatibility issues with certain microphone setups.
I just set it up today and it seems I've already made too many requests. I have my own Google API keys, but in a matter of 2-3 minutes I made over 500 requests, and now it seems to be down for me.
How do I add this branch to my existing git folder? (On the Raspberry.)
I am still getting "Google speech recognizer is down :(" when I plug in any sort of microphone. I have my own speech keys. May I suggest that it's a driver issue? Is it set to only be compatible with a specific list of mics? (PS: It said "Say 'What can I say' to see a list of commands" when I unplugged the microphone.)
I also get the "[1444:0426/103700:ERROR:logging.h(813)] Failed to call method: org.freedesktop.NetworkManager.GetDevices: object_path= /org/freedesktop/NetworkManager: org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.NetworkManager was not provided by any .service files" message in the terminal when I start it.
I have started from scratch. I downgraded from Jessie to Wheezy and I am following the documentation exactly as written (except config.js). I am using a USB camera as a mic, as the documentation suggests. Will post results soon.
OK, my issue now, still with sound, is getting the USB camera's mic selected as the input device; I can't seem to get any USB sound device working. I have tried a Turtle Beach PX22 controller, a USB sound card, and a USB camera.
@skydazz have you tried following the directions in the troubleshooting section of the documentation? You may also want to look at #20 (which was an old thread on the issue that may help you find an alternative solution)
@skydazz Were you able to resolve the issue you had with "Failed to call method: org.freedesktop.NetworkManager.GetDevices: object_path= /org/freedesktop/NetworkManager: org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.NetworkManager was not provided by any .service files"?
I have the same issue and can't figure out how to resolve it. Thanks.
@Sachin1968 that error is unrelated to this thread and is harmless; you can safely ignore it. You can find more info in this Chromium forum post.
So, another update for you all :) Keyword spotting officially works in the evancohen/keyword-spotting branch. Unfortunately, the Pi is not quite powerful enough to process everything in real time. Because of that I've added a clap detector to that same branch: all you have to do is clap (a configurable number of times) and the mirror will start listening to you.
When using this branch on the Pi there are a few things you need to know:

- You'll have to install `sox` (it's a dependency for clap detection): `sudo apt-get install sox`
- You'll also have to run `npm install` after switching to this branch, because of the new dependencies (see the sketch below for switching an existing clone).
- Make sure you update your `config.js` file to reflect the new properties in `config.example.js`!
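For anyone wondering how to get this branch into an existing clone (per the question above), something like the following should do it. This is a sketch; it assumes your remote is named origin and points at evancohen/smart-mirror:

```bash
cd ~/smart-mirror            # wherever your existing clone lives
git fetch origin             # pick up the new branch
git checkout keyword-spotting
npm install                  # new Node dependencies on this branch
sudo apt-get install sox     # clap-detection dependency
```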
Since this is all very new stuff I haven't had the chance to test it extensively. I already anticipate there being issues with the clap-detection microphone configuration; luckily, this is totally something that you can set up yourself. In your config you can use the clap `overrides` object to change the following settings for clap detection:
```javascript
overrides: {
  AUDIO_SOURCE: 'hw:1,0',            // your microphone input; if you don't know it, see http://www.voxforge.org/home/docs/faq/faq/linux-how-to-determine-your-audio-cards-or-usb-mics-maximum-sampling-rate
  DETECTION_PERCENTAGE_START: '5%',  // minimum noise percentage threshold necessary to start recording sound
  DETECTION_PERCENTAGE_END: '5%',    // minimum noise percentage threshold necessary to stop recording sound
  CLAP_AMPLITUDE_THRESHOLD: 0.7,     // minimum amplitude threshold for a sound to count as a clap
  CLAP_ENERGY_THRESHOLD: 0.3,        // maximum energy threshold for a sound to count as a clap
  MAX_HISTORY_LENGTH: 10             // all claps are stored in history; this is its max length
}
```
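If you're not sure what to put in AUDIO_SOURCE, ALSA's arecord (from the alsa-utils package) can list your capture devices; the card and device numbers map straight onto the hw:&lt;card&gt;,&lt;device&gt; string. Example (your output will differ):

```bash
arecord -l
# **** List of CAPTURE Hardware Devices ****
# card 1: Device [USB Audio Device], device 0: USB Audio [USB Audio]
#   -> use AUDIO_SOURCE: 'hw:1,0'
```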
As always, if you have any questions you can post them here or ask on gitter.
When Commands aren't working
Since commands are intermittent in the `dev` branch, I've added a shim to Annyang to "simulate" a request. This can be done in the dev console with the following:
```javascript
annyang.simulate("what can I say");
```
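For the curious, annyang exposes a trigger() method that fires your registered command callbacks as if the phrase had actually been heard, so a shim along these lines would do it (a sketch, not necessarily the branch's exact code):

```javascript
// Sketch of a "simulate" shim (not necessarily the exact code in the dev
// branch). annyang.trigger() runs registered command callbacks as though
// speech recognition had returned the given phrase(s).
annyang.simulate = function (phrase) {
  annyang.trigger([phrase]);
};
```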
@Keopss we'll get you sorted out in gitter :)
I had the same problem of burning through my 50 speech API calls in about 10 minutes of use, so I'm happy to test the new "clap" feature.
Hi @evancohen! I don't know what's going on with my Raspberry :(
I have installed smart-mirror-master and it works fine.
Then I installed sox and the smart-mirror keyword branch, edited config.js, and added the overrides options, but there's no clap or speech detection.
The geniuses over at http://kitt.ai have created an offline keyword spotter that should work. To find out, I need your help training the keyword "smart mirror". Just follow these steps:
1) Go to https://snowboy.kitt.ai/hotword/47
2) Click "Record and download the model"
3) Follow the instructions to train the model (be sure to click "save" at the end!)
I'll continue to keep this thread updated with my progress and should hopefully have a working prototype this weekend!
Hi! How's it going? Any progress?
I have a working implementation of keyword spotting. I'm currently trying to fix a native PulseAudio issue on the Pi 3 that causes recognition to fail.
Hello, I've successfully installed the smart mirror and I am very impressed!
However, I also get "Google Speech Recognizer is down"...
How long will it take to get the new solution into the main branch?
Just an idea: could you use Jasper with PocketSphinx (http://jasperproject.github.io/) to train a number of keywords that then activate either Google STT or Amazon Alexa? Or is Kitt.ai definitely better for keyword recognition?
I tried Jasper... it's quite resource-intensive and is painful to use as a dependency (building projects on top of it is great; building it into an existing project, not so much).
I also tried PocketSphinx (the native recognition engine that Jasper uses) and it wasn't quick enough to recognize keywords without significant lag.
Snowboy (from the folks at kitt.ai) is super lightweight and very fast. Sure it requires a wrapper for their Python library, but that's not too difficult.
I actually have a working prototype with Snowboy on the `kws` branch (using the OSX binaries). The only problem on the Pi now is with PulseAudio, which is having issues on the Pi 3. Once I sort that out (and I think I have a fix) we'll be good to go.
It's been a long journey, with lots of painful dead ends, but I'm feeling really close!
tl;dr Snowboy is great, I should have something working really soon.
Cool! What's the problem with PulseAudio?
@trenkert it's an issue with Bluetooth that causes PulseAudio to crap out. Even after disabling it within the config the issue persists (which makes me think there may be another root cause). I'm worried that the real cause is a conflict between dependencies of the mirror and keyword spotter (but I haven't confirmed this yet, and I don't think it's the cause).
Sorry for coming into this late. Snowboy is a C++ library and doesn't have many dependencies. It will work as long as you can feed it linear PCM data sampled at 16 kHz, with 16 bits per sample and a single channel. PyAudio is only used for demo purposes. If it turns out that PyAudio is the problem, we can turn to other alternatives for audio capturing.
In the Snowboy repo we are trying to add examples of using Snowboy in different programming languages. So far the examples use PortAudio or PyAudio, but if you look at the code (e.g., the C++ demo code https://github.com/Kitt-AI/snowboy/blob/master/examples/C%2B%2B/demo.cc), you can see that switching the audio capturing tool should be easy.
@evancohen, let me know if it turns out that PyAudio is the problem. We can look into other alternatives for audio capturing.
Hey @chenguoguo thanks for dropping in! Awesome to see you all so committed to your (super awesome) project. I managed to write a pretty hacky IPC between Node and your pre-packaged Snowboy binaries/Python wrapper. It's definitely not the ideal way to use Snowboy with Node, but I just wanted to see if I could get something that would work.
I don't think it would be too challenging to wrap the existing C++ code so it could be easily consumed via Node. I'll take a look at it this weekend if I get the chance 😄
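For anyone curious, the shape of that hacky IPC is roughly this (a sketch with hypothetical file names and output marker; the actual branch may do this differently): spawn the Python/Snowboy demo as a child process and watch its stdout for detections.

```javascript
// Sketch of the Node <-> Python IPC idea (the script name and "HOTWORD"
// marker are hypothetical; the actual kws branch may differ).
const { spawn } = require('child_process');

function startHotwordSpotter(onHotword) {
  // Assumes a Python script that runs the Snowboy detector against the
  // trained smart_mirror.pmdl model and prints a line when it fires.
  const py = spawn('python', ['snowboy_listener.py', 'smart_mirror.pmdl']);

  py.stdout.on('data', (chunk) => {
    if (chunk.toString().includes('HOTWORD')) {
      onHotword(); // hand off to the mirror's speech pipeline
    }
  });

  py.stderr.on('data', (err) => console.error('snowboy:', err.toString()));
  return py;
}

// Only hit Google STT after the keyword fires, which keeps quota usage down.
startHotwordSpotter(() => console.log('Keyword detected, start listening'));
```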
For everyone else: I managed to coax PulseAudio into cooperating on my Pi, and everything seems to work super well! You can test it out by doing the following:

1) Check out the `kws` branch on the Pi 2/3.
2) Update your config to include the new `config.kws` object (illustrative sketch below).
3) Replace the `smart_mirror.pmdl` model in the root of the smart-mirror directory with the one you just created.
4) Install the audio dependencies: `sudo apt-get install python-pyaudio python3-pyaudio sox`
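As for step 2, the authoritative shape of the new `config.kws` object is whatever `config.example.js` defines on that branch. Purely as an illustrative placeholder (these property names are guesses, defer to the example config):

```javascript
// Illustrative only - the real property names live in config.example.js on
// the kws branch. Conceptually, you point the spotter at your trained model
// and tune how eagerly it triggers.
kws: {
  model: 'smart_mirror.pmdl', // the hotword model you trained at snowboy.kitt.ai
  sensitivity: '0.5'          // Snowboy sensitivity, 0-1; higher fires more easily
}
```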
As always let me know if you have any issues over on gitter.
That's great @evancohen! @amir-s is also helping us work on the NodeJS module; see the issue here. He'll likely have something soon.
@evancohen I've experienced a similar issue. I would guess it has to do with pulseaudio-bluetooth. It works for me when I start pulseaudio manually once again after login.
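For anyone wanting to try that workaround, restarting the per-user daemon after login looks like this (assuming the stock Raspbian PulseAudio setup):

```bash
pulseaudio --kill    # stop the running per-user daemon
pulseaudio --start   # start a fresh one
```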
@evancohen Thanks. I rebuilt it and all seemed fine on my lab monitor in my office; however, when I moved it to its permanent location, speech stopped working. Also, how do I exit the mirror and get to the main desktop with menus? Right now if I press Alt+F4 it closes the window, but I can't see any menus to go through the Pi settings or launch a terminal, etc. Thanks again.
@ojrivera381 Is the Pi still connected to the same WiFi network? Have you exceeded your 50-query/day quota? The menu is missing because you have `unclutter` installed.
You can probably also press the Windows key on your keyboard (which opens the Raspbian equivalent of the Start menu). You can also get to the terminal via the recycling bin on the desktop (hacky, I know).
If those two things look good, I would follow the instructions for troubleshooting in the docs.
Hello. From 9am to 4pm KST, voice recognition didn't work, but after that it began working again. I'm trying to figure out why, but I haven't been able to solve it.