Open pepebc opened 4 years ago
We are evaluating some alternatives. Feel free to propose some as well.
We don't have many options at the moment. There is still porcupine, but it's not open source.
With the next Kalliope release we will be able to use the mycroft precise trigger, which is open source. I have already played a bit with precise, but creating a good wake word takes a lot of training, and it's not very user friendly.
We need at least one good Kalliope wake word, so we need to collect some voice samples from the community. Training also needs a lot of data (for example, I have downloaded 50GB from the mozilla common voice datasets to train my wake word).
I'm going to push a community-based precise trigger after the next release of Kalliope.
a few resources since i was just researching this:
you mention porcupine is not open source, do you mean the models?
Maybe this one as well: https://github.com/alphacep/vosk-api ?
So, the best option is porcupine? I have some time in the coming days to work on Kalliope. I'll test some trigger engines myself.
I took a look at vosk, mentioned by @Miloune. It looks promising, and at least with the German dataset the word "kalliope" is recognized. I don't have much time lately, but a first try with a simple script for a single keyword worked pretty well.
There is also the Snips project implementation, used by Rhasspy.
So for porcupine we cannot generate a model for the RPi, and generated models need to be recreated every 30 days, if I understand correctly. So for me it looks like a no-go.
if you are considering vosk, i'd recommend taking a look at mozilla deepspeech which i found to be both easier to configure and considerably more accurate. they also have examples for streaming transcription in python and bulk transcription in python
I thought they were more stt engines than wake word engines.
@Sispheor -- that's right, including vosk. i'm not sure if either is a good fit for a snowboy replacement, but if you're going to go that direction i think deepspeech is a better target from my limited experiments.
snips looks promising but appears to have been acquired by Sonos. i couldn't find the platform source https://github.com/snipsco?q=platform&type=&language=&sort= and their usage docs have been removed. the closest available is https://github.com/snipsco/snips-record-personal-hotword
Another similar project has created a list of wake word engines here. They have created their own, called Raven. It's not yet a Python lib, but it seems to be fully Python. I need to take a look. For me, the best candidate is mycroft-precise.
I have tried them all and I prefer mycroft-precise, which has good accuracy. Raven is still a very young project, with many false positives. The vosk mentioned above is also very interesting.
The problem with Mycroft is mainly that the only supported ARM processor is armv7l. I can't find a doc with a full map of all RPi models, but I think it covers the recent ones.
Vosk is working pretty well so far, but it's slow: with a small dataset it takes about 1-1.5 seconds until the word is recognized. As an STT it would be a good offline alternative, but as a trigger, maybe not.
> if you are considering vosk, i'd recommend taking a look at mozilla deepspeech which i found to be both easier to configure and considerably more accurate. they also have examples for streaming transcription in python and bulk transcription in python
@khimaros I already played around with deepspeech, but that was more than a year ago. I guess they have made some improvements since then, so I will take a new look.
For Vosk, it's very easy to set up: just pip install vosk and download the language model.
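A minimal sketch of the kind of single-keyword script described above, assuming the `vosk` pip package and an unpacked model directory. It restricts the recognizer to one word plus the `[unk]` filler through the grammar argument of `KaldiRecognizer`, and reads audio from a WAV file instead of a microphone. The helper names `contains_keyword` and `file_contains_keyword` and the file path are hypothetical; whether the keyword is in a given model's vocabulary, and whether that model supports grammars, has to be checked per model.

```python
import json
import wave


def contains_keyword(result_json, keyword):
    """Return True if the keyword appears as a word in a Vosk JSON result."""
    result = json.loads(result_json)
    # Final results carry "text", partial results carry "partial".
    text = result.get("text") or result.get("partial") or ""
    return keyword in text.split()


def file_contains_keyword(model_dir, wav_path, keyword="kalliope"):
    """Scan a mono 16-bit PCM WAV file and report whether the keyword was heard."""
    # Imported lazily so the helper above works without vosk installed.
    from vosk import Model, KaldiRecognizer

    model = Model(model_dir)
    # Limit recognition to the keyword plus the unknown-word filler.
    grammar = json.dumps([keyword, "[unk]"])
    with wave.open(wav_path, "rb") as wf:
        rec = KaldiRecognizer(model, wf.getframerate(), grammar)
        while True:
            data = wf.readframes(4000)
            if not data:
                break
            if rec.AcceptWaveform(data):
                # An utterance ended; check its final text.
                if contains_keyword(rec.Result(), keyword):
                    return True
            elif contains_keyword(rec.PartialResult(), keyword):
                # Partial results allow an earlier trigger.
                return True
        return contains_keyword(rec.FinalResult(), keyword)


# Example call (paths are placeholders):
# file_contains_keyword("vosk-model-small-de-0.15", "sample.wav")
```

Checking partial results as well as final ones is a design choice here: a trigger should fire as early as possible, at the cost of occasionally reacting to a hypothesis that the recognizer would later revise.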
> The problem with Mycroft is mainly that the only supported ARM processor is armv7l. I can't find a doc with a full map of all RPi models, but I think it covers the recent ones.
@Sispheor It should support the RPi 2 and 3 and x86; at least those are the ones I tested, and the RPi 4 should also be supported. Precise is pretty nice and works well, but only with a well-trained wake word, and that is very hard to train. I also read about Raven (at least on the rhasspy page), and maybe it's worth a shot.
Hopefully there will soon be a good alternative to snowboy, or wake word training on precise will get much easier...
Edit: Not all x86 CPUs are supported by precise. AVX is required for the default tensorflow package, but most modern CPUs support AVX.
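For anyone unsure whether their machine meets the AVX requirement, here is a rough sketch that reads the CPU flags from `/proc/cpuinfo`. It assumes Linux; the function names are made up for illustration, and this is not part of precise or Kalliope.

```python
import platform


def cpu_flags(cpuinfo_text):
    """Collect CPU feature flags from the text of /proc/cpuinfo."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # x86 kernels list them under "flags", ARM kernels under "Features".
        if sep and key.strip() in ("flags", "Features"):
            flags.update(value.split())
    return flags


def has_avx(cpuinfo_text):
    """Return True if the 'avx' flag is present."""
    return "avx" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    print("machine:", platform.machine())
    try:
        with open("/proc/cpuinfo") as f:
            print("AVX available:", has_avx(f.read()))
    except OSError:
        print("/proc/cpuinfo not readable (not Linux?)")
```

The printed machine string (for example `armv7l` or `x86_64`) also helps when comparing against the ARM support discussed above.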
> Vosk is working pretty well so far, but it's slow: with a small dataset it takes about 1-1.5 seconds until the word is recognized. As an STT it would be a good offline alternative, but as a trigger, maybe not.
@corus87 this might be an issue with the setup or an old version. Which model/language are you using here? We updated some of our models a couple of months ago: English, German. We have also updated the code for much lower response latency. Please recheck with the latest setup.
> Hopefully there will soon be a good alternative to snowboy, or wake word training on precise will get much easier...
There will be a keyword spotter in Vosk soon, in a couple of weeks. Stay tuned.
@nshmyrev Thanks for your input! It's great news that you are working on a native keyword spotter, I'm looking forward to it!
I just started with Vosk earlier this week, so I'm using the latest pip version and the model you provide here. For my tests I made a small script using pyaudio, but even with your Python example there is no difference: it takes about 1 second.
I'm running those tests on x86 with a Ryzen 4800H, so I guess there should be enough CPU power.
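To narrow down where a delay like this comes from, one option is to replay recorded audio chunks through the detector and record both how much audio had been fed and how much wall-clock time had passed at the first hit. The sketch below is generic: `detect` is a hypothetical callable that takes one chunk of raw PCM bytes and returns True once the keyword is seen (it could wrap any engine), and the default sample rate and sample width assume 16 kHz mono 16-bit audio.

```python
import time


def detection_latency(chunks, detect, sample_rate=16000, sample_width=2):
    """Feed audio chunks to detect() until it fires.

    Returns (audio_seconds_fed, wall_seconds_elapsed) at the first
    detection, or None if detect() never returns True.
    """
    start = time.perf_counter()
    fed_bytes = 0
    for chunk in chunks:
        fed_bytes += len(chunk)
        if detect(chunk):
            # Convert bytes of mono PCM into seconds of audio.
            audio_seconds = fed_bytes / (sample_rate * sample_width)
            return audio_seconds, time.perf_counter() - start
    return None
```

If the wall-clock time is small but the amount of audio fed is well past the end of the spoken word, the delay is more likely in buffering or end-of-utterance detection than in CPU power.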
> I'm using the latest pip version and model you provide here.
Which model version exactly please? 0.15?
> For my tests I made a small script using pyaudio, but even with your Python example there is no difference: it takes about 1 second.
We do not recommend pyaudio, precisely because of latency issues.
> there is no difference, it takes about 1 second.
Ok, let me check too
> Which model version exactly please? 0.15?
I tried vosk-model-de-0.6, vosk-model-small-de-0.15 and vosk-model-small-en-us-0.15.
> Ok, let me check too
Thanks.
Hi, I just tried to install by following the website instructions. The default settings file uses snowboy. Is there an alternative somewhere?
That's been closed for ages.
The guys from rhasspy built a docker container to run a local web server where you can easily create your own pmdl wake word for snowboy.
https://github.com/rhasspy/snowboy-seasalt
Just install docker and run the command in the readme. It will download the container and start the web server. Then you can access the interface at http://localhost:8000 and create a wake word in 2 minutes.
Sorry for reviving this old issue, but since I haven't found any recent related issue or article on the matter, may I suggest supporting TeachableMachine? It's open source and easy to use. I don't know if there is an official alternative to Snowboy; I haven't found any.
KITT.AI has announced that on December 31, 2020 they will stop development of all their products. Have you thought of an alternative?