kalliope-project / kalliope

Kalliope is a framework that will help you to create your own personal assistant.
https://kalliope-project.github.io/
GNU General Public License v3.0

Snowboy stop development #630

Open pepebc opened 4 years ago

pepebc commented 4 years ago

KITT.AI has announced that it will stop development of all its products on December 31, 2020. Have you thought of an alternative?

Sispheor commented 4 years ago

We are evaluating some alternatives. Feel free to propose some as well.

corus87 commented 4 years ago

We don't have many options at the moment. There is still porcupine, but it's not open source.

With the next Kalliope release we will be able to use the mycroft precise trigger, which is open source. I've already played a bit with precise, but creating a good wake word takes a lot of training, and the process is not very user friendly.

We need at least one good Kalliope wake word, so we need to collect voice samples from the community. Training also needs a lot of data: for example, I downloaded 50 GB from the mozilla common voice datasets to train my wake word.

I'm going to push a community-based precise trigger after the next release of Kalliope.

khimaros commented 3 years ago

a few resources since i was just researching this:

you mention porcupine is not open source, do you mean the models?

Miloune commented 3 years ago

Maybe this one as well: https://github.com/alphacep/vosk-api ?

Sispheor commented 3 years ago

So, the best option is porcupine? I have some time in the coming days to work on Kalliope. I'll test some trigger engines myself.

corus87 commented 3 years ago

I took a look at vosk, mentioned by @Miloune. It looks promising, and at least in the German dataset the word "kalliope" is recognized. I haven't had much time lately, but a first try with a simple script for a single keyword worked pretty well.

Sispheor commented 3 years ago

There is also the Snips project implementation, which is used by Rhasspy.

Sispheor commented 3 years ago

So with porcupine we cannot generate models for the RPi, and generated models need to be recreated every 30 days, if I understand correctly. For me it looks like a no-go.

khimaros commented 3 years ago

if you are considering vosk, i'd recommend taking a look at mozilla deepspeech which i found to be both easier to configure and considerably more accurate. they also have examples for streaming transcription in python and bulk transcription in python

Sispheor commented 3 years ago

I thought they were STT engines rather than wake word engines.

khimaros commented 3 years ago

@Sispheor -- that's right, including vosk. i'm not sure if either is a good fit for a snowboy replacement, but if you're going to go that direction i think deepspeech is a better target from my limited experiments.

khimaros commented 3 years ago

snips looks promising but appears to have been acquired by Sonos. i couldn't find the platform source https://github.com/snipsco?q=platform&type=&language=&sort= and their usage docs have been removed. the closest available is https://github.com/snipsco/snips-record-personal-hotword

Sispheor commented 3 years ago

Another similar project has created a list of wake word engines here. They have created their own, called Raven. It is not yet a Python library, but it seems to be pure Python. I need to take a look. For me, the best candidate is mycroft-precise.

pepebc commented 3 years ago

I have tried them all and I prefer mycroft-precise, which has good accuracy. Raven is still a very young project, with many false positives. The vosk project mentioned above is also very interesting.

Sispheor commented 3 years ago

The main problem with Mycroft is that the only supported ARM processor is armv7l. I can't find a doc that maps all RPi models, but I think that covers the recent ones.

corus87 commented 3 years ago

Vosk is working pretty well so far, but it's slow: with a small dataset it takes about 1-1.5 seconds until the word is recognized. As an STT engine it would be a good offline alternative, but as a trigger, maybe not.

> if you are considering vosk, i'd recommend taking a look at mozilla deepspeech which i found to be both easier to configure and considerably more accurate. they also have examples for streaming transcription in python and bulk transcription in python

@khimaros I played around with deepspeech, but that was more than a year ago. I guess they have made some improvements since then, so I will take a fresh look. Vosk is very easy to set up: just pip install vosk and download the language model.
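A minimal sketch of how a single-keyword trigger could be built on top of that install. It assumes Vosk's Python bindings (`Model`, and `KaldiRecognizer` with a JSON grammar argument, `AcceptWaveform`, `Result`, `PartialResult`). The helper names `heard_keyword`, `spot_keyword` and `build_recognizer`, the model directory and the 16 kHz rate are illustrative assumptions, not the script used in this thread. As far as I know, restricting the grammar this way only works with the small models.

```python
import json
from typing import Iterable


def heard_keyword(result_json: str, keyword: str) -> bool:
    """Return True if a Vosk JSON result contains the keyword as a word."""
    result = json.loads(result_json)
    # Final results carry "text"; partial results carry "partial".
    text = result.get("text") or result.get("partial") or ""
    return keyword in text.split()


def spot_keyword(recognizer, chunks: Iterable[bytes], keyword: str) -> bool:
    """Feed PCM chunks to the recognizer until the keyword shows up."""
    for chunk in chunks:
        if recognizer.AcceptWaveform(chunk):
            # End of an utterance: inspect the final result.
            found = heard_keyword(recognizer.Result(), keyword)
        else:
            # Mid-utterance: inspect the partial hypothesis.
            found = heard_keyword(recognizer.PartialResult(), keyword)
        if found:
            return True
    return False


def build_recognizer(model_dir: str, keyword: str, rate: int = 16000):
    """Create a recognizer limited to the keyword (hypothetical helper)."""
    # Imported lazily so the helpers above work without vosk installed.
    from vosk import Model, KaldiRecognizer

    # "[unk]" absorbs everything that is not the keyword.
    grammar = json.dumps([keyword, "[unk]"])
    return KaldiRecognizer(Model(model_dir), rate, grammar)
```

With vosk installed and a model unpacked, `spot_keyword(build_recognizer("path/to/model", "kalliope"), chunks, "kalliope")` would return True once the word appears, where `chunks` yields 16-bit mono PCM buffers from whichever capture library is used.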

> The main problem with Mycroft is that the only supported ARM processor is armv7l. I can't find a doc that maps all RPi models, but I think that covers the recent ones.

@Sispheor It should support the RPi 2 and 3 and x86; at least those are the ones I tested. The RPi 4 should also be supported. Precise is pretty nice and works well, but only with a well-trained wake word, and that is very hard to train. I also read about raven (at least on the rhasspy page), and maybe it's worth a shot.

Hopefully a good alternative to snowboy will come along soon, or wake word training on precise will get much easier...

Edit: Not all x86 CPUs are supported by precise. AVX is required for the default tensorflow package, but most modern CPUs support AVX.
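The platform constraints mentioned in this comment (armv7l on ARM, AVX on x86) can be checked before installing anything. This is a minimal sketch for Linux, assuming `/proc/cpuinfo` is readable. The function names are made up for illustration, and the rule encoded is only what is claimed in this thread, not an official support matrix.

```python
import platform


def cpu_flags(cpuinfo_text: str) -> set:
    """Collect CPU feature flags from /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # x86 kernels label the line "flags"; ARM kernels use "Features".
        if key.strip().lower() in ("flags", "features"):
            flags.update(value.split())
    return flags


def precise_likely_supported(machine: str, cpuinfo_text: str) -> bool:
    """Apply the thread's rule of thumb: armv7l, or x86_64 with AVX."""
    if machine == "armv7l":
        return True
    if machine == "x86_64":
        return "avx" in cpu_flags(cpuinfo_text)
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            cpuinfo = handle.read()
    except OSError:
        # Not Linux, or the file is not readable: no flags are known.
        cpuinfo = ""
    print(precise_likely_supported(platform.machine(), cpuinfo))
```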

nshmyrev commented 3 years ago

> Vosk is working pretty well so far, but it's slow: with a small dataset it takes about 1-1.5 seconds until the word is recognized. As an STT engine it would be a good offline alternative, but as a trigger, maybe not.

@corus87 this might be an issue with the setup or an old version. Which model/language are you using here? We updated some of our models a couple of months ago: English, German. We have also updated the code to reduce response latency. Please recheck with the latest setup.

> Hopefully a good alternative to snowboy will come along soon, or wake word training on precise will get much easier...

There will be a keyword spotter in Vosk soon, in a couple of weeks. Stay tuned.

corus87 commented 3 years ago

@nshmyrev Thanks for your input! It's great news that you are working on a native keyword spotter. I'm looking forward to it!

I just started with Vosk earlier this week, so I'm using the latest pip version and the model you provide here. For my tests, I wrote a small script using pyaudio, but even with your Python example there is no difference, it takes about 1 second.

I'm running those tests on x86 with a Ryzen 4800H, so there should be enough CPU power.
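One part of such a delay that does not depend on CPU power is audio buffering: a recognizer cannot see a block of audio until the capture buffer has filled. A small sketch of that arithmetic follows. The function name and the buffer sizes are illustrative assumptions, not the values used in these tests, and this is not a diagnosis of the delay reported here.

```python
def buffer_latency_ms(frames_per_buffer: int, sample_rate: int) -> float:
    """Time in milliseconds needed to fill one capture buffer."""
    return 1000.0 * frames_per_buffer / sample_rate


if __name__ == "__main__":
    # Compare a few hypothetical buffer sizes at a 16 kHz sample rate.
    for frames in (1024, 4000, 8000):
        print(frames, buffer_latency_ms(frames, 16000))
```

At 16 kHz, a 4000-frame buffer adds 250 ms before the first sample of that block can be processed, so smaller buffers trade more frequent reads for a faster reaction.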

nshmyrev commented 3 years ago

> I'm using the latest pip version and the model you provide here.

Which model version exactly please? 0.15?

> For my tests, I wrote a small script using pyaudio, but even with your Python example there is no difference, it takes about 1 second.

We do not recommend pyaudio, precisely because of latency issues.

> there is no difference, it takes about 1 second.

Ok, let me check too

corus87 commented 3 years ago

> Which model version exactly please? 0.15?

I tried vosk-model-de-0.6, vosk-model-small-de-0.15 and vosk-model-small-en-us-0.15.

> Ok, let me check too

Thanks.

HumanG33k commented 3 years ago

Hi, I just tried to install by following the website instructions. The default settings file uses snowboy. Is there an alternative somewhere?

vkuehn commented 3 years ago

That has been closed for ages.

corus87 commented 2 years ago

The guys from rhasspy built a docker container to run a local web server where you can easily create your own pmdl wake word for snowboy.

https://github.com/rhasspy/snowboy-seasalt

Just install docker and run the command in the readme. It will download the container and start the web server. Then you can access the interface at http://localhost:8000 and create a wake word in 2 minutes.

R-Jurado commented 9 months ago

Sorry for reviving this old issue, but since I haven't found any recent related issue or article on the matter, may I suggest supporting TeachableMachine? It's open source and easy to use. I don't know if there is an official alternative to Snowboy; I haven't found any.