Closed chatziko closed 1 year ago
Hi.
It would be great to get local wake word detection somehow, both for speed and to minimize the packets going over the network. I think we can have both options: wyoming, and using a wake word detector engine directly. The second would be good for low-end machines that can still run a wake word detector without problems, but maybe that is not related to this PR.
I doubt running a wyoming server on the same machine makes any performance difference wrt an embedded detector, it's a very thin wrapper. But I haven't tested it, maybe it does. In fact, on a very low end machine the ability to run a lighter detector (eg porcupine) might be more important.
Of course having an embedded detector would make it easier to use (only one thing to install).
Yes, if we have a detector that we can run and that is written in a more efficient language than Python, it would be better for very weak devices, which could still be used as a satellite, than running one more Python interpreter.
Thanks!
This PR implements wake word detection by forwarding audio to a wyoming server, with the obvious use case of running it locally on the satellite machine.
The alternative "local" approach would be to embed the detector (and of course we can implement both), but I think wyoming has several advantages:
Technical notes:
This PR builds on top of #29 (to keep conflicts to a minimum). But I could rebase if we want to merge it alone.
I found that the mic processing (not recording) code (webrtc, vad, wav writer, etc) was a bit hard to follow, because each step was interfering with the others (eg wav writing had to take into account the webrtc buffer).
So I rewrote the mic processing logic using a "pipe"-like approach, with clearly independent functions that receive a chunk stream as input and produce a chunk stream as output (with the possibility to modify chunks, buffer them, etc). I put it in a separate `mic_process.py` file, I think you'll like the logic, it's quite clean.
After this restructuring, adding the wake word logic was quite easy (it's the second commit). The main challenge was to schedule the async wyoming code in the main thread, and have it controlled by sync functions in the mic thread.
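To illustrate the "pipe" idea, here is a minimal sketch using Python generators. It is not the actual code in mic_process.py: the stage names (`rechunk`, `drop_silence`, `tee_to`, `pipeline`) are hypothetical, and the silence filter and the list sink are simple stand-ins for the real vad and wav writer steps.

```python
from typing import Callable, Iterable, Iterator, List

Chunk = bytes
# A stage takes a chunk stream and returns a chunk stream.
Stage = Callable[[Iterator[Chunk]], Iterator[Chunk]]


def rechunk(size: int) -> Stage:
    """Buffer incoming chunks and re-emit them in pieces of `size` bytes."""
    def stage(chunks: Iterator[Chunk]) -> Iterator[Chunk]:
        buf = b""
        for chunk in chunks:
            buf += chunk
            while len(buf) >= size:
                yield buf[:size]
                buf = buf[size:]
        if buf:
            yield buf  # flush whatever is left when the input ends
    return stage


def drop_silence(chunks: Iterator[Chunk]) -> Iterator[Chunk]:
    """Stand-in for a vad stage: drop chunks that contain only zero bytes."""
    for chunk in chunks:
        if any(chunk):
            yield chunk


def tee_to(sink: List[Chunk]) -> Stage:
    """Stand-in for a wav writer: record each chunk and pass it on unchanged."""
    def stage(chunks: Iterator[Chunk]) -> Iterator[Chunk]:
        for chunk in chunks:
            sink.append(chunk)
            yield chunk
    return stage


def pipeline(source: Iterable[Chunk], *stages: Stage) -> Iterator[Chunk]:
    """Chain the stages so each one consumes the output of the previous one."""
    stream = iter(source)
    for stage in stages:
        stream = stage(stream)
    return stream
```

Each stage only sees the stream produced by the previous one, so a stage that buffers (like `rechunk`) does not leak its buffer into the others. For example, `pipeline(mic_chunks, rechunk(2), drop_silence, tee_to(recorded))` re-chunks first, filters second, and records only what survives the filter.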
I'd be happy to discuss alternative solutions, of course.
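For reference, the scheduling challenge mentioned above (async code on the main thread, driven by sync calls from the mic thread) can be sketched with `asyncio.run_coroutine_threadsafe`. This is only an illustration under stated assumptions, not necessarily how the PR does it: `WakeWordBridge`, `process_chunk` and `mic_thread` are hypothetical names, and `_send` is a placeholder for the real async call that forwards audio to the wyoming server.

```python
import asyncio
import threading
from typing import List


class WakeWordBridge:
    """Hypothetical sync facade over async detection code running on a loop."""

    def __init__(self, loop: asyncio.AbstractEventLoop) -> None:
        self._loop = loop
        self.received: List[bytes] = []

    async def _send(self, chunk: bytes) -> bool:
        # Placeholder for the async client call; here a chunk equal to
        # b"wake" simply counts as a detection.
        await asyncio.sleep(0)
        self.received.append(chunk)
        return chunk == b"wake"

    def process_chunk(self, chunk: bytes, timeout: float = 1.0) -> bool:
        """Sync entry point for the mic thread.

        Schedules the coroutine on the main loop and blocks the calling
        thread (not the loop) until the result is available.
        """
        future = asyncio.run_coroutine_threadsafe(self._send(chunk), self._loop)
        return future.result(timeout)


def mic_thread(bridge: WakeWordBridge, chunks: List[bytes], results: List[bool]) -> None:
    """Simulated mic thread: feeds chunks through the sync entry point."""
    for chunk in chunks:
        results.append(bridge.process_chunk(chunk))


async def main(chunks: List[bytes]) -> List[bool]:
    loop = asyncio.get_running_loop()
    bridge = WakeWordBridge(loop)
    results: List[bool] = []
    worker = threading.Thread(target=mic_thread, args=(bridge, chunks, results))
    worker.start()
    # Join the worker in an executor so the loop stays free to run _send.
    await loop.run_in_executor(None, worker.join)
    return results
```

Running `asyncio.run(main([b"noise", b"wake"]))` drives two chunks through the bridge. Blocking the mic thread on every chunk is the simplest option for a sketch; a real implementation might prefer to fire and forget the audio and only wait for detection events.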