nyumaya / nyumaya_audio_recognition

Classify audio with neural nets on embedded systems like the Raspberry Pi
https://nyumaya.com
Apache License 2.0
82 stars, 14 forks

Respeaker far field hardware for RaspPi #10

Closed jonsmirl closed 5 years ago

jonsmirl commented 5 years ago

Seeed makes a cheap ($25) 4-channel far-field mic board for the Raspberry Pi; you can buy it from Amazon or direct. I have one and it works without issue. The four mics appear as a 4-channel ALSA device.

https://www.seeedstudio.com/ReSpeaker-4-Mic-Array-for-Raspberry-Pi-p-2941.html

A very useful addition would be instructions for replacing Snowboy with your keyword code in the ReSpeaker tutorial.

yodakohl commented 5 years ago

Thanks, I will try to do that.

jonsmirl commented 5 years ago

I have a PlayStation Eye on order. It is supposed to appear as a 4-channel USB microphone under Linux. AFAIK it is just a microphone; no processing is done locally. The idea is to get a mic array on my desktop for software development purposes. https://www.amazon.de/Playstation-PS3-eyetoy-Kamera-Gro%C3%9Fpackung/dp/B00LME2JGQ/ref=sr_1_1

yodakohl commented 5 years ago

"Snowboy with your keyword code in the ReSpeaker tutorial."

Could you give me a hint where the ReSpeaker tutorial is? All tutorial links on their page seem dead for me.

jonsmirl commented 5 years ago

http://wiki.seeedstudio.com/ReSpeaker_4_Mic_Array_for_Raspberry_Pi/

yodakohl commented 5 years ago

Thanks, their repository is interesting.

It seems they don't do any beamforming but simply run the keyword detection on each channel separately.

(voice_engine/route.py)

    for ch in range(src.channels):
        k = KWS(sensitivity=0.5)
        k.set_callback(gen(ch))
        kws.append(k)
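Running one detector per channel means each detector needs its own mono stream, so the interleaved 4-channel capture presumably has to be split first. A minimal sketch of that step, assuming native-endian signed 16-bit samples; `split_channels` is a hypothetical helper for illustration and is not taken from voice-engine:

```python
from array import array
from typing import List


def split_channels(interleaved: bytes, channels: int) -> List[bytes]:
    """Split interleaved 16-bit PCM into one byte string per channel.

    Assumes native-endian signed 16-bit samples laid out frame by frame:
    ch0, ch1, ..., chN-1, ch0, ch1, ...
    """
    if channels <= 0:
        raise ValueError("channels must be positive")
    samples = array("h", interleaved)  # 'h' = signed 16-bit integer
    if len(samples) % channels:
        raise ValueError("buffer does not hold a whole number of frames")
    # Every channels-th sample, starting at offset ch, belongs to channel ch.
    return [samples[ch::channels].tobytes() for ch in range(channels)]
```

Each returned byte string can then be handed to the detector for that channel.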

They feed the Snowboy detector 16-bit int bytes, which is already the matching format. To make it work with my recognition, the only real modification would be chunking the input data into blocks of the correct length. This can easily be done with my implementation in ringbuffer.py. Gonna try this soon.
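The chunking step can be sketched as follows. This is not the code in ringbuffer.py; `FrameChunker` and the frame length used in the example are assumptions made purely for illustration. It buffers incoming byte blocks of any size and hands back only complete fixed-length frames:

```python
class FrameChunker:
    """Collect arbitrary-sized byte blocks and emit fixed-size frames."""

    def __init__(self, frame_bytes: int) -> None:
        if frame_bytes <= 0:
            raise ValueError("frame_bytes must be positive")
        self.frame_bytes = frame_bytes
        self._pending = bytearray()

    def write(self, data: bytes) -> None:
        """Append a block of audio bytes of any length."""
        self._pending.extend(data)

    def frames(self):
        """Yield every complete frame currently buffered, oldest first."""
        while len(self._pending) >= self.frame_bytes:
            frame = bytes(self._pending[: self.frame_bytes])
            del self._pending[: self.frame_bytes]
            yield frame


# Example: 4-byte frames (two 16-bit samples), chosen only for the demo.
chunker = FrameChunker(frame_bytes=4)
chunker.write(b"abcdef")
first = list(chunker.frames())   # one full frame, 2 bytes stay buffered
chunker.write(b"gh")
second = list(chunker.frames())  # leftover bytes complete the next frame
```

Leftover bytes stay in the buffer until the next `write` completes a frame, so the detector always receives input of the expected length.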

jonsmirl commented 5 years ago

Their more complex code is in this repository: https://github.com/voice-engine/voice-engine

They do beamforming somewhere in their code. I will look around and see if I can spot where it is.