kkumatani / distant_speech_recognition

spatial signal processing toolkit, a.k.a. beamforming toolkit 2.0 (BTK2.0)
MIT License

How to use this project? #1

Open zuowanbushiwo opened 5 years ago

zuowanbushiwo commented 5 years ago

Hi, I am very interested in this project because I recently built a smart speaker using Amazon AVS. I also have a hardware kit, the Amlogic A113X1 Far-Field Dev Kit for Amazon AVS (details: http://openlinux2.amlogic.com/download/doc/A113X1_Usermanual.pdf), which has six microphones. How should these signal processing modules (speaker tracking, beamforming, post-filtering, speech enhancement, dereverberation, and echo cancellation) be connected to form a complete processing flow suitable for a hands-free smart speaker with a microphone array? Could you provide a C++ demo? Also, can I use the Millennium ASR for wake word detection? Thanks!

kkumatani commented 5 years ago

Hi zuowanbushiwo,

The C++ demo has not been pushed yet, but implementing a demo program in C++ is straightforward, and I will provide one. Would you be interested in doing some quick research in Python?
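For a feel of how such stages chain together, here is a minimal pure-Python sketch. It is not BTK2.0 code and does not use its API. The function names (`delay_and_sum`, `run_pipeline`), the stage order (multichannel stages, then beamforming, then single-channel stages), and the integer-sample steering delays are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence

Channels = List[List[float]]


def delay_and_sum(channels: Sequence[Sequence[float]],
                  delays: Sequence[int]) -> List[float]:
    """Advance each channel by its integer sample delay, then average.

    `delays[i]` is the (assumed known) arrival delay of the target
    source at microphone i, in samples.
    """
    if len(channels) != len(delays):
        raise ValueError("need exactly one delay per channel")
    n = len(channels[0])
    out = [0.0] * n
    for ch, d in zip(channels, delays):
        for t in range(n):
            src = t + d  # read ahead by d samples to undo the arrival delay
            if 0 <= src < len(ch):
                out[t] += ch[src]
    return [v / len(channels) for v in out]


def run_pipeline(channels: Channels,
                 delays: Sequence[int],
                 pre_stages: Sequence[Callable[[Channels], Channels]] = (),
                 post_stages: Sequence[Callable[[List[float]], List[float]]] = ()
                 ) -> List[float]:
    """Hypothetical flow: multichannel stages -> beamformer -> mono stages.

    pre_stages could hold echo cancellation or dereverberation;
    post_stages could hold a post-filter. Both are placeholders here.
    """
    for stage in pre_stages:
        channels = stage(channels)
    mono = delay_and_sum(channels, delays)
    for stage in post_stages:
        mono = stage(mono)
    return mono


# Two microphones; the impulse reaches the second one a sample later.
mics = [[0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0]]
enhanced = run_pipeline(mics, delays=[0, 1])
```

In a real system the delays would come from a speaker tracker and would usually be fractional, which is typically handled in the subband or frequency domain rather than by shifting samples as done here.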

About wake word detection: Millennium ASR is not a good starting point because it does not have neural networks, so a lot of work would be needed. Are you looking for a low-power, low-complexity keyword spotter?

zuowanbushiwo commented 5 years ago

@kkumatani Thank you very much, that's very kind of you. Sure, I will do research in Python on Ubuntu to understand this project. I have looked at BTK1.0 and followed the Beamforming1.0 toolkit User Guide, but running Script_CMU-GHC2109 and Scripts_DS+SD+MVDR may require some data and config files. I am looking for a low-power, low-complexity keyword spotter. How about https://github.com/Picovoice/Porcupine? Best regards

zhljjj commented 3 months ago

I am very interested in this project, but I don't understand how to use it for dereverberation. For example, if I have a .wav file, how do I process it? I mean in C++.