Ant-Brain / EfficientWord-Net

OneShot Learning-based hotword detection.
https://ant-brain.github.io/EfficientWord-Net/
Apache License 2.0

complex hotwords support / Current Model Limitations Discussion #18

Open amoazeni75 opened 2 years ago

amoazeni75 commented 2 years ago

Hi, thanks for your helpful research. I wonder whether the current model can handle complex hotwords like "Hey Siri", or only single words like "Siri"?

My second question is about hotwords whose pronunciation takes more than 1 s, like "Hey XXXX." Does your model support changing the recording time?

Did you try using cosine similarity instead of Euclidean distance at inference time?

Thanks.
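As background for the distance question above: when embeddings are L2-normalized (an assumption for this sketch; the thread does not say whether EfficientWord-Net normalizes its embeddings), squared Euclidean distance and cosine similarity are monotonically related, so thresholding on one is equivalent to thresholding on the other. A minimal NumPy illustration:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def euclidean_distance(a, b):
    """Euclidean (L2) distance between two vectors."""
    return float(np.linalg.norm(a - b))

# Two toy "embeddings", L2-normalized (128-dim chosen arbitrarily)
rng = np.random.default_rng(0)
a = rng.normal(size=128)
b = rng.normal(size=128)
a /= np.linalg.norm(a)
b /= np.linalg.norm(b)

# For unit vectors: ||a - b||^2 = 2 - 2 * cos(a, b)
lhs = euclidean_distance(a, b) ** 2
rhs = 2 - 2 * cosine_similarity(a, b)
print(abs(lhs - rhs) < 1e-9)  # the identity holds
```

This is why the choice between the two metrics matters less if the training pipeline normalizes embeddings; if it does not, the two can rank candidates differently.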

dominickchen commented 2 years ago

+1 We also want to use a custom hotword that is about 2 seconds long, but with the current `python -m eff_word_net.generate_reference` method the detection seems unreliable.

So we would like support for changing the recording time too!

TheSeriousProgrammer commented 2 years ago

Sorry for the delayed response. The model was trained on single words, but it should work on simple phrases like "Hey xxx". Moreover, the current model was trained on 1-second audio clips, so bizarre behaviour might occur when processing audio clips longer than 1 second. I pushed a commit https://github.com/Ant-Brain/EfficientWord-Net/commit/c9dee140c6cc44c2adf985f42519e382ee0d0eab explaining the same.

The model was trained using Euclidean distance, hence it uses the same metric at inference time too.
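To make the matching step concrete, here is a generic sketch of distance-threshold detection against a set of reference embeddings. This is not the library's actual code: the function name `detect`, the 0.85 threshold, and the toy embeddings are all made up for illustration.

```python
import numpy as np

def detect(window_embedding, reference_embeddings, threshold=0.85):
    """Return True if the incoming window embedding lies within
    `threshold` Euclidean distance of any reference embedding.
    The threshold value here is illustrative, not the library's default."""
    dists = np.linalg.norm(reference_embeddings - window_embedding, axis=1)
    return bool(dists.min() <= threshold)

# Toy example: 4 slightly perturbed reference embeddings, 16-dim
refs = np.stack([np.ones(16) + 0.01 * i for i in range(4)])

print(detect(np.ones(16), refs))   # close to a reference -> True
print(detect(np.zeros(16), refs))  # far from all references -> False
```

Under Euclidean training, a lower distance means a better match, so the detector fires when the minimum distance drops below the tuned threshold.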

Coming to increasing hotword length: hotwords are usually short. Maybe we can extend the processing window to 1.5 s, but I am not really sure about 2 s. Can you give a few examples where a hotword could be longer than 1.5 s?
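For context on why the fixed window matters: a model trained on 1 s clips at a fixed sample rate expects a fixed number of input samples, so longer audio has to be split into sliding windows. A quick sketch of that arithmetic (the 16 kHz rate and 0.5 s hop are assumptions for illustration, not values confirmed by the thread):

```python
SAMPLE_RATE = 16_000   # assumed sample rate (Hz)
WINDOW_SEC = 1.0       # the model's fixed input length
HOP_SEC = 0.5          # assumed hop between window starts

window = int(SAMPLE_RATE * WINDOW_SEC)  # 16000 samples per window
hop = int(SAMPLE_RATE * HOP_SEC)        # 8000 samples between window starts

def num_windows(total_samples):
    """How many full 1 s windows fit in a clip of `total_samples` samples."""
    if total_samples < window:
        return 0
    return 1 + (total_samples - window) // hop

# A 2 s phrase spans three overlapping 1 s windows at a 0.5 s hop,
# and no single window contains the whole phrase:
print(num_windows(2 * SAMPLE_RATE))  # 3
```

This is why a 2 s hotword behaves awkwardly with a 1 s model: every window sees only a fragment of the phrase, so no single embedding represents the full hotword.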

Kindly give your additional model suggestions on the discussions page https://github.com/Ant-Brain/EfficientWord-Net/discussions/3

Join the same channel and put forward your queries there. We are planning to create a faster, more performant version of the current implementation soon, so your suggestions will be helpful.

amoazeni75 commented 2 years ago

Thanks for the information. An example of a long wake word is "Hey MercedesBenz". Could you please provide the training steps?

TheSeriousProgrammer commented 2 years ago

Sorry for the delay; I didn't have time to clean the repository that holds the training code. It is built using Keras: https://github.com/Ant-Brain/wakeword_dataset_generator . It has both the training code and the dataset-generator code.

Durgesh92 commented 2 years ago

Hey, thanks for this repo.

I cannot find your training code here https://github.com/Ant-Brain/wakeword_dataset_generator . Is it available in any other repo?

TheSeriousProgrammer commented 2 years ago

Extremely sorry for the delay; my bad, I forgot to add the notebook that contains the training code: https://colab.research.google.com/drive/1hH6q3cGneIWxNRLwbVAKIBzHoVVFlEO3?usp=sharing

TheSeriousProgrammer commented 2 years ago

Currently working on a newer model with better performance and support for longer hotwords; it will be available in a month's time.

TheSeriousProgrammer commented 1 year ago

Update

A newer model with better resilience to noise and support for a 1.5 s window has been added to the flow. Kindly check it out!! (It has taken more than a month for the update XD)