Ant-Brain / EfficientWord-Net

OneShot Learning-based hotword detection.
https://ant-brain.github.io/EfficientWord-Net/
Apache License 2.0

Hotword detection triggers the moment any sound is played, even with the default models #7

Closed TrackLab closed 2 years ago

TrackLab commented 2 years ago

So I've been trying to make a custom hotword. But after seeing it trigger all the time, the moment any kind of sound is being recorded, I decided to use a default one, like "brightness", "mobile", "google", etc.

They all trigger immediately. I'm using the default values for the HotWordDetector, by the way. Any clue why? It seemed to work great in your video presentation.

Not using a cheap ass microphone by the way.

TheSeriousProgrammer commented 2 years ago

Are you using a custom threshold? Can you share the following information: the basic inference code you use, and a .zip containing your xyz_ref.json file and the sample audio files used to generate the reference?

TrackLab commented 2 years ago

I already said in my issue post that I am using the very default project. Default values, default threshold, default models included in the project. Nothing about this is custom.

TheSeriousProgrammer commented 2 years ago

Can you send a video recording of the CLI output when you run "python3 -m eff_word_net.engine", where you try pronouncing google, alexa, etc.?

aman-17 commented 2 years ago

Hey there, can you increase the threshold value and try it? A few hotwords need higher threshold values to suppress false positives.

TrackLab commented 2 years ago

I already did. Even with 0.99 it produces a lot of false positives. With a custom model built using the IBM sound generation, it false-triggers even more often. Also worth noting: when it triggers, it prints "DETECTED!" several times in a row, as if it triggers 5 times per single trigger.

aman-17 commented 2 years ago

Thanks for the reply, we will look into it and update you shortly.

TheSeriousProgrammer commented 2 years ago

The repeated "DETECTED!" output is actually expected: the model works over sliding windows of the audio stream, so for every utterance there will be 4 to 5 triggers. This can be temporarily mitigated by treating the triggers as square waves. I was a bit lazy about adding it to the library. In 2-3 days I'll make a patch and update.
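For illustration, here is a minimal sketch of what "treating the triggers as square waves" could look like. It is not the library's code: the function name `rising_edges`, the sample scores, and the 0.85 threshold are all hypothetical. The idea is that consecutive above-threshold windows form one "high" pulse, and only the low-to-high transition counts as a detection.

```python
from typing import Iterable, List


def rising_edges(scores: Iterable[float], threshold: float) -> List[int]:
    """Return the indices of windows where the score first crosses the threshold.

    A run of consecutive above-threshold windows is one "high" pulse of a
    square wave; only its rising edge is reported.
    """
    edges = []
    was_high = False
    for i, score in enumerate(scores):
        is_high = score >= threshold
        if is_high and not was_high:
            edges.append(i)  # low -> high transition: one detection
        was_high = is_high
    return edges


# Hypothetical per-window confidences covering two utterances.
scores = [0.1, 0.9, 0.95, 0.92, 0.91, 0.2, 0.1, 0.88, 0.9, 0.3]
print(rising_edges(scores, threshold=0.85))  # -> [1, 7]
```

With this, the four high windows of the first utterance yield a single detection instead of four.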

TheSeriousProgrammer commented 2 years ago

Made some major changes to the HotwordDetector instance to make sure that there is only one trigger per utterance. I made a few assumptions, such as that no utterance will be longer than a second and that there will be at least an 800 ms window between 2 utterances. Update your Python package and try running the demo code. Let me know if it works well.
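The one-trigger-per-utterance idea can be sketched roughly as follows. This is only an assumption about the general approach, not the actual patch: the class `TriggerDebouncer`, its parameters, and the timestamps are hypothetical, and only the 800 ms gap comes from the comment above. A detection fires only when at least 800 ms have passed since the last above-threshold window.

```python
from typing import Optional


class TriggerDebouncer:
    """Collapse the repeated window-level triggers of one utterance into one event."""

    def __init__(self, threshold: float, min_gap_s: float = 0.8) -> None:
        self.threshold = threshold
        self.min_gap_s = min_gap_s  # assumed minimum silence between utterances
        self._last_high_s: Optional[float] = None  # time of last above-threshold window

    def update(self, score: float, now_s: float) -> bool:
        """Return True only for the first above-threshold window of an utterance."""
        if score < self.threshold:
            return False
        fire = (
            self._last_high_s is None
            or (now_s - self._last_high_s) >= self.min_gap_s
        )
        self._last_high_s = now_s
        return fire


# Hypothetical stream: (timestamp in seconds, confidence) for each window.
stream = [(0.0, 0.9), (0.25, 0.9), (0.5, 0.9), (0.75, 0.9),
          (1.0, 0.1), (1.25, 0.1), (1.5, 0.1), (1.75, 0.9), (2.0, 0.9)]
debouncer = TriggerDebouncer(threshold=0.85)
fired = [t for t, s in stream if debouncer.update(s, t)]
print(fired)  # -> [0.0, 1.75]
```

Measuring the gap from the most recent high window, rather than from the last fired event, keeps one long utterance from firing again partway through.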

Happy New Year btw!!!

TheSeriousProgrammer commented 2 years ago

Due to inactivity, and since the patch has been pushed, I am closing this issue. Feel free to reopen it, @TrackLab.