Ant-Brain / EfficientWord-Net

OneShot Learning-based hotword detection.
https://ant-brain.github.io/EfficientWord-Net/
Apache License 2.0

Error While creating and using custom wake word #50

Open rpatapa opened 7 months ago

rpatapa commented 7 months ago

Hi,

I installed EfficientWord-Net in a Python 3.9 environment on a Windows laptop.

There were a few issues during installation, which I resolved as described below (for your information):

====================== Installation ======================

Installed in venv: (env_p39). This env was created with Python 3.9 using the command: conda create -n env_p39 python=3.9

To overcome the PyAudio issue when "pip install EfficientWord-Net" is issued, PyAudio was installed using the following commands:

import eff_word_net reported that tflite_runtime was missing. This was fixed by running the following command: python -m eff_word_net.engine

====================================================================================

After installation, the following default wakeword code worked without errors:

image

Next, I set out to create a custom wakeword, 'eye_square'.

Issue1:

I was able to create the reference JSON file when I used --model-type first_iteration_siamese:

image

The created reference JSON file is attached below for reference: eye_square_ref.json

But when I tried to use it in the code, I ran into an error saying the model is missing.

`import os
from eff_word_net.streams import SimpleMicStream
from eff_word_net.engine import HotwordDetector

from eff_word_net.audio_processing import First_Iteration_Siamese, ModelRawBackend, Resnet50_Arc_loss

from eff_word_net import samples_loc

base_model = baseModel()

mycroft_hw = HotwordDetector(
    hotword="eye_square",
    reference_file=os.path.join(samples_loc, "eye_square_ref.json"),
    threshold=0.7,
    relaxation_time=2
)

mic_stream = SimpleMicStream(
    window_length_secs=1.5,
    sliding_window_secs=0.75,
)

mic_stream.start_stream()

print("Say Mycroft ")
while True :
    frame = mic_stream.getFrame()
    result = mycroft_hw.scoreFrame(frame)
    if result==None :
        #no voice activity
        continue
    if(result["match"]):
        print("Wakeword uttered",result["confidence"])`
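For readers wondering what the two SimpleMicStream parameters above imply, here is a minimal sketch of overlapping windows. It assumes that sliding_window_secs is the hop between consecutive frames (which is how the parameter names read; the library's internals may differ) and a 16 kHz sample rate, which is not stated anywhere in this thread. The sliding_windows helper is hypothetical and is not part of eff_word_net.

```python
from typing import Iterator, Sequence

SAMPLE_RATE_HZ = 16_000  # assumption: the sample rate is not stated in this thread


def sliding_windows(
    samples: Sequence[float],
    window_length_secs: float = 1.5,
    sliding_window_secs: float = 0.75,
    sample_rate_hz: int = SAMPLE_RATE_HZ,
) -> Iterator[Sequence[float]]:
    """Yield fixed-length windows, advancing by the hop each time.

    Hypothetical helper for illustration only; not the library's code.
    """
    window = int(window_length_secs * sample_rate_hz)  # samples per frame
    hop = int(sliding_window_secs * sample_rate_hz)    # samples between frame starts
    for start in range(0, len(samples) - window + 1, hop):
        yield samples[start:start + window]


# Three seconds of silence stands in for microphone input.
audio = [0.0] * (3 * SAMPLE_RATE_HZ)
frames = list(sliding_windows(audio))

# Start time of each frame in seconds: consecutive frames overlap by half.
starts_secs = [i * 0.75 for i in range(len(frames))]
```

With a 1.5 s window and a 0.75 s hop, each frame shares half of its audio with the previous one, so a wakeword that straddles a frame boundary still falls fully inside some frame.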

The error I get is :

runfile('C:/Users/rpratapa/Documents/Code Base/SW/audio-similarity-main/audio_similarity/untitled0.py', wdir='C:/Users/rpratapa/Documents/Code Base/SW/audio-similarity-main/audio_similarity')
Traceback (most recent call last):

  File ~\anaconda3\envs\env_p39\lib\site-packages\spyder_kernels\py3compat.py:356 in compat_exec
    exec(code, globals, locals)

  File c:\users\rpratapa\documents\code base\sw\audio-similarity-main\audio_similarity\untitled0.py:18
    mycroft_hw = HotwordDetector(

TypeError: __init__() missing 1 required positional argument: 'model'

Issue 2:

When I try to create the custom wakeword reference JSON using --model-type resnet_50_arc, I get an AssertionError, as captured in the screenshot below:

image

Questions:

  1. Due to the above two issues, I am not able to create a custom wakeword and use it on my Windows laptop. I am hoping to get some help from you on both issues so that I can run the custom wakeword on my Windows laptop.
  2. I see that resnet_50_arc may require about 90 MB of RAM. Do you think I will be able to run these wakewords on a Raspberry Pi Zero? Alternatively, can we generate a custom wakeword using 'first_iteration_siamese' and run it on the Pi Zero, as this model apparently requires less RAM? Please clarify.

Thanks!!

Update:

I could resolve Issue1 with the following edits:

from eff_word_net.audio_processing import First_Iteration_Siamese, ModelRawBackend, Resnet50_Arc_loss

base_model = First_Iteration_Siamese()

print('cwd: ', os.getcwd())
mycroft_hw = HotwordDetector(
    hotword="eye-square",
    model=base_model,
    reference_file=os.path.join(samples_loc, "eye-square_ref.json"),
    threshold=0.7,
    relaxation_time=2
)

Issue2 & Question2 remain to be addressed.
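For anyone hitting the same TypeError, the mechanism behind Issue1 can be reproduced without installing anything. The sketch below uses a hypothetical StandInDetector class whose constructor requires a model argument; it is not the real HotwordDetector, and its parameter list is only assumed from the keyword arguments used in this thread.

```python
class StandInDetector:
    """Hypothetical stand-in, NOT the real eff_word_net.engine.HotwordDetector."""

    def __init__(self, hotword, model, reference_file, threshold, relaxation_time):
        self.hotword = hotword
        self.model = model
        self.reference_file = reference_file
        self.threshold = threshold
        self.relaxation_time = relaxation_time


# Same keyword arguments as the failing call in the report: no model given.
kwargs = dict(
    hotword="eye_square",
    reference_file="eye_square_ref.json",
    threshold=0.7,
    relaxation_time=2,
)

try:
    StandInDetector(**kwargs)
    error_message = None
except TypeError as exc:
    # Python 3.9 reports "__init__() missing 1 required positional argument: 'model'";
    # newer Python versions prefix the class name to the message.
    error_message = str(exc)

# Passing a model object (a placeholder here) satisfies the constructor.
detector = StandInDetector(model=object(), **kwargs)
```

The traceback points at the constructor call rather than at a missing file, which is why supplying a model instance, as in the update above, makes the error go away.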

TheSeriousProgrammer commented 7 months ago

There should not be many issues with using the 90 MB model on a Pi Zero standalone; however, using it alongside other code or another model could be problematic. The first_iteration_siamese model is lightweight, as you mentioned, but was not very well trained. If its performance is good enough for you, you can go ahead with it.
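As a rough illustration of the standalone versus shared trade-off, here is a back-of-the-envelope memory budget. The 512 MB figure is the Raspberry Pi Zero's RAM and the 90 MB figure is the approximate one quoted in this thread; the operating system and second-model numbers are made-up placeholders, not measurements.

```python
PI_ZERO_RAM_MB = 512     # Raspberry Pi Zero ships with 512 MB of RAM
RESNET_50_ARC_MB = 90    # approximate figure quoted in this thread

# Assumptions for illustration only; measure on the actual device.
OS_AND_PYTHON_MB = 150   # hypothetical OS + Python runtime footprint
SECOND_MODEL_MB = 200    # hypothetical additional model loaded alongside


def headroom_mb(total_mb: float, model_mb: float, other_mb: float = 0.0) -> float:
    """Return the RAM left after the wakeword model and other consumers."""
    return total_mb - model_mb - other_mb


# Wakeword model alone, ignoring everything else on the device.
standalone = headroom_mb(PI_ZERO_RAM_MB, RESNET_50_ARC_MB)

# Wakeword model plus the assumed OS footprint and a second model.
shared = headroom_mb(
    PI_ZERO_RAM_MB,
    RESNET_50_ARC_MB,
    other_mb=OS_AND_PYTHON_MB + SECOND_MODEL_MB,
)

print(f"standalone headroom: {standalone} MB, shared headroom: {shared} MB")
```

Under these assumed numbers the headroom shrinks from several hundred megabytes to well under a hundred once other consumers are counted, which matches the caution about running the larger model next to other code.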

I have not been updating or maintaining the repo regularly for the last few months, which is why these errors occur. My plan is to create a new iteration of models in different size variants, with better code to go with them. Stay tuned.

I will, however, attempt to replicate and fix these issues.

rpatapa commented 7 months ago

Thank you very much!!

If you could take a look at Issue2 and provide a way to use/test the resnet_50_arc model, I would like to do some benchmarking.

Looking forward to it!