Ant-Brain / EfficientWord-Net

OneShot Learning-based hotword detection.
https://ant-brain.github.io/EfficientWord-Net/
Apache License 2.0

Works fine with the reference file but not with a custom recorded file. #12

Closed warichet closed 2 years ago

warichet commented 2 years ago

Hello, I am working on Google Colab, so I don't have access to a microphone. The workaround is to use an mp3 or wav file. To do that, I added this class:

from streams import CustomAudioStream
from pydub import AudioSegment

import numpy as np
import wave

RATE = 16000  # sample rate (Hz) assumed throughout this class


class SimpleFileStream(CustomAudioStream):
    """
    Implements a file-backed stream with a sliding window,
    implemented by inheriting CustomAudioStream
    """

    def __init__(self, sliding_window_secs: float = 1/8):
        self.CHUNK = int(sliding_window_secs * RATE)  # samples per frame
        self.index = 0  # number of samples requested so far

        CustomAudioStream.__init__(
            self,
            open_stream=self.open_stream,
            close_stream=self.close_stream,
            get_next_frame=self.get_next_frame,
        )

    def open_stream(self, src, mp3):
        if mp3:
            dst = "Data/sample.wav"
            # convert mp3 to a wav file resampled to RATE
            sound = AudioSegment.from_mp3(src).set_frame_rate(RATE)
            sound.export(dst, format="wav")
            self.wf = wave.open(dst, 'rb')
        else:
            print("Not an mp3")
            # wav input is read as-is; it is not resampled here
            self.wf = wave.open(src, 'rb')
            self.wf.rewind()
        print("Get params of wav file " + str(self.wf.getparams()))

    def close_stream(self):
        self.wf.close()

    def get_next_frame(self):
        print("Index ", self.index)
        self.index = self.index + self.CHUNK
        # interpret the raw bytes as 16-bit samples
        return np.frombuffer(self.wf.readframes(self.CHUNK), dtype=np.int16)

It seems to work if I use the reference file from GitHub. But if I record a custom file using Audacity, the wake word is not detected.
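One thing that may be worth checking in a case like this is the format of the recorded file, because the class above reads wav input as-is with RATE = 16000 and dtype=np.int16, and only the mp3 branch resamples. The sketch below lists mismatches between a wav file and that format. The helper name wav_format_problems and the mono expectation are assumptions made for illustration, not something stated in this thread or taken from the library.

```python
import wave

EXPECTED_RATE = 16000   # matches RATE in the class above
EXPECTED_CHANNELS = 1   # assumption: the detector expects mono audio
EXPECTED_WIDTH = 2      # bytes per sample, matches dtype=np.int16


def wav_format_problems(path):
    """Return a list of human-readable format mismatches for a wav file.

    Hypothetical helper for illustration; an empty list means the file
    matches the format the stream class above assumes.
    """
    problems = []
    with wave.open(path, "rb") as wf:
        if wf.getframerate() != EXPECTED_RATE:
            problems.append(
                f"sample rate is {wf.getframerate()} Hz, expected {EXPECTED_RATE} Hz"
            )
        if wf.getnchannels() != EXPECTED_CHANNELS:
            problems.append(
                f"{wf.getnchannels()} channels, expected {EXPECTED_CHANNELS}"
            )
        if wf.getsampwidth() != EXPECTED_WIDTH:
            problems.append(
                f"{wf.getsampwidth()} bytes per sample, expected {EXPECTED_WIDTH}"
            )
    return problems

# Example use (the path is a placeholder):
#   for problem in wav_format_problems("my_recording.wav"):
#       print(problem)
```

If the list is not empty, re-exporting the recording as 16 kHz, mono, 16-bit PCM before feeding it to the stream would rule this out as a cause.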

If I change the threshold to 0.7 and the activation count to 2, it works better, but it will increase the chance of getting false positives.
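To make the trade-off concrete, here is a minimal sketch of how a threshold and an activation count can interact. It assumes that a detection fires once a given number of consecutive frames score at or above the threshold; that reading of "activation count" is an assumption for illustration and may not match the library's internals exactly. The function name detections and the score values are made up.

```python
def detections(scores, threshold, activation_count):
    """Return the frame indices at which a detection fires.

    Assumption: a detection fires once `activation_count` consecutive
    frames score at or above `threshold`; the streak then resets.
    """
    fired = []
    streak = 0
    for i, score in enumerate(scores):
        if score >= threshold:
            streak += 1
            if streak == activation_count:
                fired.append(i)
                streak = 0
        else:
            streak = 0
    return fired


# Made-up per-frame similarity scores: a near-miss around frames 1-2,
# then a strong match around frames 4-6.
scores = [0.40, 0.72, 0.75, 0.60, 0.96, 0.97, 0.98, 0.50]

print(detections(scores, threshold=0.7, activation_count=2))   # [2, 5]
print(detections(scores, threshold=0.95, activation_count=3))  # [6]
```

With the looser settings the near-miss at frames 1-2 also fires, which is the kind of false positive described above; the stricter settings only fire on the strong match.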

Is it mandatory to have a custom reference file for each user?

Best regards, Sebastien

TheSeriousProgrammer commented 2 years ago

It is not required to have a custom reference file for each user.

When generating the reference file, instead of simply increasing the sample count, one could increase the variety of the sample audio and then set the threshold to around 0.95 (95%) or higher.

This will greatly reduce the chance of false positives.

TheSeriousProgrammer commented 2 years ago

Closing the issue due to inactivity. Feel free to reopen it if required.