dscripka / openWakeWord

An open-source audio wake word (or phrase) detection framework with a focus on performance and simplicity.
Apache License 2.0

Is VAD really working? #146

Open dilerbatu opened 3 months ago

dilerbatu commented 3 months ago

Hi, I have audio data like this:

audio = np.frombuffer(audio_msg.data, dtype=np.int16)

self._oww = Model(
    wakeword_models=[self._OWW_MODEL_PATH],
    inference_framework=self._OWW_INFERENCE_FRAMEWORK,
    vad_threshold=1
)

And basically I run oww like this:

for mdl in self._oww.prediction_buffer.keys():
    scores = list(self._oww.prediction_buffer[mdl])
    curr_score = format(scores[-1], '.20f').replace("-", "")
    print("Current score:", curr_score, end='\r', flush=True)

    score = scores[-1]
    if score >= self._OWW_THRESHOLD:
        print('Detected wakeword by openwakeword with the score of', score)
        break

My chunk size is 512. How does this work? My VAD threshold is 1, so it should not produce any score at all, but when I say my wake word it is still recognized.

dscripka commented 3 months ago

It's difficult to determine what might be happening without seeing the full code, but you can start by inspecting the VAD object prediction buffer. This will show what the VAD model is producing, and then you can review the logic here and see if everything is functioning as expected.
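
As a rough illustration of that inspection, here is a minimal sketch that summarizes a buffer of VAD scores against a threshold and shows what a simple threshold gate would do to a wake word score. Everything in it is an assumption made for illustration: `summarize_vad` and `gate_score` are hypothetical helpers, not openWakeWord functions, the deque is a stand-in for the VAD object's prediction buffer (the exact attribute path on the model is not shown in this thread), and the gate is a generic one that may not match the library's actual logic.

```python
from collections import deque

import numpy as np


def summarize_vad(vad_scores, vad_threshold):
    """Summarize buffered VAD scores against a threshold (hypothetical helper)."""
    scores = np.asarray(list(vad_scores), dtype=float)
    if scores.size == 0:
        return {"frames": 0, "max": None, "frames_at_or_above": 0}
    return {
        "frames": int(scores.size),
        "max": float(scores.max()),
        "frames_at_or_above": int((scores >= vad_threshold).sum()),
    }


def gate_score(score, vad_scores, vad_threshold):
    """Generic gate (assumption): zero the score unless VAD reaches the threshold."""
    recent = list(vad_scores)
    vad_max = max(recent) if recent else 0.0
    return score if vad_max >= vad_threshold else 0.0


# Stand-in values; in a real session these would be read from the
# VAD object's prediction buffer after feeding audio to the model.
vad_buffer = deque([0.02, 0.91, 0.97, 0.88], maxlen=125)

summary = summarize_vad(vad_buffer, vad_threshold=1)
print("VAD summary:", summary)
print("Gated score:", gate_score(0.83, vad_buffer, vad_threshold=1))
```

If the summary of the real buffer shows no frames at or above the threshold while the wake word scores are still non-zero, the scores being read are probably not the gated ones, which would be the place to start comparing against the library's logic.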