Open mike-2020 opened 2 months ago
Your understanding is correct, but keep the accuracy of the VAD in mind: it cannot be 100% accurate. You should look at the status of the previous frames to decide whether the result for a given frame should be ignored.
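One way to take previous frames into account is a small debounce on top of the per-frame result. The sketch below is only an illustration: `vad_smoother_t`, `vad_smoother_init`, `vad_smoother_update`, and the `start_frames` / `hang_frames` thresholds are hypothetical names and are not part of the VAD library. It assumes you convert each per-frame result to a boolean yourself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not a library API): smooths raw per-frame VAD results. */
typedef struct {
    int start_frames; /* consecutive speech frames required to start recording */
    int hang_frames;  /* consecutive non-speech frames required to stop */
    int speech_run;   /* current run of speech frames */
    int silence_run;  /* current run of non-speech frames */
    bool recording;   /* smoothed decision */
} vad_smoother_t;

static void vad_smoother_init(vad_smoother_t *s, int start_frames, int hang_frames)
{
    assert(s != NULL && start_frames > 0 && hang_frames > 0);
    s->start_frames = start_frames;
    s->hang_frames = hang_frames;
    s->speech_run = 0;
    s->silence_run = 0;
    s->recording = false;
}

/* Feed one raw per-frame decision; returns true while recording should be active. */
static bool vad_smoother_update(vad_smoother_t *s, bool frame_is_speech)
{
    assert(s != NULL);
    if (frame_is_speech) {
        s->speech_run++;
        s->silence_run = 0;
        /* Ignore isolated speech frames: start only after a long enough run. */
        if (!s->recording && s->speech_run >= s->start_frames) {
            s->recording = true;
        }
    } else {
        s->silence_run++;
        s->speech_run = 0;
        /* Ignore short gaps: stop only after a long enough run of silence. */
        if (s->recording && s->silence_run >= s->hang_frames) {
            s->recording = false;
        }
    }
    return s->recording;
}
```

In the audio loop you would pass `state == VAD_SPEECH` as `frame_is_speech`, where `state` is whatever `vad_process` returned for that frame. Because recording starts only after `start_frames` speech frames, you would also need to keep at least that many recent frames in a buffer and write them out when recording begins, otherwise the onset of the speech is lost.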
Hello,
I understand that this function is used to detect speech in received audio. But when it returns VAD_SPEECH, does that mean the current frame (the data passed in the current call to this function) contains speech? Or does it mean the current frame together with a number of previous frames contains speech?
I'd like to record speech only, so I want to make sure that when vad_process returns VAD_SPEECH, it is the right time to start recording and that I will not miss any speech audio.