Let's say I'm providing a signal that is one second long at 44100 sampling frequency. Does vad expect a one-dimensional numpy array of 44100 entries? Also, what does it return? Should it return an array of 44100 booleans? Also, what's a typical value for threshold?
I'm using python sounddevice InputStream to receive audio.
What is the expected "shape" of the signal?
Let's say I'm providing a signal that is one second long at 44100 sampling frequency. Does vad expect a one-dimensional numpy array of 44100 entries? Also, what does it return? Should it return an array of 44100 booleans? Also, what's a typical value for threshold?
I'm using python sounddevice InputStream to receive audio.