mwv / vad

Voice Activity Detector
Other
72 stars 13 forks source link

Signal shape, return value, documentation. #1

Closed donaldr closed 8 years ago

donaldr commented 8 years ago

What is the expected "shape" of the signal?

Let's say I'm providing a signal that is one second long at 44100 sampling frequency. Does vad expect a one-dimensional numpy array of 44100 entries? Also, what does it return? Should it return an array of 44100 booleans? Also, what's a typical value for threshold?

I'm using python sounddevice InputStream to receive audio.

donaldr commented 8 years ago

I figured it out. It's not the frames that the returned boolean array returns, it's the windows.