wit-ai / wit-ios

Wit.ai iOS client
Other
222 stars 71 forks source link

Power of two #37

Closed donpark closed 9 years ago

donpark commented 9 years ago

I was looking at the VAD code and found this following code calculating samples_per_frame:

cvad_state->samples_per_frame = pow(ceil(log2(cvad_state->sample_freq/120)),2); //around 100 frames per second, but must be a power of two

Inline comment say value must be a power of two yet pow function is called with 2 as the exponent instead of base. If sample rate is 44100, samples_per_frame will be 81 using above code.

OTOH, following code which just reverses arguments to pow:

cvad_state->samples_per_frame = pow(2, ceil(log2(cvad_state->sample_freq/120)));

will set samples_per_frame to 512.

Does VAD code work at all?

Catacola commented 9 years ago

Hi donpark,

Thanks for catching this bug. The choice of power of two is for speed. If the sample size is not a power of two (or in many cases, a product of small primes), the FFT algorithm will just default to the standard, slow Fourier transform. It seems we got lucky with the sample rate of 16000 (the sample rate used in the SDK) which results in 64 samples per frame.

I've corrected the issue and will release the fix in the next patch.

Thanks again for the good eye.

donpark commented 9 years ago

NP. FYI there are two other calls to pow with 2 passed as exponent in the same file which I suspect may need to be addressed.

Catacola commented 9 years ago

Found them. Thanks!

donpark commented 9 years ago

Please note that the other two calls may be correct as is since they appear to involve 'sum of square'. I'm just mentioning them for review.