dpirch / libfvad

Voice activity detection (VAD) library, based on WebRTC's VAD engine
BSD 3-Clause "New" or "Revised" License

I'm using unsigned char #5

Closed by SephVelut 6 years ago

SephVelut commented 6 years ago

My audio data is stored in unsigned char arrays, but fvad_process takes signed 16-bit samples (int16_t). Do I just need to convert from char to short? Would the conversion affect the correctness of the VAD?

dpirch commented 6 years ago

This depends on what format your audio data is in. The chars may not be individual samples but the raw bytes of larger samples, for example 16-bit samples where each sample takes up two bytes of the array.

If you know your array contains signed 16-bit samples with the correct endianness, then you can just cast the array to (int16_t*) and pass it to the library (in chunks with the required length, see comment in fvad.h). Otherwise you would have to calculate the individual sample values first.
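As a rough sketch of what "calculating the individual sample values" could look like, the C code below decodes a byte buffer into int16_t samples for two common layouts: little-endian signed 16-bit PCM and unsigned 8-bit PCM. Both helper names (bytes_s16le_to_s16 and bytes_u8_to_s16) are hypothetical and not part of libfvad, and which layout applies to your data is an assumption you would need to verify first.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of libfvad): decode little-endian signed
 * 16-bit PCM bytes into int16_t samples, independent of host endianness.
 * Returns the number of samples written (at most max_samples). */
size_t bytes_s16le_to_s16(const unsigned char *bytes, size_t nbytes,
                          int16_t *out, size_t max_samples)
{
    size_t n = nbytes / 2;              /* two bytes per sample */
    if (n > max_samples)
        n = max_samples;
    for (size_t i = 0; i < n; i++) {
        /* low byte first, then high byte shifted up by 8 bits */
        uint16_t u = (uint16_t)(bytes[2 * i] |
                                ((unsigned)bytes[2 * i + 1] << 8));
        /* map 0x8000..0xFFFF to the negative range explicitly */
        int32_t s = (u >= 0x8000u) ? (int32_t)u - 0x10000 : (int32_t)u;
        out[i] = (int16_t)s;
    }
    return n;
}

/* Hypothetical helper (not part of libfvad): convert unsigned 8-bit PCM
 * (silence at 128) to signed 16-bit samples by re-centering and scaling.
 * Returns the number of samples written (at most max_samples). */
size_t bytes_u8_to_s16(const unsigned char *bytes, size_t nbytes,
                       int16_t *out, size_t max_samples)
{
    size_t n = (nbytes < max_samples) ? nbytes : max_samples;
    for (size_t i = 0; i < n; i++)
        out[i] = (int16_t)(((int)bytes[i] - 128) * 256);
    return n;
}
```

The resulting int16_t buffer can then be passed to fvad_process one frame at a time, using a frame length that fvad.h lists as valid. Copying into a separate buffer also sidesteps any alignment concerns that a direct pointer cast of an unsigned char array might raise.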

To check if your data is signed 16-bit audio, you could write it to a file (with fwrite) and try playing it with ffplay (part of FFmpeg), for example

ffplay -f s16le -ar 8k -ac 1 yourfile.raw

(for an 8 kHz sample rate, and assuming your system is little-endian).
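For the fwrite step, a minimal sketch might look like the following. The helper name dump_raw is hypothetical, and the output file name is whatever you later hand to ffplay.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write a raw byte buffer to a file unchanged so it
 * can be inspected with an external player.
 * Returns 0 on success, -1 on any failure. */
int dump_raw(const char *path, const unsigned char *data, size_t nbytes)
{
    FILE *f = fopen(path, "wb");        /* binary mode: no newline translation */
    if (f == NULL)
        return -1;

    size_t written = fwrite(data, 1, nbytes, f);
    int close_err = fclose(f);          /* fclose can report deferred write errors */

    return (written == nbytes && close_err == 0) ? 0 : -1;
}
```

Calling dump_raw("yourfile.raw", buffer, buffer_size) on the unsigned char array produces a file that can be played with the ffplay command above. If the playback sounds like noise, the data is probably in a different sample format.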

SephVelut commented 6 years ago

Thank you. I'll close this as it's not an issue, and I may report back once I've got around to implementing the cast. Most likely I need to shift the individual bytes into the two bytes of an int16_t variable.