Closed TheSingletonDev closed 1 year ago
If you want to detect silence, all you need to do is calculate the power of the incoming sound. The simplest way is to calculate the mean of the squares of some or all values in each sample, and then compare that to some threshold.
Practically speaking, you'll have to retrieve a sample, calculate its power, and then process it further if it's above the threshold.
stream.listen((data) {
  if (mean_squared(data) > threshold) {
    // sound: process the chunk further
  } else {
    // silence
  }
});
Thank you so much @anarchuser. Could I nudge you a little further and ask for help with the mean_squared method as well? I haven't worked with the mathematics of audio samples before. It would really be a great help.
data is a list of values. For every value in data, you want to create the square of that value (value * value). Then, for all squared values, calculate the average (sum all values up, then divide by the number of values).
Have a look at https://en.wikipedia.org/wiki/Mean_square for the math.
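The description above can be sketched in Dart roughly as follows. This is only a minimal illustration: the name `meanSquare` is chosen here, and the function assumes the values have already been decoded into signed sample values rather than raw bytes.

```dart
/// Mean of the squares of all values; returns 0.0 for an empty input.
double meanSquare(Iterable<num> values) {
  var sum = 0.0;
  var count = 0;
  for (final v in values) {
    sum += v * v; // accumulate the squares, do not overwrite
    count++;
  }
  return count == 0 ? 0.0 : sum / count;
}
```

The result is in squared sample units, so the threshold it is compared against has to be chosen on the same scale.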
Thank you, mate. One more clarification: this will continuously classify the data as either speech or silence. How can I detect that the speaker has been silent for 0.5 s or 1 s, so that I can take action accordingly?
Please refer to #41 for anything related to human speech.
I tried this, but it turns out that when I am not speaking, two values are generated: one is 0.0 and the other is 18.14... I am guessing that since the raw data has a full cycle, the mean squared is either 0 or 18, and any speech falls somewhere in the 17 to 17.5 range.
Just my observation.
The code I did:
double meanSquare(Uint8List value) {
  var sqrdValue = 0;
  for (int indValue in value) {
    sqrdValue = indValue * indValue;
  }
  return sqrdValue / value.length;
}
and its usage
MicStream.microphone(
        audioSource: AudioSource.DEFAULT,
        sampleRate: 44100,
        channelConfig: ChannelConfig.CHANNEL_IN_MONO,
        audioFormat: AudioFormat.ENCODING_PCM_16BIT)
    .then((stream) {
  _listerner = stream?.listen((value) {
    double meanSquared = meanSquare(value);
    print('Mean Squared: $meanSquared');
    if (meanSquared >= 1) {
      _socketConnect.socketEmit(.....);
    } else {
      // silence
    }
But since the if/else runs on every chunk the stream delivers, it's not very effective. My else branch (silence) should run only after more than 1 second of continuous silence. I am not able to achieve that.
P.S. I already read through #41 but it was of no use. This mean squared is much better.
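One possible way to act only after a sustained stretch of silence is to count how many consecutive samples fell below the threshold and convert that count to time using the sample rate. The sketch below is an illustration, not part of mic_stream: `SilenceTracker` and all of its members are hypothetical names, and it assumes you already compute some per-chunk level (for example a mean square) and know how many samples each chunk holds.

```dart
/// Tracks how long the input has stayed at or below a threshold.
class SilenceTracker {
  SilenceTracker({
    required this.sampleRate,
    required this.threshold,
    required this.minSilence,
  });

  final int sampleRate; // samples per second, e.g. 44100
  final double threshold; // levels at or below this count as silence
  final Duration minSilence; // e.g. Duration(seconds: 1)

  int _silentSamples = 0;
  bool _reported = false;

  /// Feed the level and sample count of one chunk.
  /// Returns true once per silent stretch, when it first reaches minSilence.
  bool add(double level, int sampleCount) {
    if (level > threshold) {
      // Sound: reset the running silence counter.
      _silentSamples = 0;
      _reported = false;
      return false;
    }
    _silentSamples += sampleCount;
    final silentMicros =
        _silentSamples * Duration.microsecondsPerSecond ~/ sampleRate;
    if (!_reported && silentMicros >= minSilence.inMicroseconds) {
      _reported = true;
      return true;
    }
    return false;
  }
}
```

Inside the listen callback you would call `add` with the chunk's level and sample count, and run the silence action only when it returns true. Counting samples rather than reading a wall clock keeps the result independent of how the chunks happen to be delivered.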
This mean square method is not working. I don't think it is conceptually correct. Even with speech, the output is often 0 or 18.
We might be missing something here. Calculating dB or amplitude should be a good way to do it. @anarchuser, could you help with amplitude or dB code? #41 also has just the concept, not the actual code.
My bad, you need to normalise the samples first, e.g., shift from 0..255 to -128..127.
So should I use a simple subtraction, i.e. subtract 128 from every value and build a new list, or is there a specific method to convert int16 data to values in the -128 to 127 range?
Okay, so I have changed this line
double meanSquared = meanSquare(value);
to
double meanSquared = meanSquare(value.buffer.asInt8List());
and the meanSquare method will now accept Int8List instead of Uint8List.
If you use 16-bit PCM, then just subtract 32768 from every value.
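Two details of the attempt posted earlier in this thread may be worth double-checking. Its loop assigns the square (`=`) instead of accumulating it (`+=`), so only the last value contributes. It also treats each byte as one sample, while `ENCODING_PCM_16BIT` samples span two bytes each. The sketch below assumes the stream delivers little-endian signed 16-bit PCM, which is worth verifying on your platform. In that case the bytes can be read directly as signed values and no subtraction is needed; the subtraction would apply to unsigned data. `decodePcm16` and `levelDbfs` are hypothetical helper names, not part of mic_stream.

```dart
import 'dart:math' as math;
import 'dart:typed_data';

/// Decodes little-endian signed 16-bit PCM bytes into sample values
/// in the range -32768..32767. A trailing odd byte is ignored.
List<int> decodePcm16(Uint8List bytes) {
  final view = ByteData.sublistView(bytes);
  final sampleCount = bytes.lengthInBytes ~/ 2;
  return List<int>.generate(
      sampleCount, (i) => view.getInt16(i * 2, Endian.little));
}

/// Mean-square level in dB relative to 16-bit full scale (32768).
/// Returns negative infinity for empty or all-zero input.
double levelDbfs(List<int> samples) {
  if (samples.isEmpty) return double.negativeInfinity;
  var sum = 0.0;
  for (final s in samples) {
    sum += s * s;
  }
  final meanSquare = sum / samples.length;
  if (meanSquare == 0) return double.negativeInfinity;
  // 10 * log10(power ratio); math.log is the natural logarithm.
  return 10 * math.log(meanSquare / (32768.0 * 32768.0)) / math.ln10;
}
```

With this scale, quiet input gives strongly negative values and louder input approaches 0, so a silence threshold becomes a negative dB figure that you would tune by logging the level in your own environment.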
Hi Dev, we are using your library extensively for retrieving streaming audio data. We would appreciate it if, along with the stream data, we could also obtain a dB value that we can use for silence detection, e.g. stream = await MicStream.microphone(....)
if we can do
Basically, any way we can do silence detection.