thisislvca opened this issue 1 month ago (status: Open)
Quick correction: the normalization on my end breaks the detection of every sample. The VAD does detect some speech, but not in more than one sample... I tried different mics and different loudness. I'm running on macOS 14.7 on an M1 MacBook.
Hey, I agree the VAD doesn't work perfectly. I'm not sure why; there must be some issue we haven't found yet. Loudness normalization isn't called automatically, so you can call it conditionally - it's up to you. The original Silero VAD implementation is at https://github.com/snakers4/silero-vad/tree/master/examples/rust-example Maybe we've missed something important from there.
Makes sense, agree! Thanks for the reply, will update this if I find anything new.
Hey! Playing around with the threshold helps. It'd be cool to have the option to edit the activation threshold however you like.
I also found the "Unknown" result pretty confusing - usually, you have the threshold, and if you want to have multiple thresholds you do it yourself...
Curious to hear your thoughts :)
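To make the threshold suggestion concrete, here is a rough sketch of what I mean. None of these names come from the library: `is_speech`, `classify`, and `VadLabel` are hypothetical, and the threshold values used with them are only illustrative. The idea is that a single caller-chosen threshold is the default, and anyone who wants an "Unknown" band builds it on top with two thresholds.

```rust
/// Hypothetical three-way label, mirroring the "Unknown" result discussed above.
/// This is NOT the library's type, just an illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
enum VadLabel {
    Speech,
    Silence,
    Unknown,
}

/// Single caller-chosen activation threshold: the result is always yes or no.
/// `prob` is assumed to be a speech probability in the range 0.0..=1.0.
fn is_speech(prob: f32, threshold: f32) -> bool {
    prob >= threshold
}

/// Optional two-threshold variant for callers who want an explicit "Unknown" band.
/// Probabilities in `low..high` are reported as `Unknown`.
fn classify(prob: f32, low: f32, high: f32) -> VadLabel {
    if prob >= high {
        VadLabel::Speech
    } else if prob < low {
        VadLabel::Silence
    } else {
        VadLabel::Unknown
    }
}
```

With something like this, `is_speech(prob, 0.5)` covers the common case, and `classify(prob, 0.35, 0.75)` is there only if you actually want the middle band.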
Hey man! Great job with the library, been super duper helpful.
I've been running some tests with live speech, which will be my use case, and I've seen that oftentimes when the audio gets normalized because it's too loud, for some reason the speech detection gets screwed up.
I think being able to choose to only make the sound louder when it's too quiet to be recognized would be a nice addition :)
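Something along these lines is what I have in mind: a minimal sketch, not the library's actual normalization. `rms`, `boost_if_quiet`, and the `target_rms` parameter are names I made up, and I'm assuming samples are `f32` in the range -1.0..=1.0. It only ever amplifies: audio that is already at or above the target level, and pure silence, is left untouched.

```rust
/// Root-mean-square level of a buffer; returns 0.0 for an empty buffer.
fn rms(samples: &[f32]) -> f32 {
    if samples.is_empty() {
        return 0.0;
    }
    let sum_sq: f32 = samples.iter().map(|s| s * s).sum();
    (sum_sq / samples.len() as f32).sqrt()
}

/// Boost-only gain (hypothetical helper): scales `samples` up toward
/// `target_rms` when they are quieter than that, and never attenuates.
/// Returns the gain that was applied (1.0 means the buffer was left as-is).
fn boost_if_quiet(samples: &mut [f32], target_rms: f32) -> f32 {
    let current = rms(samples);
    // Leave silence alone (avoids dividing by zero) and skip audio that is
    // already at or above the target level.
    if current <= f32::EPSILON || current >= target_rms {
        return 1.0;
    }
    let peak = samples.iter().fold(0.0f32, |m, s| m.max(s.abs()));
    // Cap the gain so the loudest sample stays within -1.0..=1.0,
    // and clamp at 1.0 so this function can never turn the volume down.
    let gain = (target_rms / current).min(1.0 / peak).max(1.0);
    for s in samples.iter_mut() {
        *s *= gain;
    }
    gain
}
```

Calling `boost_if_quiet(&mut chunk, 0.1)` before running the VAD would then raise quiet input without touching loud input, which is the case where detection seems to break for me.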