voixen / voixen-vad

WebRTC-based Voice Activity Detection library
MIT License
131 stars 21 forks

What algorithm is this library based on? #6

Open ohtangza opened 7 years ago

ohtangza commented 7 years ago

Any additional information would be much appreciated.

A paper reference would also be welcome!

Overdrivr commented 5 years ago

I cannot find this information either. Some scientific papers compare their performance to webrtcvad, but I could not find any paper describing it. The only reference is the WebRTC website...

AlexMaciaFiteni commented 3 years ago

According to several comments in WebRTC's source code, it is a statistical algorithm based on a Gaussian Mixture Model (GMM).

See, for example, line 113 of this file, which comes from the source linked in this repository's description: https://chromium.googlesource.com/external/webrtc/+/branch-heads/43/webrtc/common_audio/vad/vad_core.c#113

The best reference I've found for my purposes is in Spanish (because I'm Spanish), but I'll leave it here in case someone needs it: https://repositorio.usm.cl/bitstream/handle/11673/23680/3560900257269UTFSM.pdf?sequence=1&isAllowed=y
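
To make the GMM idea mentioned above a bit more concrete, here is a minimal sketch of how a GMM-based speech/noise decision can work in general. It is not the code from vad_core.c and not this library's API: the single scalar feature, the mixture parameters, the threshold, and all function names are hypothetical, chosen only for illustration. The idea is to model the feature under a "noise" mixture and a "speech" mixture and compare the two likelihoods.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def gmm_likelihood(x, components):
    """Likelihood of x under a mixture given as (weight, mean, std) tuples."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in components)

# Hypothetical, hand-picked mixtures over one feature (think "log energy
# of a frame"). A real detector would use tuned or adapted parameters.
NOISE_GMM = [(0.5, 2.0, 1.0), (0.5, 4.0, 1.5)]
SPEECH_GMM = [(0.5, 8.0, 1.5), (0.5, 11.0, 2.0)]

def log_likelihood_ratio(feature):
    """log p(feature | speech) - log p(feature | noise)."""
    floor = 1e-300  # avoid log(0) for features far from both mixtures
    p_speech = max(gmm_likelihood(feature, SPEECH_GMM), floor)
    p_noise = max(gmm_likelihood(feature, NOISE_GMM), floor)
    return math.log(p_speech) - math.log(p_noise)

def is_speech(feature, threshold=0.0):
    """Classify a frame as speech when the ratio exceeds the threshold."""
    return log_likelihood_ratio(feature) > threshold
```

With these made-up parameters, a high-energy frame such as `is_speech(10.0)` is classified as speech and a low-energy frame such as `is_speech(2.5)` as noise. Raising the threshold makes the detector more conservative about reporting speech.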