snakers4 / silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector
MIT License

❓ Issue with silero-vad in Song Detection #565

Closed: Joy-word closed this issue 2 weeks ago

Joy-word commented 2 weeks ago

❓ Questions and Help

I found that when using silero-vad for voice activity detection in vocal songs, it misses most of the high-pitched parts. I'm wondering if this is related to the project's training data? Is there a way to avoid this during the inference stage? Looking forward to your response.

snakers4 commented 2 weeks ago

Hi,

It is a known problem with songs / very high voices / children's voices / cartoon voices.

As for music per se, we did not have music in the training data. As for children's audio recordings, they are much rarer than recordings of adults.

Joy-word commented 2 weeks ago

OK, thanks for your response.
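
On the inference-stage question raised above: the thread does not give a fix, but one knob that may be worth trying is the decision threshold applied to the per-frame speech probabilities. The sketch below is a simplified illustration of that idea, not silero-vad's implementation. The helper `speech_segments` and the probability values are hypothetical and made up for the example; they are not measurements from the model.

```python
def speech_segments(probs, threshold=0.5, min_frames=2):
    """Return (start, end) frame-index pairs where probs stay >= threshold.

    Hypothetical helper for illustration only; it is not part of silero-vad.
    `end` is exclusive. Runs shorter than `min_frames` are dropped.
    """
    segments = []
    start = None
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            # A new run of speech frames begins here.
            start = i
        elif p < threshold and start is not None:
            # The current run ends; keep it only if it is long enough.
            if i - start >= min_frames:
                segments.append((start, i))
            start = None
    # Close a run that extends to the end of the input.
    if start is not None and len(probs) - start >= min_frames:
        segments.append((start, len(probs)))
    return segments


# Made-up per-frame speech probabilities: the dip in the middle stands in
# for a high-pitched passage that the model scores with low confidence.
probs = [0.1, 0.8, 0.9, 0.35, 0.3, 0.4, 0.85, 0.9, 0.1]

default_segments = speech_segments(probs, threshold=0.5)
lenient_segments = speech_segments(probs, threshold=0.25)

print(default_segments)  # [(1, 3), (6, 8)]
print(lenient_segments)  # [(1, 8)]
```

With the default threshold the low-confidence passage splits the vocal into two segments and is dropped; with the lower threshold it is kept as one segment. In silero-vad itself, `get_speech_timestamps` accepts a `threshold` argument, so a similar experiment is possible there. A lower threshold will likely also admit more false positives, and since the maintainer notes that music was not in the training data, it may not recover the missed parts at all.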