hcmlab / vadnet

Real-time Voice Activity Detection in Noisy Environments using Deep Neural Networks
http://openssi.net
GNU Lesser General Public License v3.0
419 stars, 77 forks

vad_extract filter out speech #4

Closed: neil3212080 closed this issue 5 years ago

neil3212080 commented 5 years ago

Hey sir, when I use vad_extract on a single Chinese word, it cannot discriminate between speech and noise.

neil3212080 commented 5 years ago

archive_name.zip

frankenjoe commented 5 years ago

We tested VadNet with different languages and actually got pretty good performance with Chinese. Yet, the model we ship was trained mainly on English/German. Probably, you will have to (re)train with Chinese samples to improve the results. Since the training code has been released (see train folder) we leave the job to you :)

neil3212080 commented 5 years ago

If I want to train on Chinese, do I just change the URL in train/download.py? Or is there anything else that needs to be modified?

frankenjoe commented 5 years ago

Yes, just make sure your media files come with subtitles. The following sites are supported: https://rg3.github.io/youtube-dl/supportedsites.html
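For readers following along, here is a minimal sketch of fetching audio together with subtitles through the youtube_dl Python package. It is not the repository's train/download.py; the helper names `build_download_options` and `download`, the output directory, and the `zh` subtitle language code are assumptions for illustration, and the actual language code offered by a given video may differ (for example `zh-CN` or `zh-Hans`).

```python
def build_download_options(output_dir, subtitle_langs=("zh",)):
    """Build a youtube_dl options dict that requests audio plus subtitles.

    Hypothetical helper: the defaults here are assumptions, not values
    taken from the vadnet training scripts.
    """
    return {
        # Prefer an audio-only stream, fall back to the best combined one.
        "format": "bestaudio/best",
        # Ask for uploader-provided subtitles in the given languages.
        "writesubtitles": True,
        # Do not fall back to auto-generated captions.
        "writeautomaticsub": False,
        "subtitleslangs": list(subtitle_langs),
        # Name files after the video id inside output_dir.
        "outtmpl": output_dir + "/%(id)s.%(ext)s",
    }


def download(urls, output_dir, subtitle_langs=("zh",)):
    """Download every URL in `urls` with the options built above."""
    # Imported lazily so the options helper can be used without the package.
    import youtube_dl

    options = build_download_options(output_dir, subtitle_langs)
    with youtube_dl.YoutubeDL(options) as ydl:
        ydl.download(list(urls))
```

Calling `download(["<video url>"], "data")` would place the audio and any matching subtitle files in `data/`. Videos that have no subtitles in the requested language simply produce no subtitle file, so it is worth checking the output before training.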

neil3212080 commented 5 years ago

Hello frankenjoe. When I continued testing with your model, I found that for a single Chinese word the false positive rate is 28%. My guess is that the training set contains few isolated words, so the model is not sensitive to them. I want to try repeating the word twice and then run the test again. Do you think this idea would work? Or do you have any better suggestions for improving the model's accuracy on single words?
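To make a figure like the one above reproducible, here is a minimal sketch of how a frame-level false positive rate could be computed. It assumes one binary label per frame (1 for speech, 0 for noise) for both the reference annotation and the model output; the function name `false_positive_rate` is hypothetical, and this may not match how the 28% in this thread was measured.

```python
def false_positive_rate(reference, predicted):
    """Return FP / (FP + TN) over frame-level binary labels.

    reference: ground-truth labels, 1 = speech, 0 = noise
    predicted: model decisions in the same format
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    # Noise frames that the model marked as speech.
    false_positives = sum(1 for r, p in zip(reference, predicted) if not r and p)
    # Noise frames that the model correctly marked as noise.
    true_negatives = sum(1 for r, p in zip(reference, predicted) if not r and not p)

    negatives = false_positives + true_negatives
    if negatives == 0:
        raise ValueError("reference contains no noise frames")
    return false_positives / negatives


reference = [0, 0, 0, 0, 1, 1]
predicted = [1, 0, 0, 0, 1, 0]
print(false_positive_rate(reference, predicted))  # 1 of 4 noise frames flagged: 0.25
```

Reporting the rate on the same set of clips before and after a change, such as repeating the word, makes the two results directly comparable.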