jtkim-kaist / VAD

Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.

ACAM always detects poorly at the start of a corpus #16

Open lyapple2008 opened 5 years ago

lyapple2008 commented 5 years ago

As the title says, I found that the beginning of the corpus is always detected as non-speech. Can you explain why? [image]

jtkim-kaist commented 5 years ago

Hi, is there any silence at the start of your sample? If not, the result may be poor. Because ACAM is a context-based model, it needs some preceding samples to capture the speech context. Please send your sample to jtkim@kaist.ac.kr and I'll debug it for you.
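
If a recording starts directly with speech, one possible workaround (not confirmed in this thread) is to prepend a short stretch of silence before running the detector and then discard the frame labels that belong to the padding. The sketch below is only an illustration: the helper names `prepend_silence` and `drop_padded_frames`, the 0.5 s pad length, and the `hop_length` parameter are all assumptions, not part of the toolkit, and the toolkit's real frame shift should be used in their place. Digital zeros may also behave differently from real background noise.

```python
def prepend_silence(samples, sample_rate, pad_seconds=0.5):
    """Return (padded_samples, pad_len) with pad_len zero samples in front.

    pad_seconds is a guess; tune it to the context window of the model.
    """
    pad_len = int(round(sample_rate * pad_seconds))
    return [0] * pad_len + list(samples), pad_len


def drop_padded_frames(frame_labels, pad_len, hop_length):
    """Remove the frame labels that correspond to the prepended silence.

    hop_length is the frame shift in samples (hypothetical parameter;
    use whatever frame shift the VAD front end actually applies).
    """
    n_pad_frames = pad_len // hop_length
    return list(frame_labels[n_pad_frames:])


# Toy example: 4 audio samples at an 8 Hz "sample rate".
audio = [0.1, -0.2, 0.3, 0.05]
padded, pad_len = prepend_silence(audio, sample_rate=8, pad_seconds=0.5)

# Pretend the detector returned one label per 2-sample hop on the padded signal.
labels_on_padded = [0, 0, 1, 1]
labels = drop_padded_frames(labels_on_padded, pad_len, hop_length=2)
```

After trimming, `labels` lines up with the original, unpadded audio, so timestamps derived from it need no further offset.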

lyapple2008 commented 5 years ago

Thank you for your reply. I have sent the test audio to your email.