hcmlab / vadnet

Real-time Voice Activity Detection in Noisy Environments using Deep Neural Networks
http://openssi.net
GNU Lesser General Public License v3.0

how to train it with my data #10

Closed: dyyzhmm closed this issue 5 years ago

dyyzhmm commented 5 years ago

Thanks for the project. I found that the official training data comes from http://verteiler5.mediathekview.de/Filmliste-akt.xz, via the download_list function in train/code/playlist.py. I downloaded Filmliste-akt.xz, but I can't figure out what it is. Could you give me some details about your data? Also, how can I train the model on my own data (not from YouTube)? My guess is that I need to prepare two kinds of audio (voice and noise). Is that enough?

frankenjoe commented 5 years ago

You will have to implement a class that derives from SourceBase (see source\base.py). Its next() function is supposed to return a matrix of size number_of_frames x frame_size and a vector of size number_of_frames that assigns a label id to each frame, e.g. 0 = noise and 1 = speech (the label names should be returned by get_targets). Have a look at audio_vad_files.py and you'll quickly understand the mechanism. Finally, replace the --source parameter in do_train.cmd with your new class, e.g. --source MyAudioSource.
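
For readers who land here later, the following is a minimal sketch of the shape such a class might take. It is not the project's code. The `SourceBase` defined here is only a stand-in so the example runs on its own; the real class in source\base.py may require additional methods or different signatures. The constructor arguments, returning a `(frames, labels)` tuple, and returning `None` when the data is exhausted are all assumptions made for illustration. Plain Python lists are used in place of whatever array type the training code actually expects.

```python
from typing import List, Optional, Sequence, Tuple


class SourceBase:
    """Stand-in for the project's SourceBase (hypothetical minimal interface)."""

    def get_targets(self) -> List[str]:
        raise NotImplementedError

    def next(self):
        raise NotImplementedError


class MyAudioSource(SourceBase):
    """Serves pre-labelled clips as (frames, labels) pairs, one clip per call."""

    def __init__(self, clips: Sequence[Tuple[Sequence[float], int]],
                 frame_size: int = 4) -> None:
        # clips: (samples, label_id) pairs; label_id indexes get_targets().
        # Both parameters are assumptions, not part of the real API.
        self._clips = list(clips)
        self._frame_size = frame_size
        self._index = 0

    def get_targets(self) -> List[str]:
        # Position in this list is the label id: 0 = noise, 1 = speech.
        return ["noise", "speech"]

    def next(self) -> Optional[Tuple[List[List[float]], List[int]]]:
        # Assumption: None signals that no clips are left.
        if self._index >= len(self._clips):
            return None
        samples, label_id = self._clips[self._index]
        self._index += 1

        # Cut the clip into non-overlapping frames; drop the incomplete tail.
        fs = self._frame_size
        n_frames = len(samples) // fs
        frames = [list(samples[i * fs:(i + 1) * fs]) for i in range(n_frames)]

        # One label id per frame (the whole clip shares a single label here).
        labels = [label_id] * n_frames
        return frames, labels
```

A quick check of the shapes: a source built from a 10-sample noise clip and an 8-sample speech clip with `frame_size=4` yields two frames of four samples for each clip, labelled 0 and 1 respectively. In a real setup the samples would come from your own audio files, and audio_vad_files.py remains the reference for how the project does this.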

dyyzhmm commented 5 years ago

Yes, thanks. I got it working.

saumyaborwankar commented 3 years ago

Hi @dyyzhmm, can you share how you trained it on your data?

dyyzhmm commented 3 years ago

@saumyaborwankar Sorry, I can't remember the details.