aishoot / LSTM_PIT_Speech_Separation

Two-talker Speech Separation with LSTM/BLSTM by Permutation Invariant Training method.
306 stars, 90 forks

dataset #16

Closed Navids71 closed 4 years ago

Navids71 commented 4 years ago

Hello my friend, I have two important questions. I was finally able to run your amazing code. As far as I know, running it requires four kinds of lists:

1) dataset lists (mix_2_spk_tr, etc.)
2) gender lists
3) wav lists
4) tfrecords lists

To run the code at small scale, I generated these lists by hand and with the man_wav_list.py script. Here are my two big problems:

1) How can I produce the lists above, especially the dataset lists, with a script? Do you have a script for that?

2) For example, mix_2_spk_tr contains many lines, each mixing different wav files at different SNRs to generate the training mixtures. My question is:

Does the mixing code automatically scale the wav files to the target SNRs, or do we have to do that ourselves before making the list? For example, this is the first line of mix_2_spk_tr:

/home/disk1/snsun/Workspace/tensorflow/kaldi/data/wsj0/tr/40na010x.wav 1.9857 /home/disk1/snsun/Workspace/tensorflow/kaldi/data/wsj0/tr/01xo031a.wav -1.9857

Does the create_wav_2speakers.m script automatically scale the wavs to these levels (1.9857 and -1.9857) and then mix them, or do we have to produce wavs at those levels first and then run the script to build the dataset?
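
To make both questions concrete, here is a rough Python sketch. It is only an assumption about how such a list could be generated and used, not a description of what create_wav_2speakers.m actually does. The line format (`wav1 level1 wav2 level2`) is inferred from the single example line above. The helper names, the 0 to 2.5 dB range, and the reading of the two numbers as per-utterance gains in dB applied at mixing time are all hypothetical. It also assumes paths contain no spaces and that the two signals are already at comparable levels.

```python
import random


def make_mix_list(wavs_by_speaker, n_mixtures, max_db=2.5, seed=0):
    """Build list lines of the form 'wav1 db wav2 -db'.

    wavs_by_speaker: dict mapping a speaker id to a list of wav paths.
    max_db is an assumed upper bound for the level offset, not a value
    taken from the repository.
    """
    rng = random.Random(seed)
    speakers = sorted(wavs_by_speaker)
    lines = []
    for _ in range(n_mixtures):
        # Pick two different speakers, then one utterance from each.
        spk1, spk2 = rng.sample(speakers, 2)
        wav1 = rng.choice(wavs_by_speaker[spk1])
        wav2 = rng.choice(wavs_by_speaker[spk2])
        db = rng.uniform(0.0, max_db)
        lines.append(f"{wav1} {db:.4f} {wav2} {-db:.4f}")
    return lines


def parse_mix_line(line):
    """Split one list line into two (path, level_db) pairs."""
    path1, db1, path2, db2 = line.split()
    return (path1, float(db1)), (path2, float(db2))


def db_to_scale(level_db):
    """Convert a level in dB to a linear amplitude factor."""
    return 10.0 ** (level_db / 20.0)


def mix(samples1, samples2, db1, db2):
    """Scale both signals by their dB levels and add them.

    The longer signal is truncated to the length of the shorter one.
    """
    n = min(len(samples1), len(samples2))
    g1, g2 = db_to_scale(db1), db_to_scale(db2)
    return [g1 * a + g2 * b for a, b in zip(samples1[:n], samples2[:n])]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    wavs = {"spk_a": ["a_001.wav", "a_002.wav"], "spk_b": ["b_001.wav"]}
    for line in make_mix_list(wavs, n_mixtures=2):
        (p1, d1), (p2, d2) = parse_mix_line(line)
        print(p1, d1, p2, d2)
```

Under this reading the list would only need the original wav paths, and the scaling would happen inside the mixing step, so no pre-scaled wavs would be required. Whether the real script behaves this way is exactly what I am asking.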