aishoot / LSTM_PIT_Speech_Separation

Two-talker Speech Separation with LSTM/BLSTM by Permutation Invariant Training method.

Could not find results for multi-speaker audio files #2

Open · akshayaCap opened 5 years ago

akshayaCap commented 5 years ago

Hi, I was going through your repository and could not find the LSTM and BLSTM results on the two-speaker .wav audio files you generated. Can you please add them?

Also, have you tried this algorithm on mixtures of multiple speakers with added noise? If yes, can you share the results?

aishoot commented 5 years ago

@akshayaCap Hello, I have uploaded the two-speaker .wav results in "6-separated_result_BLSTM" and "7-separated_result_LSTM". Regarding "multiple speakers with added noise": one speaker can be regarded as the target speaker while the other speakers can be viewed as noise. The algorithms are in the first two folders.

akshayaCap commented 5 years ago

@pchao6 Thanks for your reply. The input files are missing from these folders. Can you please add them?

aishoot commented 5 years ago

@akshayaCap I'm sorry, but I can't share the input files. The input dataset, WSJ0, requires a paid license; you can purchase it from the WSJ0 corpus website.

akshayaCap commented 5 years ago

@pchao6 Thank you for the clarification.

  1. Can you please share results on the VCTK corpus, since it is a freely available dataset?
  2. Also, is it possible for you to share a script for inference on an arbitrary noisy .wav file at a particular sampling rate?
aishoot commented 5 years ago

@akshayaCap Thanks for your interest in my work. Firstly, I haven't run a separation experiment on the VCTK dataset, but you can try it. Secondly, you can simply replace one of the two speakers' .wav files with noise data when creating the mixed dataset; the other experiment settings, including the code, stay the same. You can try it.
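
For concreteness, here is a minimal sketch of how such a speech-plus-noise mixture pair could be created, treating the noise as the "second speaker" so the existing two-speaker PIT pipeline applies unchanged. This is not code from the repository; the file names, the 8 kHz sampling rate, and the 0 dB SNR are illustrative assumptions.

```python
# A minimal sketch (not from this repository) of building a speech + noise
# training pair for PIT, treating the noise as the "second speaker".
# File names, the 8 kHz rate, and the 0 dB SNR are assumptions.
import numpy as np
import librosa
import soundfile as sf

SR = 8000  # sampling rate everything is resampled to on load

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio is `snr_db`, then mix."""
    # Loop the noise if it is shorter than the speech, then trim to length.
    if len(noise) < len(speech):
        reps = int(np.ceil(len(speech) / len(noise)))
        noise = np.tile(noise, reps)
    noise = noise[: len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2) + 1e-12  # avoid division by zero
    scale = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    scaled_noise = scale * noise
    return speech + scaled_noise, scaled_noise

# librosa.load resamples to SR and converts to mono float32 in [-1, 1].
speech, _ = librosa.load("speaker1.wav", sr=SR)  # target source
noise, _ = librosa.load("noise.wav", sr=SR)      # replaces the second speaker

mixture, scaled_noise = mix_at_snr(speech, noise, snr_db=0)

# Save the mixture and both "sources" so the two-speaker PIT pipeline
# can consume them exactly like an ordinary two-speaker pair.
sf.write("mix.wav", mixture, SR)
sf.write("s1.wav", speech, SR)
sf.write("s2.wav", scaled_noise, SR)
```

Loading with `librosa.load(..., sr=SR)` also covers the sampling-rate question above: any input .wav is resampled to the model's expected rate before mixing or inference.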