dreamibor closed this issue 4 years ago
Hi, thank you for your question.
1) How the training data is created: Each training example is generated by randomly choosing one file from ./dataset/train/noise/ and one from ./dataset/train/speech/*. The two audio signals are then mixed with a randomly chosen SNR and reverberation time. In the script "train.py", the simulated speech is generated on the fly, without writing any files to the HDD (writing every simulated file would need more disk capacity as the amount of training data grows).
2) Separate speech and noise data: As you know, this approach needs a parallel corpus (noise and speech). Research often uses the CHiME corpus.
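To make point 1) more concrete, here is a minimal sketch of on-the-fly mixing. It is not the actual code in train.py. The function names (`mix_at_snr`, `synthetic_rir`, `make_noisy_example`), the SNR and RT60 ranges, and the toy exponentially decaying impulse response are all assumptions for illustration. Speech and noise are assumed to be already loaded as 1-D NumPy arrays at the same sample rate.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so that speech power / noise power equals snr_db, then add."""
    # Repeat or crop the noise so it has the same length as the speech.
    if len(noise) < len(speech):
        reps = int(np.ceil(len(speech) / len(noise)))
        noise = np.tile(noise, reps)
    noise = noise[:len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    # Choose gain so that speech_power / (gain^2 * noise_power) = 10^(snr_db / 10).
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    scaled_noise = gain * noise
    return speech + scaled_noise, scaled_noise


def synthetic_rir(rt60, sample_rate, rng):
    """Toy impulse response (an assumption): white noise whose envelope
    has decayed by 60 dB after rt60 seconds, normalized to unit energy."""
    n = int(rt60 * sample_rate)
    t = np.arange(n) / sample_rate
    envelope = 10 ** (-3 * t / rt60)  # amplitude is -60 dB at t = rt60
    rir = rng.standard_normal(n) * envelope
    rir[0] = 1.0  # direct path
    return rir / np.sqrt(np.sum(rir ** 2))


def make_noisy_example(speech, noise, rng, sample_rate=16000,
                       snr_range=(0.0, 20.0), rt60_range=(0.2, 0.8)):
    """Draw a random SNR and reverberation time and simulate one example in memory."""
    snr_db = rng.uniform(*snr_range)
    rt60 = rng.uniform(*rt60_range)
    rir = synthetic_rir(rt60, sample_rate, rng)
    # Convolve with the impulse response and keep the original length.
    reverberant = np.convolve(speech, rir)[:len(speech)]
    noisy, _ = mix_at_snr(reverberant, noise, snr_db)
    return noisy, snr_db, rt60
```

Calling `make_noisy_example` once per training step produces a new mixture each time, so nothing has to be stored on disk. Measured room impulse responses could replace the toy one if they are available.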
Regards,
Thank you for your response! I think your answer solved my problem and I will close the issue.
Hi, is there a way to create the training dataset? I mean, what approach do you take to get separate speech and noise data?