Closed ThanhPham1987 closed 4 years ago
Hi Peter, the real data we used to train the network is not very high quality compared to the standard in the community. We couldn't use existing datasets because the data and network are specific to a certain microphone array configuration.
We had planned to go to a sound studio at the University, record dozens of different real speakers in a soundproof chamber, and then gather lots of background noise in the wild. Because of Covid, I could only record voices and background noise played over a loudspeaker in my basement. The ground truth signals are not perfectly clean and the positions are not completely accurate. From a scientific perspective, it makes more sense to run experiments on a synthetically rendered dataset (see generate_dataset.py). I can look into uploading the real data in the near future, but I strongly recommend that people fine-tune on data recorded from real speakers in a more controlled environment using their specific microphone array. This is something we might look into once our lab has reopened.
Hi @vivjay30, thanks for your answer. Could you tell me which devices you used to record the real speakers? Best regards, PeterPham
Hi @vivjay30, thanks for sharing, nice work! Would you be willing to share your training dataset? Best regards, PeterPham