How to train model for more than 2 speakers?

Audio-WestlakeU / NBSS

The official repo of NBC & SpatialNet for multichannel speech separation, denoising, and dereverberation

MIT License


How to train model for more than 2 speakers? #19

Open Sonish-Maharjan-2014 opened 6 months ago

Sonish-Maharjan-2014 commented 6 months ago

I trained the model (from the NBSS branch) for 2-speaker separation using the wsj0 dataset. It worked perfectly. But now I want to train the model for more than 2 speakers. What steps should I follow?

quancs commented 5 months ago

Hello, thank you for your interest in our work. To train the model on a dataset where each utterance has more than 2 speakers, you can change the number of output channels to 2N for each TF-bin (N speakers, each with 2 numbers for the real and imaginary parts of the STFT coefficients).
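As a rough illustration of the 2N-channel output described above, the sketch below splits a real-valued output of shape [F, T, 2N] into N complex STFT spectrograms using NumPy. The function name `split_speakers`, the example dimensions, and the channel ordering (real and imaginary parts interleaved per speaker) are assumptions made for illustration. The actual layout in NBSS may differ, so check the model and loss code before relying on it.

```python
import numpy as np


def split_speakers(net_out: np.ndarray, num_speakers: int) -> np.ndarray:
    """Turn a real [F, T, 2N] output into a complex [N, F, T] array.

    Assumption: channels are ordered (re_1, im_1, re_2, im_2, ...).
    """
    freq_bins, frames, channels = net_out.shape
    if channels != 2 * num_speakers:
        raise ValueError(
            f"expected {2 * num_speakers} channels, got {channels}"
        )
    # Group the last axis into (speaker, real/imag) pairs.
    pairs = net_out.reshape(freq_bins, frames, num_speakers, 2)
    spec = pairs[..., 0] + 1j * pairs[..., 1]  # complex, [F, T, N]
    return np.moveaxis(spec, -1, 0)  # [N, F, T]


# Example with 4 speakers and made-up dimensions (129 bins, 50 frames).
out = np.zeros((129, 50, 8))
out[0, 0] = [1, 2, 3, 4, 5, 6, 7, 8]
specs = split_speakers(out, num_speakers=4)
print(specs.shape)     # (4, 129, 50)
print(specs[:, 0, 0])  # one complex coefficient per speaker
```

For four speakers this means 8 output channels per TF-bin. The training targets and any permutation-invariant loss would need to be extended to N speakers in the same way.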

Sonish-Maharjan-2014 commented 5 months ago

Thank you for your response. I tried adapting the code for four speakers. I generated room impulse responses (RIRs) for the four speakers and made some adjustments to the code. Unfortunately, I ran into an error towards the end of the process. Could you help me fix the problem?

quancs commented 5 months ago

You can debug your code to check the shapes of echoics and echoic_i, and the value of needed_lens.
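One possible way to do that check is a small helper that summarizes each variable just before the failing line. The helper `describe` below is hypothetical and not part of NBSS, and the example values are made-up placeholders, since the thread does not show the real types or shapes of echoics, echoic_i, or needed_lens.

```python
import numpy as np


def describe(name, obj):
    """Return a one-line summary: shape for arrays, items for lists, else value."""
    if hasattr(obj, "shape"):
        return f"{name}: type={type(obj).__name__}, shape={tuple(obj.shape)}"
    if isinstance(obj, (list, tuple)):
        # Show the shape of each array-like item, or the item itself.
        items = [tuple(x.shape) if hasattr(x, "shape") else x for x in obj]
        return f"{name}: type={type(obj).__name__}, len={len(obj)}, items={items}"
    return f"{name}: type={type(obj).__name__}, value={obj!r}"


# Placeholder data only; in practice, pass the real variables from the
# data-loading code at the point where the error is raised.
echoics = [np.zeros((8, 16000)), np.zeros((8, 16000))]
echoic_i = np.zeros((8, 15000))
needed_lens = 16000

for name, obj in [("echoics", echoics), ("echoic_i", echoic_i),
                  ("needed_lens", needed_lens)]:
    print(describe(name, obj))
```

A mismatch between the per-speaker lengths and the expected length is one thing such a printout would reveal, though the actual cause of the error is not stated in the thread.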

Sonish-Maharjan-2014 commented 5 months ago

Thanks, I will try to debug.