lixilinx / IVA4Cocktail

Neural network density models for speech separation.

Question about the number of microphones #2

Closed qiansichong closed 3 years ago

qiansichong commented 3 years ago

I have a question. When the number of microphones is set to 4 or 5, the separation performance of the trained model is normal, but when it is set to 2, the separation performance is very poor.

lixilinx commented 3 years ago

Thanks for the question. I have made similar observations, and I think this is reasonable. When you learn the density prior only on mixtures of two sources, the mixtures themselves do not have enough diversity to learn good enough speech models, i.e., the problem is not challenging enough. Ideally, we want the number of mics to be large enough that the initial mixtures sound like babble noise. This forces the model to learn something real.

Consider the extreme case where the number of mics is 1. Then it is clear that no meaningful information can be learned.
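
One way to get a rough feel for the argument above is a small NumPy sketch. It is not code from this repository, and it rests on simplifying assumptions: Laplacian noise stands in for speech, the mixing is instantaneous (a random square matrix) rather than convolutive, and the helper name `mixture_excess_kurtosis` is hypothetical. It measures how far the mixtures are from Gaussian as the number of mics (taken equal to the number of sources) grows.

```python
import numpy as np

def mixture_excess_kurtosis(num_mics, num_samples=200_000, seed=0):
    """Mean excess kurtosis of num_mics random mixtures of num_mics sources.

    Assumptions (for illustration only): sources are i.i.d. Laplacian, a crude
    stand-in for speech, and mixing is an instantaneous random matrix.
    """
    rng = np.random.default_rng(seed)
    # One row per source; a Laplacian has excess kurtosis 3 (super-Gaussian).
    sources = rng.laplace(size=(num_mics, num_samples))
    # Random square mixing matrix: one row per microphone.
    mixing = rng.standard_normal((num_mics, num_mics))
    mixtures = mixing @ sources
    # Standardize each mic signal, then estimate its excess kurtosis.
    centered = mixtures - mixtures.mean(axis=1, keepdims=True)
    z = centered / mixtures.std(axis=1, keepdims=True)
    return float(np.mean(np.mean(z**4, axis=1) - 3.0))

if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        print(n, round(mixture_excess_kurtosis(n), 2))
```

Under these assumptions, the value for 1 mic is simply that of the source itself, 2 mics stay clearly super-Gaussian, and larger counts drift toward 0, i.e. toward the noise-like mixtures described above. Real speech and convolutive mixing will behave differently in detail, so this only illustrates the trend.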