thelou1s opened this issue 1 year ago (status: Open)
@thelou1s That's a good question. You would just need access to data recorded in a noisy environment. It's always possible to add or multiply noise into the clean data "on the fly" and train the model that way, but in practice synthetic noise rarely matches real recording conditions, so reality is usually a lot more complicated than this. Baidu did it well because they hired hundreds of people to sit in a room and generate data.
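To make the "on the fly" idea concrete, here is a minimal sketch of noise augmentation, not code from this project. It assumes waveforms are 1-D NumPy float arrays at the same sample rate. The names `mix_at_snr`, `random_gain` and `augment_stream` are hypothetical, and the SNR and gain ranges are arbitrary placeholders. Additive noise is mixed in at a random signal-to-noise ratio, and "multiplying" is interpreted here as a random gain, which is only one possible reading.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db, rng):
    """Add a random crop of `noise` to `clean` at the requested SNR in dB."""
    # Repeat the noise clip if it is shorter than the clean signal.
    if len(noise) < len(clean):
        reps = int(np.ceil(len(clean) / len(noise)))
        noise = np.tile(noise, reps)
    # Pick a random window of noise with the same length as the clean signal.
    start = rng.integers(0, len(noise) - len(clean) + 1)
    crop = noise[start:start + len(clean)]
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(crop ** 2) + 1e-12  # guard against silent noise
    # Choose scale so clean_power / (scale**2 * noise_power) = 10**(snr_db / 10).
    scale = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + scale * crop


def random_gain(signal, rng, low_db=-6.0, high_db=6.0):
    """Multiply the signal by a random gain drawn uniformly in dB."""
    gain_db = rng.uniform(low_db, high_db)
    return signal * 10 ** (gain_db / 20)


def augment_stream(utterances, noise, rng, snr_range=(0.0, 20.0)):
    """Yield a freshly corrupted copy of each utterance on every pass."""
    for clean in utterances:
        snr_db = rng.uniform(*snr_range)
        yield random_gain(mix_at_snr(clean, noise, snr_db, rng), rng)
```

Because the noise crop, SNR and gain are redrawn each time `augment_stream` is iterated, the model sees a different corrupted version of each utterance on every epoch without any extra data being stored on disk.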
[feature request] Could it support counting the speakers in a noisy environment (e.g. a 10-minute meeting recording where many people may speak at the same time)? If not, could you give me some advice on how to implement it? Thank you :)