philipperemy / deep-speaker

Deep Speaker: an End-to-End Neural Speaker Embedding System.
MIT License
905 stars, 241 forks

[feature request] Could it support telling the speakers count in a noisy environment #101

Open thelou1s opened 1 year ago

thelou1s commented 1 year ago

[feature request] Could it support counting the speakers in a noisy environment (for example, a 10-minute meeting recording where several people may speak at the same time)? Or could you give me some advice on how to implement it? Thank you :)

philipperemy commented 1 year ago

@thelou1s That's a good question. You would need training data recorded in a noisy environment. It's always possible to add or multiply noise into the data "on the fly" and train the model that way, but the reality is usually a lot more complicated than this. Baidu did it well because they hired hundreds of people to sit in a room and generate data.
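
For illustration only, here is a minimal sketch of what adding or multiplying noise "on the fly" could look like. It is not part of deep-speaker: the function names (`add_noise_at_snr`, `random_gain`, `augment`), the SNR range, and the gain range are all assumptions, and it expects raw waveforms as 1-D NumPy float arrays.

```python
import numpy as np


def add_noise_at_snr(clean, noise, snr_db, rng):
    """Additive noise: mix a random crop of `noise` into `clean` at `snr_db` dB."""
    # Tile the noise if it is shorter than the clean signal, then crop randomly.
    if len(noise) < len(clean):
        reps = int(np.ceil(len(clean) / len(noise)))
        noise = np.tile(noise, reps)
    start = rng.integers(0, len(noise) - len(clean) + 1)
    noise = noise[start:start + len(clean)]

    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2) + 1e-12  # avoid division by zero
    # Scale the noise so that 10*log10(clean_power / scaled_noise_power) == snr_db.
    scale = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + scale * noise


def random_gain(signal, rng, low_db=-6.0, high_db=6.0):
    """Multiplicative perturbation: apply a random gain drawn in decibels."""
    gain_db = rng.uniform(low_db, high_db)
    return signal * 10 ** (gain_db / 20)


def augment(clean, noise, rng, snr_range_db=(5.0, 20.0)):
    """Draw a fresh SNR and gain on every call (hypothetical ranges)."""
    snr_db = rng.uniform(*snr_range_db)
    return random_gain(add_noise_at_snr(clean, noise, snr_db, rng), rng)
```

Calling something like `augment` inside the batch generator, before feature extraction, means each epoch sees a different noisy version of every utterance. As noted above, synthetic mixing of this kind only approximates real recordings with overlapping speakers.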