BUTSpeechFIT / VBx

Variational Bayes HMM over x-vectors diarization

Limit number of output speakers #58

Closed AlexandderGorodetski closed 1 year ago

AlexandderGorodetski commented 1 year ago

Hello,

Is it possible to limit the number of output speakers to two?

Thanks, Alex.

fnlandini commented 1 year ago

Hi Alex,

VBx can find the number of speakers, but it is not possible to force it to converge to a specific number. You can use the hyperparameter Fb to bias the model towards finding fewer speakers (by using larger values), but the model will still be free to settle on the number of speakers it believes there are, so this approach can still lead to mistakes. You might be better off using agglomerative hierarchical clustering (AHC) and setting the number of clusters to 2. You would need to change the call at https://github.com/BUTSpeechFIT/VBx/blob/7dfa3fab81c36b144cde9647fc80d96c0261a772/VBx/vbhmm.py#L145-L146 to use maxclust, if I'm not mistaken.
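To illustrate the maxclust idea, here is a minimal sketch using SciPy's hierarchical clustering directly. It does not reproduce the VBx code at the linked lines: the function name ahc_fixed_speakers, the synthetic embeddings, and the choice of cosine distance with average linkage are assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def ahc_fixed_speakers(embeddings, num_speakers=2):
    """Cluster embeddings with AHC, allowing at most `num_speakers` clusters.

    Hypothetical helper for illustration only; not part of VBx.
    """
    # Condensed pairwise cosine-distance matrix between embeddings.
    dist = pdist(embeddings, metric="cosine")
    # Average-linkage agglomerative clustering (assumed here for the sketch).
    lin_mat = linkage(dist, method="average")
    # criterion="maxclust" cuts the dendrogram so that no more than
    # `num_speakers` flat clusters are formed, instead of using a
    # distance threshold.
    labels = fcluster(lin_mat, t=num_speakers, criterion="maxclust")
    # fcluster labels start at 1; shift to 0-based labels.
    return labels - 1


# Synthetic stand-in for x-vectors: three tight groups of 20 points each.
rng = np.random.default_rng(0)
centers = np.eye(3)
x = np.vstack([c + 0.05 * rng.standard_normal((20, 3)) for c in centers])

labels = ahc_fixed_speakers(x, num_speakers=2)
print(np.unique(labels))
```

Even though the synthetic data contains three groups, the cut is forced to return at most two clusters, so two of the groups end up merged. With a distance threshold instead of maxclust, the number of clusters would depend on the data.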

Regards, Federico

fnlandini commented 1 year ago

Closing due to inactivity. Feel free to reopen if you see fit.