Thank you for uploading the pre-trained ECAPA-TDNN model.
For speaker diarization, the spectral clustering algorithm in wespeaker uses the p-neighbor binarization scheme, where "p" must be chosen manually. How should "p" be chosen for different datasets (such as AMI, DIHARD, MagicData, Callhome, or AISHELL4)? Is 0.01 OK?
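For context, the p-neighbor pruning step I am referring to keeps only the largest p fraction of entries in each row of the affinity matrix before clustering. A minimal sketch, assuming a plain NumPy affinity matrix (function name and details are mine, not wespeaker's actual code):

```python
import numpy as np

def p_neighbor_binarize(affinity: np.ndarray, p: float) -> np.ndarray:
    """Keep only the top p fraction of neighbors per row, zero the rest.

    Illustrative sketch of p-neighbor pruning, not wespeaker's
    actual implementation.
    """
    n = affinity.shape[0]
    k = max(1, int(np.rint(p * n)))  # number of neighbors to keep per row
    pruned = np.zeros_like(affinity)
    for i in range(n):
        # indices of the k largest affinities in row i
        top = np.argpartition(affinity[i], -k)[-k:]
        pruned[i, top] = affinity[i, top]
    # symmetrize so the result is a valid undirected affinity matrix
    return 0.5 * (pruned + pruned.T)
```

With p = 0.01 and a few hundred speech segments, each segment keeps only its one or two strongest neighbors, which is why the best value depends so heavily on recording length and dataset.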
In "Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap", the authors proposed NME-SC, an algorithm that frees us from choosing "p". Could wespeaker implement this algorithm?
I don't think there is a fixed "p" that performs well on all datasets, as you mention, which is exactly why the NME-SC algorithm was proposed and works. In my experience, "p" in [0.01, 0.05] gives a reasonable result. You can also refer to the setup in our diarization recipe.
This algorithm essentially enumerates candidate "p" values and picks the best one, which is computationally costly. You can easily implement it on top of our diarization code by adding a for loop over "p". Maybe you can contribute the code when you finish it!
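A rough sketch of what such a loop might look like, following the NME criterion from the paper (pick the p minimizing the ratio r(p) = p / g_p, where g_p is the normalized maximum eigengap of the pruned graph's Laplacian). Helper names, the candidate grid, and implementation details here are illustrative, not wespeaker's actual code:

```python
import numpy as np

def nme_search(affinity: np.ndarray, p_grid):
    """NME-SC style search over p (sketch, after Park et al., 2020).

    For each candidate p: prune the affinity matrix row-wise, build the
    unnormalized Laplacian, compute the normalized maximum eigengap g_p,
    and keep the p minimizing r(p) = p / g_p.
    Returns (best_p, estimated number of speakers).
    """
    n = affinity.shape[0]
    best_p, best_r, best_k = None, np.inf, 1
    for p in p_grid:
        k_keep = max(1, int(np.rint(p * n)))
        pruned = np.zeros_like(affinity)
        for i in range(n):
            top = np.argpartition(affinity[i], -k_keep)[-k_keep:]
            pruned[i, top] = affinity[i, top]
        pruned = 0.5 * (pruned + pruned.T)
        # unnormalized graph Laplacian of the pruned affinity
        lap = np.diag(pruned.sum(axis=1)) - pruned
        eigvals = np.sort(np.linalg.eigvalsh(lap))
        # eigengaps among the smallest eigenvalues
        gaps = np.diff(eigvals[: n // 2 + 1])
        g_p = gaps.max() / (eigvals[-1] + 1e-10)  # normalized max eigengap
        r = p / (g_p + 1e-10)                     # NME ratio to minimize
        if r < best_r:
            best_r, best_p = r, p
            best_k = int(np.argmax(gaps)) + 1     # estimated cluster count
    return best_p, best_k
```

Because the eigendecomposition is repeated for every candidate p, the cost grows linearly with the grid size, which is the overhead mentioned above; a coarse grid over [0.01, 0.1] is a common compromise.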