Open andrewmackie opened 1 year ago
I've realised that when the kwargs are passed from the command line, all of their values will be received as strings, so some will need to be converted to the types the sklearn algorithms expect.
The most thorough method of doing this would probably be to:
Please let me know if you would like me to do this (I'm very happy for you to do it as well).
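For illustration, here is one minimal way the string conversion could be sketched. It is not necessarily the "most thorough" method referred to above. It assumes the kwargs arrive as `key=value` strings, which is an assumption about the command-line format; `coerce_kwarg` and `parse_kwargs` are hypothetical helper names, not part of the existing code.

```python
import ast


def coerce_kwarg(value):
    """Convert a command-line string to a Python literal where possible.

    Hypothetical helper: '10' -> 10, '0.5' -> 0.5, 'None' -> None, 'True' -> True.
    Anything that is not a valid Python literal is returned unchanged.
    """
    try:
        return ast.literal_eval(value)
    except (ValueError, SyntaxError):
        # Plain words such as 'lobpcg' are not literals, so keep them as strings.
        return value


def parse_kwargs(pairs):
    """Turn ['key=value', ...] into a dict with coerced values.

    Assumes a key=value format on the command line (an assumption, not
    something the existing CLI is known to use).
    """
    kwargs = {}
    for pair in pairs:
        key, sep, raw = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        kwargs[key] = coerce_kwarg(raw)
    return kwargs


parsed = parse_kwargs(["eigen_solver=lobpcg", "n_neighbors=10", "gamma=0.5"])
```

Note that `ast.literal_eval` only recognises Python spellings, so a lowercase `true` would stay a string; a stricter approach would look up the expected type of each kwarg.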
I have exposed the kwargs for all of the sklearn-based clustering algorithms so that they can be called from cluster_SC(), cluster_AHC(), Diarizer.diarize() and the command line.
Every kwarg accepted by the sklearn algorithms should now be available. I noticed that you set default values for some kwargs and have retained those.
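As a rough sketch of the pattern (not the actual implementation), retained defaults can be combined with caller-supplied kwargs so that the caller's values win. The function name `merge_kwargs` and the default values shown are hypothetical and chosen only for illustration.

```python
def merge_kwargs(defaults, overrides):
    """Return a new dict where caller-supplied kwargs override the defaults."""
    merged = dict(defaults)   # copy so the shared defaults are never mutated
    merged.update(overrides)  # anything the caller passed takes precedence
    return merged


# Hypothetical defaults; the real ones live in the clustering functions.
EXAMPLE_DEFAULTS = {"affinity": "precomputed", "n_init": 10}

# The caller sets eigen_solver and overrides one default.
params = merge_kwargs(EXAMPLE_DEFAULTS, {"eigen_solver": "lobpcg", "n_init": 20})
```

The merged dict can then be unpacked into the sklearn constructor with `**params`.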
I haven't done comprehensive testing. I won't be offended if you want to change the way it is implemented.
FYI, the reason I did this was that the 'arpack' eigen solver in sklearn.cluster.SpectralClustering falls over when attempting to cluster a large number (>2k) of embeddings. Using the 'lobpcg' eigen solver appears to address this problem, but the eigen_solver kwarg could not previously be set from Diarizer.diarize() - now it can.
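For reference, selecting the solver directly in sklearn looks roughly like this. The data below is small and synthetic, standing in for real embeddings, so it does not reproduce the 'arpack' failure; it only shows where the `eigen_solver` kwarg goes.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

# Synthetic stand-in for speaker embeddings: two blobs of 30 points in 8 dimensions.
rng = np.random.default_rng(0)
embeddings = np.vstack([
    rng.normal(loc=0.0, scale=0.1, size=(30, 8)),
    rng.normal(loc=1.0, scale=0.1, size=(30, 8)),
])

# eigen_solver accepts 'arpack', 'lobpcg' or 'amg'; 'lobpcg' is chosen here.
clusterer = SpectralClustering(
    n_clusters=2,
    eigen_solver="lobpcg",
    random_state=0,  # fixed seed so repeated runs are comparable
)
labels = clusterer.fit_predict(embeddings)
```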