wq2012 / SpectralCluster

Python re-implementation of the (constrained) spectral clustering algorithms used in Google's speaker diarization papers.
https://google.github.io/speaker-id/publications/LstmDiarization/
Apache License 2.0

Spectral clustering is too slow & expensive when sequence is long #43

Closed wq2012 closed 2 years ago

wq2012 commented 2 years ago

Spectral clustering relies on eigen-decomposition, which costs roughly O(N^2.7) for N input embeddings. This is too expensive for long-form conversations.
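To get a feel for that growth rate, here is a quick back-of-the-envelope calculation. It simply takes the O(N^2.7) estimate above at face value; `relative_cost` is a hypothetical helper written for this illustration, not part of the library.

```python
def relative_cost(n_old, n_new, exponent=2.7):
    """Factor by which an O(N^exponent) step slows down when N grows."""
    return (n_new / n_old) ** exponent


# Going from 1,000 to 10,000 embeddings (10x longer audio, assuming a fixed
# segment rate) makes the eigen-decomposition roughly 500x more expensive.
slowdown = relative_cost(1_000, 10_000)
print(round(slowdown))
```

So a conversation that is 10x longer costs about 500x more in this step, which is why capping N matters.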

To accelerate spectral clustering, we can run a cheaper pre-clustering step first and use its results to cap the size of the input passed to spectral clustering.
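A minimal sketch of this idea, using only NumPy. This is not the library's implementation: the function names, the `max_spectral_size` parameter as used here, the choice of k-means as the pre-clusterer, and the cosine affinity are all assumptions made for illustration. The point is only that the eigen-decomposition runs on at most `max_spectral_size` centroids instead of all N embeddings.

```python
import numpy as np


def kmeans(x, k, iters=20):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [x[0]]
    for _ in range(1, k):
        # Distance from every point to its closest already-chosen center.
        dist = np.min([np.linalg.norm(x - c, axis=1) for c in centers], axis=0)
        centers.append(x[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dist, axis=1)
        for j in range(k):
            if np.any(labels == j):  # Keep the old center if a cluster is empty.
                centers[j] = x[labels == j].mean(axis=0)
    return labels, centers


def spectral_cluster(x, k):
    """Spectral clustering on a cosine affinity (assumes non-zero rows)."""
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    affinity = 0.5 * (1.0 + unit @ unit.T)  # Map cosine from [-1, 1] to [0, 1].
    d_inv_sqrt = 1.0 / np.sqrt(affinity.sum(axis=1))
    normalized = d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]
    laplacian = np.eye(len(x)) - normalized
    # The expensive step: eigen-decomposition of a len(x) x len(x) matrix.
    _, vecs = np.linalg.eigh(laplacian)
    embedding = vecs[:, :k]  # Eigenvectors of the k smallest eigenvalues.
    embedding = embedding / (
        np.linalg.norm(embedding, axis=1, keepdims=True) + 1e-12)
    labels, _ = kmeans(embedding, k)
    return labels


def fast_spectral_cluster(x, k, max_spectral_size):
    """Pre-cluster first so spectral clustering sees a bounded input size."""
    if len(x) <= max_spectral_size:
        return spectral_cluster(x, k)
    # Cheap pre-clustering: compress N embeddings into a few centroids.
    pre_labels, centroids = kmeans(x, max_spectral_size)
    # Spectral clustering now runs on max_spectral_size points only.
    centroid_labels = spectral_cluster(centroids, k)
    # Each embedding inherits the label of its pre-cluster centroid.
    return centroid_labels[pre_labels]


# Toy data: 3 well-separated "speakers", 200 embeddings each.
rng = np.random.default_rng(0)
speaker_centers = 5.0 * np.eye(3, 8)
x = np.concatenate(
    [c + 0.2 * rng.standard_normal((200, 8)) for c in speaker_centers])
labels = fast_spectral_cluster(x, k=3, max_spectral_size=30)
```

Here the eigen-decomposition is done on a 30 x 30 matrix rather than a 600 x 600 one. The trade-off is that any mistake made by the pre-clusterer (merging two speakers into one pre-cluster) cannot be undone by the spectral step, so `max_spectral_size` should stay well above the expected number of speakers.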