rapidsai / cuml

cuML - RAPIDS Machine Learning Library
https://docs.rapids.ai/api/cuml/stable/
Apache License 2.0
4.12k stars 527 forks source link

[FEA] Python API for Spectral Clustering / Embedding #263

Open cjnolet opened 5 years ago

cjnolet commented 5 years ago

UMAP uses a sparse spectral clustering function for one of its initialization strategies. UMAP creates a sparse adjacency matrix from the weighted nearest neighbors, but a more general-use implementation of the spectral clustering algorithm would use the epsilon neighborhoods or a knn graph to create the sparse adjacency matrix for the spectral computations.

The spectral embedding currently in raft implements the lanczos solver to compute both the n smallest/largest eigenpairs and the cluster labels, so it can be used for both the spectral embedding and clustering. Raft also now contains code to compute the knn graph so we have the basic components for initial functionality.

nithinraok commented 3 years ago

Hi Corey, thanks for the update on this. I am looking for python api to move from sklearn's cpu clustering code implementation to GPU.

trivialfis commented 3 years ago

@cjnolet It seems all we need to do is to expose https://github.com/rapidsai/cuml/blob/branch-21.06/cpp/src/spectral/spectral.cu into Python?

cjnolet commented 3 years ago

@trivialfis, the spectral embedding / clustering solver in RAFT expects a connectivites graph as input, from which it will construct the graph Laplacian and then compute the eigenpairs. For example, UMAP uses the fuzzy union as the connectivities graph to initialize its embeddings. Exposing the current spectral embedding/clustering solver through Python will require constructing a connectivities graph, either through a knn graph (https://github.com/rapidsai/raft/blob/branch-21.06/cpp/include/raft/sparse/selection/knn_graph.cuh#L94) or constructing an RBF kernel (https://github.com/rapidsai/cuml/blob/branch-0.20/cpp/src_prims/matrix/kernelfactory.cuh#L28).

trivialfis commented 3 years ago

Thanks for the reply. I will try to support both user supplied knn graph and building knn graph inside spectral function.

nithinraok commented 3 years ago

Hi, do you any updates on this please?

trivialfis commented 3 years ago

I'm working on it (among other things). ;-) Should be able a provide a PR soon.

cjnolet commented 3 years ago

@trivialfis, can you assign this issue to yourself since you've been working on it?

trivialfis commented 3 years ago

Done.