rapidsai / cuml

cuML - RAPIDS Machine Learning Library
https://docs.rapids.ai/api/cuml/stable/
Apache License 2.0

[FEA] Rewrite HDBSCAN’s mutual_reachability_knn_l2 with RAFT primitives #4882

Open cjnolet opened 2 years ago

cjnolet commented 2 years ago

HDBSCAN contains some code that was adapted directly from the brute-force nearest-neighbors code in FAISS to compute a kNN directly in reachability space using an array of core distances.
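For reference, the standard HDBSCAN definition of the distance in reachability space is written out here. The symbols are illustrative and not taken from the cuML source: $d(a, b)$ is the L2 distance and $\mathrm{core}_k(\cdot)$ is the entry of the core distance array for a point. Whether the existing kernel works on squared or unsquared distances internally is not stated here.

$$
d_{\mathrm{mreach}}(a, b) = \max\{\mathrm{core}_k(a),\ \mathrm{core}_k(b),\ d(a, b)\}
$$

The kNN in reachability space is then the $k$ smallest values of $d_{\mathrm{mreach}}(a, \cdot)$ for each query point $a$.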

One of the major goals we have had for some time is to remove the FAISS dependency from RAPIDS (specifically cuml and raft) as much as possible to simplify our list of dependencies.

The goal here is to rewrite the logic in mutual_reachability_knn_l2 using RAFT primitives so that it no longer requires FAISS. Given how optimized the k-selection code is in FAISS, we may end up ultimately bringing that over to RAFT directly (by copying it out of FAISS but giving proper attribution), but the remaining bits should all be fairly straightforward to port to RAFT directly.
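As a rough illustration of how the logic could decompose into separate primitives, here is a minimal NumPy sketch. It is not the cuML or RAFT implementation. The function names `pairwise_l2`, `select_k`, and `mutual_reachability_knn` are hypothetical stand-ins chosen for this example, and the sketch materializes the full distance matrix, which a real GPU version would presumably tile or fuse.

```python
import numpy as np


def pairwise_l2(X):
    """Dense pairwise L2 distances (stand-in for a pairwise-distance primitive)."""
    sq = np.sum(X * X, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.maximum(d2, 0.0, out=d2)  # clamp small negatives from round-off
    return np.sqrt(d2)


def select_k(D, k):
    """Row-wise k smallest values and their column indices, sorted ascending.

    Stand-in for the k-selection step; requires k <= D.shape[1].
    """
    idx = np.argpartition(D, k - 1, axis=1)[:, :k]
    part = np.take_along_axis(D, idx, axis=1)
    order = np.argsort(part, axis=1)
    return (np.take_along_axis(part, order, axis=1),
            np.take_along_axis(idx, order, axis=1))


def mutual_reachability_knn(X, core_dists, k):
    """kNN in reachability space: max(core[i], core[j], d(i, j)), then select k."""
    D = pairwise_l2(X)
    R = np.maximum(D, np.maximum(core_dists[:, None], core_dists[None, :]))
    return select_k(R, k)


if __name__ == "__main__":
    X = np.array([[0.0], [1.0], [3.0], [7.0]])
    core = np.array([1.0, 1.0, 2.0, 4.0])  # assumed precomputed core distances
    dists, inds = mutual_reachability_knn(X, core, k=2)
    print(dists)
    print(inds)
```

Only the k-selection step corresponds to the part that might be copied over from FAISS. The distance computation and the elementwise max against the core distances are the "remaining bits" in this sketch.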

Ideally this would be fixed by https://github.com/rapidsai/raft/issues/798.

github-actions[bot] commented 2 years ago

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.