rapidsai / cuml

cuML - RAPIDS Machine Learning Library
https://docs.rapids.ai/api/cuml/stable/
Apache License 2.0

[TASK][FEA] Setting `build_algo=auto` as default for UMAP #5985

Open jinsolp opened 3 months ago

jinsolp commented 3 months ago

Description

The default `build_algo` for building the knn graph in UMAP is set to `auto`, which chooses the algorithm based on the size of the given dataset (<= 50K uses brute force knn, > 50K uses nn descent). The other options for `build_algo` are `brute_force_knn` and `nn_descent`, each of which uses the named knn building algorithm without looking at the data size. Related PR
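As a rough illustration of the size-based choice described above, here is a minimal sketch in plain Python. The helper name `choose_build_algo`, the constant `SIZE_THRESHOLD`, and the use of a row count as the "data size" are assumptions made for this example; this is not cuML's actual implementation.

```python
# Hypothetical sketch of the size-based selection; not cuML's real code.
SIZE_THRESHOLD = 50_000  # from the description: <= 50K -> brute force, > 50K -> nn descent

VALID_ALGOS = ("auto", "brute_force_knn", "nn_descent")


def choose_build_algo(n_samples: int, build_algo: str = "auto") -> str:
    """Return the knn graph build algorithm that would be used."""
    if build_algo not in VALID_ALGOS:
        raise ValueError(f"build_algo must be one of {VALID_ALGOS}, got {build_algo!r}")
    # An explicit choice is honored without looking at the data size.
    if build_algo != "auto":
        return build_algo
    # "auto": decide at runtime from the dataset size.
    if n_samples <= SIZE_THRESHOLD:
        return "brute_force_knn"
    return "nn_descent"
```

Under these assumptions, `choose_build_algo(50_000)` selects brute force knn, `choose_build_algo(50_001)` selects nn descent, and an explicit `build_algo="nn_descent"` is returned unchanged even for a small dataset.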

However, to stay consistent for users who expect the same results as the previous release, we will use `brute_force_knn` to build the knn graph regardless of the data size in the following case: `build_algo` is `auto` and `random_state` is explicitly given.

TODO

Go through a deprecation cycle so that `build_algo=auto` always lets the build algo be decided at runtime based on the data size, regardless of whether `random_state` is explicitly given or not.
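One way such a deprecation cycle could look is sketched below in plain Python. It assumes, based on the TODO above, that the special case is `build_algo=auto` combined with an explicitly given `random_state`. The helper name `resolve_build_algo`, the warning text, and the choice of `FutureWarning` are all hypothetical and do not reflect what cuML actually does.

```python
import warnings

# Hypothetical sketch of a deprecation-cycle step; not cuML's real code.
SIZE_THRESHOLD = 50_000  # from the description: <= 50K -> brute force, > 50K -> nn descent


def resolve_build_algo(n_samples, build_algo="auto", random_state=None):
    """Pick the knn build algorithm, warning where the future default differs."""
    if build_algo in ("brute_force_knn", "nn_descent"):
        return build_algo  # explicit choices ignore the data size
    if build_algo != "auto":
        raise ValueError(f"unknown build_algo: {build_algo!r}")

    # What a purely size-based "auto" would choose.
    size_based = "brute_force_knn" if n_samples <= SIZE_THRESHOLD else "nn_descent"

    # Assumed current behavior: an explicit random_state forces brute force
    # so results match the previous release. Warn only when that differs
    # from the size-based choice, since only then would behavior change.
    if random_state is not None and size_based != "brute_force_knn":
        warnings.warn(
            "build_algo='auto' with an explicit random_state currently uses "
            "brute_force_knn; a future release will choose by data size "
            "(nn_descent here). Set build_algo explicitly to keep this behavior.",
            FutureWarning,
            stacklevel=2,
        )
        return "brute_force_knn"
    return size_based
```

In this sketch the warning fires only for large inputs with an explicit `random_state`, which is the only case where the result would change once the cycle completes. Users who pass `build_algo="brute_force_knn"` directly never see it.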