Closed sava-1729 closed 1 month ago
@sava-1729 the dataset is too small and the timings too short (a few hundred milliseconds) to show any significant speedup. cuML HDBSCAN, like cuML algorithms in general, starts to show speedups as the dataset grows to real-world sizes. You can try 40,000, 400,000, or 4,000,000 samples and let us know if you still do not see a speedup. For now, I will close the issue.
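A rough way to check this is to time both implementations at the larger sizes suggested above. The sketch below is not the reporter's original script: the helper names `time_fit_predict` and `compare`, the uniform random 3-D points, and `min_cluster_size=50` are all assumptions chosen for illustration. It runs one untimed GPU warm-up call so that one-time setup cost (for example CUDA context creation) is not counted against cuML.

```python
import time


def time_fit_predict(make_model, X, repeats=3, warmup=1):
    """Return the best wall-clock time in seconds of make_model().fit_predict(X).

    Hypothetical helper: `make_model` is any zero-argument callable that
    returns an object with a fit_predict(X) method.
    """
    for _ in range(warmup):
        # Untimed warm-up run, so one-time setup is excluded from the timing.
        make_model().fit_predict(X)
    best = float("inf")
    for _ in range(repeats):
        model = make_model()
        start = time.perf_counter()
        model.fit_predict(X)
        best = min(best, time.perf_counter() - start)
    return best


def compare(sizes=(40_000, 400_000), n_features=3, seed=0):
    """Time CPU hdbscan against cuML HDBSCAN on random points (assumed data)."""
    # Imported lazily so the timing helper above works without a GPU stack.
    import numpy as np
    import hdbscan
    from cuml.cluster import HDBSCAN as cuHDBSCAN

    rng = np.random.default_rng(seed)
    results = {}
    for n in sizes:
        X = rng.random((n, n_features), dtype=np.float32)
        # A single CPU run without warm-up: the CPU fit is slow at these sizes.
        cpu = time_fit_predict(
            lambda: hdbscan.HDBSCAN(min_cluster_size=50), X, repeats=1, warmup=0
        )
        gpu = time_fit_predict(lambda: cuHDBSCAN(min_cluster_size=50), X)
        results[n] = (cpu, gpu)
    return results


# Example use (requires numpy, hdbscan, cuml and a CUDA-capable GPU):
# for n, (cpu, gpu) in compare().items():
#     print(f"{n:>9} samples: CPU {cpu:.2f} s, GPU {gpu:.2f} s")
```

Taking the best of several repeats reduces noise from other processes; at a few hundred milliseconds per run, that noise and the GPU start-up cost can easily hide any difference between the two libraries.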
Describe the bug
The hdbscan library's fit_predict runs faster than cuML's. How do I get GPU acceleration?
Steps/Code to reproduce bug
Try running this code (requirements: cupy, numba, hdbscan).
With the current default pointcloud size, I get the following output:
Expected behavior
I would expect cuML's GPU-accelerated clustering to be much faster than the CPU-based hdbscan implementation.
Environment details (please complete the following information):