RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing high performance applications.
Currently everything is upconverted to fp32. Keep input precision. Use optimized k-means for the same precision / consider downconverting while subsampling.
From @tfeher:
Currently everything is upconverted to fp32. Keep input precision. Use optimized k-means for the same precision / consider downconverting while subsampling.
This addresses part of https://github.com/rapidsai/raft/issues/1675