rapidsai / cuvs

cuVS - a library for vector search and clustering on the GPU
https://rapids.ai
Apache License 2.0

[FEA] Support int64_t nearest neighbor index type for all ANN algorithms. #156

Open tfeher opened 5 months ago

tfeher commented 5 months ago

Currently, most of our nearest neighbor methods support the "int64_t" type for the nearest neighbor indices. The only exception is CAGRA, which returns uint32_t indices. For consistency, it would be great to support int64_t for CAGRA as well. This can be done with a small overhead by adding a simple mapping after the search.

Similarly, it would be easy to add uint32_t support for all the other nearest neighbor algorithms, if there is a need for that.

tfeher commented 1 month ago

To clarify: this issue aims to add another overload of search that accepts the neighbors argument as

raft::device_matrix_view<int64_t, int64_t, raft::row_major> neighbors,

The simplest way to achieve this is to add a simple mapping, as we do in our benchmarks: https://github.com/rapidsai/cuvs/blob/7d144cf1285f7113cfbeed68dda5362efb8f2657/cpp/bench/ann/src/cuvs/cuvs_cagra_wrapper.h#L320-L324
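To illustrate the idea of the mapping, here is a minimal host-side sketch. It is not the benchmark code linked above and not a cuVS API: `widen_neighbors` is a hypothetical helper, and `std::vector` stands in for the device buffers. A real implementation would presumably run the same element-wise cast on the GPU.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of cuVS): copy uint32_t neighbor indices, as
// produced by the CAGRA search, into an int64_t output buffer of the same size.
// Every uint32_t value is representable in int64_t, so the cast is lossless.
inline void widen_neighbors(const std::vector<std::uint32_t>& src,
                            std::vector<std::int64_t>& dst)
{
  assert(src.size() == dst.size());  // both hold n_queries * k entries
  std::transform(src.begin(), src.end(), dst.begin(),
                 [](std::uint32_t idx) { return static_cast<std::int64_t>(idx); });
}
```

The cost is one extra pass over n_queries * k indices plus a temporary uint32_t buffer, which is the "small overhead" mentioned in the original description.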

At the same time, there is a strong motivation to keep the IdxT of the kNN graph as uint32_t in order to minimize the memory footprint of the kNN graph. The graph size is n_vectors * graph_degree * sizeof(IdxT) bytes, so switching IdxT to int64_t would double it.
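As a rough worked example of the graph size formula, the sketch below evaluates it for both index types. The helper name `knn_graph_bytes` and the dataset size (100 million vectors, graph degree 64) are illustrative assumptions, not values from this issue.

```cpp
#include <cstdint>

// Hypothetical helper: size in bytes of a kNN graph stored as a dense
// n_vectors x graph_degree matrix of IdxT, per the formula above.
template <typename IdxT>
constexpr std::uint64_t knn_graph_bytes(std::uint64_t n_vectors,
                                        std::uint64_t graph_degree)
{
  return n_vectors * graph_degree * sizeof(IdxT);
}

// Illustrative numbers only: 100M vectors, graph degree 64.
static_assert(knn_graph_bytes<std::uint32_t>(100'000'000, 64) == 25'600'000'000ULL,
              "uint32_t graph: 25.6 GB");
static_assert(knn_graph_bytes<std::int64_t>(100'000'000, 64) == 51'200'000'000ULL,
              "int64_t graph: 51.2 GB");
```

For this hypothetical dataset, keeping uint32_t in the graph saves 25.6 GB, while the int64_t output buffer for the search results only grows with n_queries * k.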