[FEA] CAGRA-Q to quantize before graph build

rapidsai / cuvs

cuVS - a library for vector search and clustering on the GPU

Apache License 2.0


Indeed, when we use the IVF-PQ build method for the KNN graph, we run product quantization (PQ) twice: once for graph building, and once more to compress the data for CAGRA search. There are a few reasons for this:

Graph building uses stronger compression. It has a separate codebook for each subspace or cluster.
CAGRA-Q uses only a single codebook, so that the codebook fits in shared memory.
The memory layout of the compressed dataset is different.
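To put a rough number on the codebook-size difference, here is a back-of-the-envelope sketch. The parameter values (`dim`, `pq_len`, `pq_bits`) and the 4-byte element size are assumptions chosen for illustration, not cuVS defaults, and `codebook_bytes` is a hypothetical helper rather than a library function.

```python
def codebook_bytes(num_codebooks, pq_bits, pq_len, bytes_per_value=4):
    """Total size of PQ codebooks.

    Each codebook holds 2**pq_bits centers, each with pq_len values.
    """
    return num_codebooks * (2 ** pq_bits) * pq_len * bytes_per_value


# Hypothetical parameters, for illustration only.
dim = 128                # dataset dimensionality
pq_len = 2               # values per subspace
pq_bits = 8              # bits per code, i.e. 256 centers per codebook
pq_dim = dim // pq_len   # number of subspaces

# One codebook per subspace (the graph-building style described above).
per_subspace = codebook_bytes(pq_dim, pq_bits, pq_len)

# One codebook shared by all subspaces (the CAGRA-Q style described above).
single = codebook_bytes(1, pq_bits, pq_len)

print(f"per-subspace codebooks: {per_subspace} bytes")
print(f"single codebook:        {single} bytes")
```

With these assumed values the per-subspace variant is `pq_dim` times larger than the single codebook, which is why only the latter is a comfortable fit for a per-block shared memory budget.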

Because of this, I think it is justified to have separate product quantization steps for the build algorithm and for the CAGRA-Q encoding. Before the product quantization, we run vector quantization: we cluster the dataset and assign a cluster center to each vector. (Only the difference between the cluster center and the database vector is product quantized.) The vector quantization step could work with the same parameters in both cases, so we could reuse the clustering for both stages.
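The proposed reuse can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the cuVS implementation: `kmeans`, `pq_per_subspace`, and `pq_single_codebook` are hypothetical helpers, the data is random, and all sizes are arbitrary. The vector quantization runs once, and both PQ variants are then trained on the same residuals.

```python
import numpy as np


def kmeans(x, k, iters=10, seed=0):
    """Plain Lloyd's k-means; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    # Recompute labels so they match the final centers.
    dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, dist.argmin(axis=1)


def pq_per_subspace(residuals, pq_len, n_codes):
    """PQ with a separate codebook for each subspace."""
    n, dim = residuals.shape
    codebooks, codes = [], []
    for s in range(dim // pq_len):
        sub = residuals[:, s * pq_len:(s + 1) * pq_len]
        cb, lab = kmeans(sub, n_codes, seed=s)
        codebooks.append(cb)
        codes.append(lab)
    return np.stack(codebooks), np.stack(codes, axis=1)


def pq_single_codebook(residuals, pq_len, n_codes):
    """PQ with one codebook shared by all subspaces."""
    n, dim = residuals.shape
    # Pool the sub-vectors of every subspace into one training set.
    sub = residuals.reshape(n * (dim // pq_len), pq_len)
    cb, lab = kmeans(sub, n_codes)
    return cb, lab.reshape(n, dim // pq_len)


# Random stand-in for the dataset (assumption: 500 vectors, 8 dimensions).
rng = np.random.default_rng(42)
data = rng.standard_normal((500, 8)).astype(np.float32)

# Vector quantization: done once, shared by both stages.
vq_centers, vq_labels = kmeans(data, k=4)
residuals = data - vq_centers[vq_labels]

# Stage 1: PQ for the graph build (one codebook per subspace).
build_codebooks, build_codes = pq_per_subspace(residuals, pq_len=2, n_codes=16)

# Stage 2: PQ for the CAGRA-Q style encoding (single codebook).
search_codebook, search_codes = pq_single_codebook(residuals, pq_len=2, n_codes=16)

print(build_codebooks.shape, build_codes.shape)
print(search_codebook.shape, search_codes.shape)
```

Only the two PQ steps differ here; the clustering and the residual computation are shared, which is the part that could be reused between the two stages.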

[FEA] CAGRA-Q to quantize before graph build #233