[FEA] CAGRA-Q support pq_len=16 and pq_len=8

rapidsai / cuvs

cuVS - a library for vector search and clustering on the GPU

https://rapids.ai

Apache License 2.0

[FEA] CAGRA-Q support pq_len=16 and pq_len=8 #287

Open tfeher opened 3 months ago

tfeher commented 3 months ago

Currently CAGRA supports PQ compression with pq_len=2 and pq_len=4. A larger compression ratio can be achieved if we allow larger pq_len values, e.g. 8 and 16.
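
To make the compression gain concrete, here is a rough back-of-the-envelope sketch. It assumes float32 input, one code of `pq_bits` bits per sub-vector of `pq_len` dimensions, and ignores codebook storage and any other per-vector overhead. The helper names `pq_code_bytes` and `compression_ratio` are hypothetical and not part of cuVS.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a cuVS function): bytes taken by the PQ codes of
// one vector. Assumes dim is divisible by pq_len and that every sub-vector
// of pq_len dimensions is encoded with pq_bits bits.
inline std::size_t pq_code_bytes(std::size_t dim, std::size_t pq_len, std::size_t pq_bits = 8)
{
  assert(pq_len > 0 && dim % pq_len == 0);
  const std::size_t n_subspaces = dim / pq_len;
  return (n_subspaces * pq_bits + 7) / 8;  // round up to whole bytes
}

// Ratio of raw float32 storage to PQ code storage (codebooks ignored).
inline double compression_ratio(std::size_t dim, std::size_t pq_len, std::size_t pq_bits = 8)
{
  const double raw_bytes = static_cast<double>(dim * sizeof(float));
  return raw_bytes / static_cast<double>(pq_code_bytes(dim, pq_len, pq_bits));
}
```

Under these assumptions, a 128-dimensional float32 vector (512 bytes) shrinks to 32 bytes with pq_len=4 and to 8 bytes with pq_len=16, i.e. the ratio grows linearly with pq_len.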

pq_len is a template parameter of the distance computation kernels. With the current design, we would need to instantiate new kernels for the larger pq_len values, which would add roughly 160 MB to the binary size.

A preferred solution would be to make pq_len a runtime parameter.
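
The trade-off between the two approaches can be illustrated with a simplified host-side sketch. This is not the cuVS kernel code: the function names are hypothetical, it runs on the CPU, and it assumes a single codebook shared by all subspaces with one 8-bit code per sub-vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Compile-time variant: PQ_LEN is baked into the code, so every supported
// value needs its own instantiation (and its own copy in the binary).
template <int PQ_LEN>
float l2_to_pq_code_static(const float* query, const std::uint8_t* codes,
                           const float* codebook, std::size_t n_subspaces)
{
  float dist = 0.0f;
  for (std::size_t s = 0; s < n_subspaces; ++s) {
    // Assumption: one codebook shared by all subspaces, row-major [n_codes, PQ_LEN].
    const float* centroid = codebook + static_cast<std::size_t>(codes[s]) * PQ_LEN;
    for (int k = 0; k < PQ_LEN; ++k) {  // trip count known at compile time
      const float diff = query[s * PQ_LEN + k] - centroid[k];
      dist += diff * diff;
    }
  }
  return dist;
}

// Runtime variant: one compiled function serves every pq_len, at the cost of
// an inner loop whose trip count the compiler cannot see.
float l2_to_pq_code_runtime(const float* query, const std::uint8_t* codes,
                            const float* codebook, std::size_t n_subspaces, int pq_len)
{
  assert(pq_len > 0);
  float dist = 0.0f;
  for (std::size_t s = 0; s < n_subspaces; ++s) {
    const float* centroid = codebook + static_cast<std::size_t>(codes[s]) * pq_len;
    for (int k = 0; k < pq_len; ++k) {
      const float diff = query[s * pq_len + k] - centroid[k];
      dist += diff * diff;
    }
  }
  return dist;
}
```

Both variants compute the same distance. The difference is that supporting pq_len=8 and pq_len=16 with the first one means adding `l2_to_pq_code_static<8>` and `l2_to_pq_code_static<16>` instantiations, while the second needs no new code.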

achirkin commented 2 months ago

I'm investigating options for factoring the distance computation out of the kernel, so that an abstract "distance core" is passed to the search kernels by pointer. This should reduce the overall binary size and limit the cost of adding more distance core instances. I am not yet sure what the performance impact will be, though.
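
As a rough illustration of the idea, here is a minimal host-side sketch of a search routine that receives its distance computation through a pointer. It is an assumption about the general shape only, not the actual cuVS design: `distance_core`, `dense_dataset`, `dense_l2` and `nearest` are hypothetical names, and a real GPU version would have to deal with device function pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical "distance core": a function pointer plus opaque state.
struct distance_core {
  float (*compute)(const void* state, const float* query, std::size_t index);
  const void* state;
};

// One possible core: squared L2 distance over an uncompressed row-major dataset.
struct dense_dataset {
  const float* data;
  std::size_t dim;
};

float dense_l2(const void* state, const float* query, std::size_t index)
{
  const auto* ds = static_cast<const dense_dataset*>(state);
  float dist = 0.0f;
  for (std::size_t k = 0; k < ds->dim; ++k) {
    const float diff = query[k] - ds->data[index * ds->dim + k];
    dist += diff * diff;
  }
  return dist;
}

// The search routine is compiled once; it never sees the concrete distance
// type, only the pointer. A brute-force scan stands in for the graph search.
std::size_t nearest(const distance_core* core, const float* query, std::size_t n_rows)
{
  assert(core != nullptr && n_rows > 0);
  std::size_t best = 0;
  float best_dist = std::numeric_limits<float>::max();
  for (std::size_t i = 0; i < n_rows; ++i) {
    const float d = core->compute(core->state, query, i);  // indirect call
    if (d < best_dist) {
      best_dist = d;
      best = i;
    }
  }
  return best;
}
```

With this shape, adding a PQ core for a new pq_len means adding one small `compute` function rather than a new instantiation of every search kernel. The indirect call can no longer be inlined into the search loop, which is one plausible source of the performance cost mentioned above.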