rapidsai / cuvs

cuVS - a library for vector search and clustering on the GPU

[BUG] PQ implementation collapses all subspaces #221

Open jbellis opened 1 month ago

jbellis commented 1 month ago

It's possible that I'm missing something due to unfamiliarity with the codebase, but it looks to me like vpq_dataset and train_pq collapse all the subspaces into a single codebook. E.g. with pq_n_centers=256, pq_len=8, and dim=128, a classic PQ implementation would train each of the 128/8 = 16 eight-dim subspaces separately, giving a codebook of 256\*8\*16 entries. Instead, cuVS trains just 256\*8, which probably works okay for highly symmetrical datasets but will reduce accuracy unnecessarily on others.
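
For concreteness, a quick NumPy sketch of the two codebook layouts being contrasted (illustrative only, not cuVS code; parameter names mirror the ones above):

```python
import numpy as np

# Parameters from the example above.
dim, pq_len, pq_n_centers = 128, 8, 256
n_subspaces = dim // pq_len  # 16

# Classic PQ: one codebook per subspace, each trained independently
# on that subspace's slice of the data.
classic_codebooks = np.zeros((n_subspaces, pq_n_centers, pq_len), dtype=np.float32)

# The "collapsed" layout described here: a single codebook shared by
# all subspaces.
shared_codebook = np.zeros((pq_n_centers, pq_len), dtype=np.float32)

print(classic_codebooks.size, shared_codebook.size)  # 32768 vs 2048 entries
```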

achirkin commented 1 month ago

You're right: vpq_dataset and train_pq are part of the CAGRA implementation. In contrast to IVF-PQ, the compression in CAGRA uses a single codebook for all subspaces by design. As far as I understand, this greatly reduces the GPU shared-memory/cache requirements and thereby improves performance. The product quantization in vpq_dataset is applied to the residuals of the vector quantization (the k-means cluster centers); I assume the drop in recall is partially mitigated by increasing the number of VQ centers (parameter vq_n_centers).
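
To make the two-stage scheme concrete, here is a minimal NumPy/scikit-learn sketch of coarse VQ followed by a single shared PQ codebook trained on the pooled subspace slices of the residuals, as described above. The random data, vq_n_centers=16, and the training setup are illustrative assumptions, not cuVS defaults or cuVS code:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n, dim, pq_len = 2_000, 128, 8
vq_n_centers, pq_n_centers = 16, 256  # illustrative values only
data = rng.standard_normal((n, dim)).astype(np.float32)

# Stage 1: coarse vector quantization (k-means over the full vectors).
vq = KMeans(n_clusters=vq_n_centers, n_init=1).fit(data)
residuals = data - vq.cluster_centers_[vq.labels_]

# Stage 2: one shared PQ codebook, trained on all subspace slices
# pooled together -- this is the "collapsing" discussed above.
slices = residuals.reshape(-1, pq_len)  # (n * dim/pq_len, pq_len)
pq = KMeans(n_clusters=pq_n_centers, n_init=1).fit(slices)

# Encoding: each vector stores its VQ label plus one PQ code per subspace.
codes = pq.predict(slices).reshape(n, dim // pq_len)
vq_labels = vq.labels_
```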

jbellis commented 1 month ago

Thanks! I had not noticed that cagra_q_dataset_descriptor_t::set_smem_ptr assumes that the entire codebook needs to fit into shared memory. Given that constraint, the "collapsing" here makes sense.
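
As a back-of-the-envelope check on that constraint (assuming fp32 codebook entries; the element type cuVS actually stores may differ):

```python
pq_n_centers, pq_len, n_subspaces = 256, 8, 16
bytes_per_entry = 4  # assuming fp32; a half-precision codebook would halve these numbers

single = pq_n_centers * pq_len * bytes_per_entry   # one shared codebook
classic = n_subspaces * single                     # one codebook per subspace
print(f"shared codebook:        {single / 1024:.0f} KiB")   # 8 KiB
print(f"per-subspace codebooks: {classic / 1024:.0f} KiB")  # 128 KiB
```

8 KiB fits comfortably in shared memory alongside other per-block state, while 128 KiB exceeds the default 48 KiB per-block limit and approaches or exceeds the hardware maximum on many GPUs, so a per-subspace layout couldn't be cached the same way.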