rapidsai / cuvs

cuVS - a library for vector search and clustering on the GPU
https://rapids.ai
Apache License 2.0

[FEA] Product quantization API #107

Open cjnolet opened 6 months ago

cjnolet commented 6 months ago

We need a separate product quantization API that is decoupled from IVF but can still be composed into IVF.

Ideally this API would follow FAISS or scikit-learn's transformer estimators.
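For illustration only, here is a minimal NumPy sketch of what a scikit-learn-style transformer interface for product quantization could look like. The class name `ProductQuantizer`, its parameters, and the helper `_kmeans` are hypothetical and are not part of cuVS, FAISS, or scikit-learn; the sketch runs on the CPU and only shows the `fit` / `transform` / `inverse_transform` shape of such an API.

```python
import numpy as np


def _kmeans(x, k, n_iter, rng):
    """Plain Lloyd's k-means; returns a (k, d) array of centroids."""
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared distance from every point to every centroid: shape (n, k).
        d2 = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members):  # an empty cluster keeps its previous centroid
                centroids[j] = members.mean(axis=0)
    return centroids


class ProductQuantizer:
    """Hypothetical fit/transform-style PQ estimator (illustrative only)."""

    def __init__(self, n_subspaces=4, n_centroids=16, n_iter=10, seed=0):
        if n_centroids > 256:
            raise ValueError("codes are stored as uint8, so n_centroids <= 256")
        self.n_subspaces = n_subspaces
        self.n_centroids = n_centroids
        self.n_iter = n_iter
        self.seed = seed

    def _split(self, X, m):
        # Columns belonging to the m-th subspace.
        return X[:, m * self.sub_dim_:(m + 1) * self.sub_dim_]

    def fit(self, X):
        X = np.asarray(X, dtype=np.float32)
        if X.shape[1] % self.n_subspaces:
            raise ValueError("dimension must be divisible by n_subspaces")
        self.sub_dim_ = X.shape[1] // self.n_subspaces
        rng = np.random.default_rng(self.seed)
        # One codebook per subspace: shape (n_subspaces, n_centroids, sub_dim).
        self.codebooks_ = np.stack([
            _kmeans(self._split(X, m), self.n_centroids, self.n_iter, rng)
            for m in range(self.n_subspaces)
        ])
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=np.float32)
        codes = np.empty((len(X), self.n_subspaces), dtype=np.uint8)
        for m in range(self.n_subspaces):
            sub = self._split(X, m)
            d2 = ((sub[:, None, :] - self.codebooks_[m][None]) ** 2).sum(axis=-1)
            codes[:, m] = d2.argmin(axis=1)  # nearest centroid id per subvector
        return codes

    def inverse_transform(self, codes):
        # Approximate reconstruction: concatenate the selected centroids.
        return np.concatenate(
            [self.codebooks_[m][codes[:, m]] for m in range(self.n_subspaces)],
            axis=1,
        )
```

Usage would follow the usual estimator pattern, e.g. `codes = ProductQuantizer(n_subspaces=4).fit(X).transform(X)`, with `inverse_transform(codes)` giving the lossy reconstruction.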

QDXG-CXK commented 1 week ago

Hi! I'm really interested in helping to implement this feature.

From what I understand, the request involves:

  1. Extracting the product quantization code from ivf-pq and exposing it as a public API.
  2. Redesigning ivf-pq to use the PQ API, eliminating duplicated code.
  3. [Possibly in the future] Creating a base class for ivf that can be flexibly combined with SQ, PQ, BinaryQ and other quantization methods (related to issues #106 and #139).

Is there a concrete plan for this yet? I noticed that some discussion has taken place in #211, but I didn't find a clear conclusion there. As mentioned in this comment and this one, the PQ API should at least cover training, encoding and searching. It seems like cuVS prefers a stateless style rather than holding the trained quantizer in an index like FAISS does.
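To make the stateless idea concrete, here is a rough CPU-only NumPy sketch of the three operations as free functions that pass the trained codebooks around explicitly. The names `pq_train`, `pq_encode` and `pq_search` and their signatures are my own placeholders, not existing cuVS functions, and the search step assumes asymmetric distance computation (exact queries against encoded vectors) via per-query lookup tables.

```python
import numpy as np


def pq_train(dataset, n_subspaces, n_centroids, n_iter=10, seed=0):
    """Return codebooks of shape (n_subspaces, n_centroids, sub_dim)."""
    dataset = np.asarray(dataset, dtype=np.float32)
    n, dim = dataset.shape
    if dim % n_subspaces:
        raise ValueError("dimension must be divisible by n_subspaces")
    if n_centroids > 256:
        raise ValueError("codes are stored as uint8, so n_centroids <= 256")
    rng = np.random.default_rng(seed)
    sub = dataset.reshape(n, n_subspaces, dim // n_subspaces)
    codebooks = []
    for m in range(n_subspaces):
        x = sub[:, m, :]
        # Lloyd's k-means on the m-th subspace, seeded from random data points.
        centroids = x[rng.choice(n, size=n_centroids, replace=False)].copy()
        for _ in range(n_iter):
            labels = ((x[:, None] - centroids[None]) ** 2).sum(-1).argmin(1)
            for j in range(n_centroids):
                if np.any(labels == j):
                    centroids[j] = x[labels == j].mean(axis=0)
        codebooks.append(centroids)
    return np.stack(codebooks)


def pq_encode(codebooks, vectors):
    """Return uint8 codes of shape (n, n_subspaces)."""
    vectors = np.asarray(vectors, dtype=np.float32)
    n_subspaces, _, sub_dim = codebooks.shape
    sub = vectors.reshape(len(vectors), n_subspaces, sub_dim)
    # Squared distance of each subvector to each centroid: (n, M, K).
    d2 = ((sub[:, :, None, :] - codebooks[None]) ** 2).sum(-1)
    return d2.argmin(axis=2).astype(np.uint8)


def pq_search(codebooks, codes, queries, k):
    """Return (distances, indices) of the k nearest encoded vectors per query."""
    queries = np.asarray(queries, dtype=np.float32)
    n_subspaces, _, sub_dim = codebooks.shape
    q = queries.reshape(len(queries), n_subspaces, sub_dim)
    # table[i, m, c]: squared distance from query i's m-th subvector to centroid c.
    table = ((q[:, :, None, :] - codebooks[None]) ** 2).sum(-1)
    # Gather one table entry per (vector, subspace) and sum over subspaces: (nq, n).
    dists = table[:, np.arange(n_subspaces), codes].sum(axis=-1)
    idx = np.argsort(dists, axis=1)[:, :k]
    return np.take_along_axis(dists, idx, axis=1), idx
```

In this style nothing is held in an index object: the caller owns `codebooks` and `codes` and can hand them to IVF or any other structure, which is one way the PQ step could be composed rather than embedded.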

Looking forward to collaborating on this!

QDXG-CXK commented 2 days ago

Hi Maintainers! I'm still very interested in contributing to this feature and would appreciate any feedback.