[FEA] Support reduced precision in IVF-PQ index building

rapidsai / raft

RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing high performance applications.

https://docs.rapids.ai/api/raft/stable/

Apache License 2.0

[FEA] Support reduced precision in IVF-PQ index building #1893

Open cjnolet opened 1 year ago

cjnolet commented 1 year ago

From @tfeher:

Currently, all input is upconverted to fp32 during IVF-PQ index building. Instead, keep the input precision: use a k-means implementation optimized for that precision, and consider downconverting while subsampling.

This addresses part of https://github.com/rapidsai/raft/issues/1675

tfeher commented 1 year ago

There are two separate tasks here, which can be solved independently:

1. Enable handling of fp16 input datasets for IVF-PQ.
2. Do not upconvert fp16 when subsampling the index (see the issue description above). This depends on #1892.
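
To illustrate the second task, here is a minimal NumPy sketch of subsampling an fp16 dataset with and without upconversion. This is not RAFT code and does not use the RAFT API; `subsample_rows` and its `keep_precision` flag are hypothetical names chosen only for this example, and the dataset shape is arbitrary.

```python
import numpy as np


def subsample_rows(dataset, n_train, keep_precision=True, seed=0):
    """Pick n_train random rows (hypothetical helper, not a RAFT function).

    With keep_precision=True the sample keeps the input dtype (e.g. fp16).
    With keep_precision=False it is upconverted to fp32, which mirrors the
    behavior the issue describes as the current state.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(dataset.shape[0], size=n_train, replace=False)
    idx.sort()  # sorted indices give a more regular read pattern
    sample = dataset[idx]
    return sample if keep_precision else sample.astype(np.float32)


# Arbitrary fp16 dataset: 10,000 vectors of dimension 64.
dataset = (
    np.random.default_rng(42)
    .random((10_000, 64), dtype=np.float32)
    .astype(np.float16)
)

kept = subsample_rows(dataset, 1_000, keep_precision=True)
upconverted = subsample_rows(dataset, 1_000, keep_precision=False)

print(kept.dtype, kept.nbytes)
print(upconverted.dtype, upconverted.nbytes)
```

Both calls select the same rows, so the only difference is the element type: the fp32 copy takes twice the memory of the fp16 one. Whether the k-means step that consumes the sample can work directly on fp16 is the part that depends on #1892.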