facebookresearch / faiss

A library for efficient similarity search and clustering of dense vectors.
https://faiss.ai
MIT License

How to use float16 in GpuIndexIVFFlat() #3957

Closed · qyangcv closed this issue 1 month ago

qyangcv commented 1 month ago

Summary

How do I use GpuIndexIVFFlat to build an index with dtype float16 (just a flat IVF index, without quantization)?

Platform

OS: Ubuntu 20.04.6

Faiss version: 1.7.2

Installed from: anaconda

Faiss compilation options:

Running on: GPU

Interface: Python
Reproduction instructions

I found that https://github.com/facebookresearch/faiss/wiki/Faiss-on-the-GPU mentions that "float16 or float32 precision options affect the storage of data in the database (as with GpuIndexIVFFlat)". I take this to mean that I can use float16 in `faiss.GpuIndexIVFFlat()` as the datatype of all stored vectors. However, in https://faiss.ai/cpp_api/struct/structfaiss_1_1gpu_1_1GpuIndexIVFFlatConfig.html there is no `useFloat16` option, so I'm wondering whether I can use float16 in `GpuIndexIVFFlat` at all. Below is my code:

```python
import faiss

res = faiss.StandardGpuResources()
config = faiss.GpuIndexIVFFlatConfig()
config.device = 0

nlist = 5000
nprobe = 50
index = faiss.GpuIndexIVFFlat(res, embed_size, nlist, faiss.METRIC_L2, config)
index.nprobe = nprobe

index.train(db)
index.add(db)
```

`db` is a float32 tensor with shape (2806840, 2112), and `embed_size = 2112`. The total size of `db` is `2.8M * 2112 * 4B ≈ 23 GB`, which just exceeds my single 4090's 24 GB once I also load a model onto the GPU, so I want to build the Faiss index in float16 instead of float32.

I tried adding `config.flatConfig.useFloat16 = True`, but it did not work. I don't want to use quantization, because I lose too much precision when I test on CPU. I also tried `config.memorySpace = faiss.MemorySpace_Unified` (I'm not sure this is the right way to use it); its documentation says "On Pascal and above (CC 6+) architectures, allows GPUs to use more memory than is available on the GPU", but that did not work either. Could you please tell me what to do?
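For completeness, the only float16 storage path I can find in the docs is the scalar quantizer, which is exactly what I'd like to avoid if plain `GpuIndexIVFFlat` can do it. Below is a minimal sketch of what I mean (my assumption, untested here): `faiss.GpuIndexIVFScalarQuantizer` with `faiss.ScalarQuantizer.QT_fp16` should store each vector component as float16, cutting list storage to roughly `2.8M * 2112 * 2B ≈ 12 GB`:

```python
import faiss

# Sketch of the scalar-quantizer route (my assumption, not confirmed):
# QT_fp16 keeps every component as a float16 code instead of float32.
res = faiss.StandardGpuResources()
config = faiss.GpuIndexIVFScalarQuantizerConfig()
config.device = 0

embed_size = 2112
nlist = 5000
index = faiss.GpuIndexIVFScalarQuantizer(
    res,
    embed_size,
    nlist,
    faiss.ScalarQuantizer.QT_fp16,  # float16 codes instead of float32
    faiss.METRIC_L2,
    True,                           # encodeResidual (the default)
    config,
)
index.nprobe = 50

# db as above: a (2806840, 2112) float32 array (numpy, not a torch tensor)
index.train(db)
index.add(db)
```

Is this the intended way to get float16 on GPU, or can `GpuIndexIVFFlat` itself store float16?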