How can I use GpuIndexIVFFlat to build an index with dtype float16 (a plain IVF flat index, without quantization)?
Platform
OS: Ubuntu 20.04.6
Faiss version: 1.7.2
Installed from: anaconda
Faiss compilation options:
Running on:
[ ] CPU
[x] GPU
Interface:
[ ] C++
[x] Python
Reproduction instructions
I found that https://github.com/facebookresearch/faiss/wiki/Faiss-on-the-GPU mentions that "float16 or float32 precision options affect the storage of data in the database (as with GpuIndexIVFFlat)". I take this to mean that faiss.GpuIndexIVFFlat() can store all vectors as float16. However, https://faiss.ai/cpp_api/struct/structfaiss_1_1gpu_1_1GpuIndexIVFFlatConfig.html lists no useFloat16 option, so I'm wondering whether float16 can be used with GpuIndexIVFFlat at all. Below is my code:
```python
import faiss

res = faiss.StandardGpuResources()
config = faiss.GpuIndexIVFFlatConfig()
config.device = 0  # GPU id
nlist = 5000   # number of inverted lists
nprobe = 50    # lists to probe at search time (not applied in this snippet)
# db: float32 array of shape (2806840, 2112); embed_size = 2112
index = faiss.GpuIndexIVFFlat(res, embed_size, nlist, faiss.METRIC_L2, config)
index.train(db)
index.add(db)
```
db is a float32 tensor with shape (2806840, 2112), and embed_size = 2112. The total size of db is `2.8M * 2112 * 4B ≈ 23.7 GB`, which just exceeds the capacity of my single 24 GB RTX 4090 once a model is also loaded onto the GPU, so I want to build the Faiss index in float16 instead of float32. I tried adding `config.flatConfig.useFloat16=True`, but it did not work. I don't want to use quantization because I would lose a lot of precision when I test on CPU.
I also tried setting `config.memoryShare = faiss.Device` (I'm not sure this is the right way to use it). Its description says 'On Pascal and above (CC 6+) architectures, allows GPUs to use more memory than is available on the GPU.', but that did not work either.
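For reference, here is a quick back-of-the-envelope check of the raw storage for float32 versus float16. It is pure arithmetic with no Faiss calls, and it assumes the index stores only the raw vector components, ignoring overhead such as IVF list ids, the coarse quantizer, and scratch memory. The helper name `raw_bytes` is just for illustration.

```python
# Raw storage needed for the database vectors alone (index overhead ignored).
n_vectors = 2_806_840
embed_size = 2112


def raw_bytes(bytes_per_component: int) -> int:
    """Bytes needed to hold every vector component at the given width."""
    return n_vectors * embed_size * bytes_per_component


fp32_bytes = raw_bytes(4)  # float32: 4 bytes per component
fp16_bytes = raw_bytes(2)  # float16: 2 bytes per component

print(f"float32: {fp32_bytes / 1e9:.1f} GB ({fp32_bytes / 2**30:.1f} GiB)")
print(f"float16: {fp16_bytes / 1e9:.1f} GB ({fp16_bytes / 2**30:.1f} GiB)")
```

So float16 storage would roughly halve the footprint, which should leave room for the model on a 24 GB card if the rest of the index overhead is small.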
Could you please tell me what to do?