Summary
When using a simple IndexIVFFlat on GPU, increasing nprobe results in memory errors. I do not understand this blow-up of memory: from my understanding, the index should just brute-force search more clusters as nprobe increases. Moreover, I am using an NVIDIA A100 GPU with 40 GB of memory, and the data itself is only about 8-9 GB (so memory does not seem to be the main constraint in my case, but perhaps I am wrong?).
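For scale, a quick back-of-the-envelope check (the ~25 GiB figure is the failed temporary allocation reported in the traceback below; the variable names are just for illustration):
# Raw database size: 500,000 vectors x 4096 dims x 4 bytes (float32)
db_bytes = 500_000 * 4096 * 4    # 8,192,000,000 bytes, i.e. roughly 8.2 GB
# Failed temporary allocation from the error message below
tmp_bytes = 26_843_545_600       # exactly 25 GiB, on top of the data already on the GPU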
Platform
OS: Ubuntu 20.04.1 LTS
Faiss version: faiss-gpu 1.7.2
Installed from: conda, pytorch channel
Running on:
[ ] CPU
[x] GPU
Interface:
[ ] C++
[x] Python
Reproduction instructions
Consider the following script:
import numpy as np
import faiss

# 500k database vectors and 40k query vectors, dimension 4096 (float32)
features_db = np.random.rand(500000, 4096).astype('float32')
features_query = np.random.rand(40000, 4096).astype('float32')

d = features_db.shape[1]
# Rule of thumb nlist ~ 10 * sqrt(N), i.e. about 7071 lists here
nlist = int(10 * np.sqrt(features_db.shape[0]))
factory_string = f'IVF{nlist},Flat'
index = faiss.index_factory(d, factory_string, faiss.METRIC_L2)
index = faiss.index_cpu_to_all_gpus(index)
index.train(features_db)
index.add(features_db)

index.nprobe = 1
D, I = index.search(features_query, 1)
Running this script works fine. However, when replacing
index.nprobe = 1
by a larger value, e.g.
index.nprobe = 200
it produces the following error:
Traceback (most recent call last):
File "repro_bug.py", line 51, in
D, I = index.search(features_query, 1)
File "/cluster/raid/home/cyril.vallez/miniconda3/envs/faiss/lib/python3.8/site-packages/faiss/init.py", line 322, in replacement_search
self.search_c(n, swig_ptr(x), k, swig_ptr(D), swig_ptr(I))
File "/cluster/raid/home/cyril.vallez/miniconda3/envs/faiss/lib/python3.8/site-packages/faiss/swigfaiss_avx2.py", line 9009, in search
return _swigfaiss_avx2.GpuIndex_search(self, n, x, k, distances, labels)
RuntimeError: Error in virtual void* faiss::gpu::StandardGpuResourcesImpl::allocMemory(const faiss::gpu::AllocRequest&) at /root/miniconda3/conda-bld/faiss-pkg_1641228905850/work/faiss/gpu/StandardGpuResources.cpp:452: Error: 'err == cudaSuccess' failed: StandardGpuResources: alloc fail type TemporaryMemoryOverflow dev 0 space Device stream 0x563614fca640 size 26843545600 bytes (cudaMalloc error out of memory [2])
This is an unfortunate memory allocation issue, probably due to the unusually high dimension of the vectors. There is a very simple workaround, which is to split the query vectors into smaller batches.
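A minimal sketch of that workaround, assuming the index and features_query from the script above (batched_search and the batch size of 4096 are illustrative, not a Faiss API):
import numpy as np

def batched_search(index, queries, k, batch_size=4096):
    # Search the queries in smaller chunks so the temporary GPU buffers stay small
    D_parts, I_parts = [], []
    for start in range(0, queries.shape[0], batch_size):
        D_batch, I_batch = index.search(queries[start:start + batch_size], k)
        D_parts.append(D_batch)
        I_parts.append(I_batch)
    return np.concatenate(D_parts), np.concatenate(I_parts)

index.nprobe = 200
D, I = batched_search(index, features_query, 1)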