Summary
When using a simple IndexIVFFlat on GPU, increasing nprobe results in memory errors. I do not understand this blow-up of memory: from my understanding, the index should just brute-force search more clusters as nprobe increases. Moreover, I am using an NVIDIA A100 GPU with 40 GB of memory, and the data itself is only about 8-9 GB (so memory does not seem to be the main constraint in my case, but perhaps I am wrong?).
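For scale, a quick back-of-the-envelope check (the ~25 GiB figure is the failed temporary allocation reported in the traceback below; the variable names are just for illustration):
# Raw database size: 500,000 vectors x 4096 dims x 4 bytes (float32)
db_bytes = 500_000 * 4096 * 4    # 8,192,000,000 bytes, i.e. roughly 8.2 GB
# Failed temporary allocation from the error message below
tmp_bytes = 26_843_545_600       # exactly 25 GiB, on top of the data already on the GPU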
Platform
OS: Ubuntu 20.04.1 LTS
Faiss version: faiss-gpu 1.7.2
Installed from: conda, pytorch channel
Running on:
[ ] CPU
[x] GPU
Interface:
[ ] C++
[x] Python
Reproduction instructions
Consider the following script:
import numpy as np
import faiss

# 500k database vectors and 40k query vectors, dimension 4096 (float32)
features_db = np.random.rand(500000, 4096).astype('float32')
features_query = np.random.rand(40000, 4096).astype('float32')

d = features_db.shape[1]
# Rule of thumb nlist ~ 10 * sqrt(N), i.e. about 7071 lists here
nlist = int(10 * np.sqrt(features_db.shape[0]))
factory_string = f'IVF{nlist},Flat'
index = faiss.index_factory(d, factory_string, faiss.METRIC_L2)
index = faiss.index_cpu_to_all_gpus(index)
index.train(features_db)
index.add(features_db)

index.nprobe = 1
D, I = index.search(features_query, 1)
Running this script works fine. However, when replacing
index.nprobe = 1
by a larger value, e.g.
index.nprobe = 200
it produces the following error:
Traceback (most recent call last):
File "repro_bug.py", line 51, in
D, I = index.search(features_query, 1)
File "/cluster/raid/home/cyril.vallez/miniconda3/envs/faiss/lib/python3.8/site-packages/faiss/init.py", line 322, in replacement_search
self.search_c(n, swig_ptr(x), k, swig_ptr(D), swig_ptr(I))
File "/cluster/raid/home/cyril.vallez/miniconda3/envs/faiss/lib/python3.8/site-packages/faiss/swigfaiss_avx2.py", line 9009, in search
return _swigfaiss_avx2.GpuIndex_search(self, n, x, k, distances, labels)
RuntimeError: Error in virtual void* faiss::gpu::StandardGpuResourcesImpl::allocMemory(const faiss::gpu::AllocRequest&) at /root/miniconda3/conda-bld/faiss-pkg_1641228905850/work/faiss/gpu/StandardGpuResources.cpp:452: Error: 'err == cudaSuccess' failed: StandardGpuResources: alloc fail type TemporaryMemoryOverflow dev 0 space Device stream 0x563614fca640 size 26843545600 bytes (cudaMalloc error out of memory [2])
This is an unfortunate memory allocation issue, probably due to the unusually high dimension of the vectors. There is a very simple workaround, which is to split the query vectors into smaller batches.
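A minimal sketch of that workaround, assuming the index and features_query from the script above (batched_search and the batch size of 4096 are illustrative, not a Faiss API):
import numpy as np

def batched_search(index, queries, k, batch_size=4096):
    # Search the queries in smaller chunks so the temporary GPU buffers stay small
    D_parts, I_parts = [], []
    for start in range(0, queries.shape[0], batch_size):
        D_batch, I_batch = index.search(queries[start:start + batch_size], k)
        D_parts.append(D_batch)
        I_parts.append(I_batch)
    return np.concatenate(D_parts), np.concatenate(I_parts)

index.nprobe = 200
D, I = batched_search(index, features_query, 1)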