fjodborg commented 3 years ago

Summary

I get some weird errors and segmentation faults when running the demos/tests. I've written all the steps I took in the reproduction instructions. I'm using a GTX 1650, CUDA 11.2, Python 3.7 and OpenBLAS. The memory usage never goes above 1700MiB/3908MiB when running the tests.

Platform

OS: Ubuntu 18.04

Faiss version: v1.7.0 with 5efe1a97323a3e327b9058d57a54d4469ef6baad applied

Installed from: source with cmake 3.18.6

Faiss compilation options: cmake -B build . -DFAISS_ENABLE_GPU=ON -DFAISS_ENABLE_PYTHON=ON -DCMAKE_BUILD_TYPE=Release -DCUDAToolkit_ROOT=/usr/local/cuda-11.2 -DCMAKE_CUDA_ARCHITECTURES="75;72" -DPython_EXECUTABLE=/usr/bin/python3.7

Running on:

[ ] CPU
[X] GPU

Interface:

[x] C++
[x] Python

Reproduction instructions

These are the exact commands I used to install Faiss. I had to rename /usr/bin/python to something else; otherwise CMake would use python3.7 as the executable and python2.7 as PythonInterp.

cd $HOME/repos
git clone https://github.com/facebookresearch/faiss
cd faiss/faiss
git checkout tags/v1.7.0
wget https://github.com/facebookresearch/faiss/pull/1245/commits/5efe1a97323a3e327b9058d57a54d4469ef6baad.diff --output-document=gpu_fix.patch
patch -p1 < gpu_fix.patch --force
cd $HOME/repos/faiss
mkdir build
PYTHON_WRONG=$(which python)
[ -f $PYTHON_WRONG ] && sudo mv $PYTHON_WRONG "${PYTHON_WRONG}.back"
PYTHON_BIN=$(which python3.7)
cmake -B build . -DFAISS_ENABLE_GPU=ON \
                -DFAISS_ENABLE_PYTHON=ON \
                -DCMAKE_BUILD_TYPE=Release \
                -DCUDAToolkit_ROOT=/usr/local/cuda \
                -DCMAKE_CUDA_ARCHITECTURES="75;72" \
                -DPython_EXECUTABLE=$PYTHON_BIN \
                #-DBLA_VENDOR=Intel10_64_dyn 
make -C build -j faiss
make -C build -j swigfaiss
cd build/faiss/python
sudo python3.7 setup.py build
sudo python3.7 setup.py install
cd $HOME/repos/faiss
sudo make -C build install
sudo mv "${PYTHON_WRONG}.back" $PYTHON_WRONG

When running make -C build test, everything passes except one test:

        Start  84: TestGpuMemoryException.AddException
        84/127 Test  #84: TestGpuMemoryException.AddException ....................Child aborted***Exception:   0.58 sec

When building and running ./build/faiss/gpu/test/demo_ivfpq_indexing_gpu, I get the following:

[0.566 s] Generating 100000 vectors in 128D for training
[0.762 s] Training the index
Training IVF quantizer on 100000 vectors in 128D
Clustering 100000 points in 128D to 1788 clusters, redo 1 times, 10 iterations
  Preprocessing in 0.01 s
  Iteration 9 (0.35 s, search 0.32 s): objective=930821 imbalance=1.244 nsplit=0            
  Input training set too big (max size is 65536), sampling 65536 / 100000 vectors
computing residuals
training 4 x 256 product quantizer on 65536 vectors in 128D
Training PQ slice 0/4
Clustering 65536 points in 32D to 256 clusters, redo 1 times, 25 iterations
  Preprocessing in 0.00 s
  Iteration 24 (2.31 s, search 1.74 s): objective=113036 imbalance=1.004 nsplit=0       
Training PQ slice 1/4
Clustering 65536 points in 32D to 256 clusters, redo 1 times, 25 iterations
  Preprocessing in 0.00 s
  Iteration 24 (1.86 s, search 1.52 s): objective=113319 imbalance=1.004 nsplit=0       
Training PQ slice 2/4
Clustering 65536 points in 32D to 256 clusters, redo 1 times, 25 iterations
  Preprocessing in 0.00 s
  Iteration 24 (2.78 s, search 2.01 s): objective=113199 imbalance=1.004 nsplit=0       
Training PQ slice 3/4
Clustering 65536 points in 32D to 256 clusters, redo 1 times, 25 iterations
  Preprocessing in 0.00 s
  Iteration 24 (2.86 s, search 2.01 s): objective=113167 imbalance=1.004 nsplit=0       
[12.632 s] storing the pre-trained index to /tmp/index_trained.faissindex
[12.633 s] Building a dataset of 200000 vectors to index
[13.036 s] Adding the vectors to the index
Faiss assertion 'err__ == cudaSuccess' failed in void faiss::gpu::runTransposeAny(faiss::gpu::Tensor<OtherT, OtherDim, true, int, faiss::gpu::traits::DefaultPtrTraits>&, int, int, faiss::gpu::Tensor<OtherT, OtherDim, true, int, faiss::gpu::traits::DefaultPtrTraits>&, cudaStream_t) [with T = float; int Dim = 3; cudaStream_t = CUstream_st*] at /home/fjod/repos/faiss/faiss/gpu/utils/Transpose.cuh:207; details: CUDA error 9 invalid configuration argument
Aborted (core dumped)
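
For context on the message above: CUDA error 9 ("invalid configuration argument") is what the runtime returns when a kernel launch asks for a grid or block shape the device can never satisfy. The sketch below is only an illustration of the general launch limits for compute capability 3.0 and newer; it does not reproduce the computation in Transpose.cuh, it does not check shared-memory requests, and the helper name and example shapes are hypothetical rather than taken from Faiss.

```python
# General CUDA launch limits (compute capability 3.0 and newer).
MAX_GRID = (2**31 - 1, 65535, 65535)   # gridDim.x, gridDim.y, gridDim.z
MAX_BLOCK = (1024, 1024, 64)           # blockDim.x, blockDim.y, blockDim.z
MAX_THREADS_PER_BLOCK = 1024


def launch_config_problems(grid, block):
    """Return the reasons a (grid, block) launch shape would be rejected.

    Hypothetical helper for illustration; an empty list means the shape is
    within the general limits checked here.
    """
    problems = []
    for name, dims, limits in (("grid", grid, MAX_GRID),
                               ("block", block, MAX_BLOCK)):
        for axis, value, limit in zip("xyz", dims, limits):
            if value < 1:
                problems.append(f"{name}.{axis} = {value} is below 1")
            elif value > limit:
                problems.append(f"{name}.{axis} = {value} exceeds {limit}")
    threads = block[0] * block[1] * block[2]
    if threads > MAX_THREADS_PER_BLOCK:
        problems.append(
            f"{threads} threads per block exceeds {MAX_THREADS_PER_BLOCK}")
    return problems


# Hypothetical shapes, chosen only to exercise the checks.
print(launch_config_problems((782, 1, 1), (256, 1, 1)))
print(launch_config_problems((1, 200000, 1), (256, 1, 1)))
```

A shape derived from the input size (for example one grid row per vector) can pass for small inputs and fail once a dimension crosses 65535, which is one way such an error can appear only at the "Adding the vectors" step. Whether that is what happens here is not established by the log.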

When running python3.7 demos/demo_auto_tune.py:

load data
load GT
prepare criterion
============ key PCA64,IVF4096,Flat
[0.866 s] train & add
WARNING clustering 100000 points to 4096 centroids: please provide at least 159744 training points
[2.959 s] explore op points
  0/12: cno=0 nprobe=1 bounds [perf<=1.000 t>=0.000]  perf 0.347 t 0.018 (1 run) *
  1/12: cno=11 nprobe=2048 bounds [perf<=1.000 t>=0.018] Traceback (most recent call last):
  File "demos/demo_auto_tune.py", line 143, in <module>
    opi = params.explore(index, xq, crit)
  File "/usr/local/lib/python3.7/dist-packages/faiss-1.7.0-py3.7.egg/faiss/__init__.py", line 375, in replacement_explore
    crit, ops)
  File "/usr/local/lib/python3.7/dist-packages/faiss-1.7.0-py3.7.egg/faiss/swigfaiss.py", line 9184, in explore
    return _swigfaiss.ParameterSpace_explore(self, index, nq, xq, crit, ops)
RuntimeError: Error in virtual void* faiss::gpu::StandardGpuResourcesImpl::allocMemory(const faiss::gpu::AllocRequest&) at /home/fjod/repos/faiss/faiss/gpu/StandardGpuResources.cpp:443: Error: 'err == cudaSuccess' failed: Failed to cudaMalloc 5242880000 bytes on device 0 (error 2 out of memory
Outstanding allocations:
Alloc type TemporaryMemoryBuffer: 1 allocations, 536870912 bytes
Alloc type FlatData: 2 allocations, 1064960 bytes
Alloc type Other: 5 allocations, 178400000 bytes
Alloc type IVFLists: 8190 allocations, 432721504 bytes
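
The numbers in this second failure can be checked by hand. The sketch below only redoes the arithmetic from the log, assuming that 3908MiB in the summary is the card's total memory and that the training-size warning comes from a minimum of 39 points per centroid (believed to be the Faiss clustering default, but treated as an assumption here).

```python
MIB = 1024 * 1024

# Values copied from the error message and the summary above.
requested = 5242880000       # bytes passed to the failing cudaMalloc
device_total_mib = 3908      # MiB, assumed to be the card's total memory

outstanding = {
    "TemporaryMemoryBuffer": 536870912,
    "FlatData": 1064960,
    "Other": 178400000,
    "IVFLists": 432721504,
}
outstanding_total = sum(outstanding.values())

print(f"requested:   {requested / MIB:.1f} MiB")
print(f"outstanding: {outstanding_total / MIB:.1f} MiB")
print(f"device:      {device_total_mib} MiB")
print("request alone exceeds device memory:",
      requested / MIB > device_total_mib)

# Training-size warning: 4096 centroids times an assumed minimum of
# 39 points per centroid.
min_points_per_centroid = 39
print("recommended training points:", 4096 * min_points_per_centroid)
```

The single request is 5000 MiB, which is more than the whole card regardless of the roughly 1.1 GiB already allocated, so this failure looks like a genuine out-of-memory condition at nprobe=2048 rather than the same fault as the error 9 above. The product 4096 x 39 matches the 159744 in the warning, which is advisory and does not stop the run.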

fjodborg commented 3 years ago

I tried v1.6.5 with the patch, and ./build/faiss/gpu/test/demo_ivfpq_indexing_gpu worked. So I guess there is a problem with 1.7.0 and the master branch?

fjodborg commented 3 years ago

It was solved by just using a previous version.

petrsmid commented 3 years ago

We have the same problem. Using an older version is not a solution for us. Could you please reopen?

fjodborg commented 3 years ago

> We have the same problem. Using an older version is not a solution for us. Could you please reopen?

I haven't tried it myself, but have you tried an older version of CUDA, e.g. CUDA 10.2? I've experienced many errors due to 11.2.

petrsmid commented 3 years ago

> I haven't tried it myself, but have you tried an older version of CUDA, e.g. CUDA 10.2? I've experienced many errors due to 11.2.

We tried CUDA 10.2 but experienced the same problem.

facebookresearch / faiss

CUDA error 9 invalid configuration argument when running demo_ivfpq_indexing_gpu and more #1771
