I'm trying to reproduce the FAISS GPU baseline results on the MSSPACEV dataset, following the instructions in the [FAISS T3 Baseline](https://github.com/harsha-simhadri/big-ann-benchmarks/tree/main/neurips21/track3_baseline_faiss). I built FAISS from source off the main branch, then ran the baseline script and hit the issue below:
/big-ann-benchmarks/neurips21/track3_baseline_faiss$ python3 gpu_baseline_faiss.py --dataset msspacev-1M --indexkey IVF65536,SQ8 --train_on_gpu --build --quantizer_on_gpu_add --add_splits 30 --search --searchparams nprobe={1,4,16,64,256} --parallel_mode 3 --quantizer_on_gpu_search
nb processors 64
model name : Intel(R) Xeon(R) Gold 6326 CPU @ 2.90GHz
Dataset MSSPACEV1B in dimension 100, with distance euclidean, search_type knn, size: Q 29316 B 1000000
build index, key= IVF65536,SQ8
Build-time number of threads: 64
Update add-time parameters
setting maxtrain to 3276800
getting first 3276800 dataset vectors for training
data/MSSPACEV1B/spacev1b_base.i8bin
train, size (1000000, 100)
add a training index on GPU
Training level-1 quantizer
Training level-1 quantizer on 1000000 vectors in 100D
WARNING clustering 1000000 points to 65536 centroids: please provide at least 2555904 training points
Clustering 1000000 points in 100D to 65536 clusters, redo 1 times, 10 iterations
Preprocessing in 0.28 s
Iteration 9 (6.14 s, search 5.43 s): objective=6.03294e+09 imbalance=1.691 nsplit=0
Training IVF residual
Input training set too big (max size is 100000), sampling 100000 / 1000000 vectors
Total train time 50.826 s
adding
============== SPLIT 0/30
data/MSSPACEV1B/spacev1b_base.i8bin
adding 0:33333 / 1000000 [0.223 s, RSS 3757956 kiB]
Traceback (most recent call last):
File "/mnt/ssd_volume/big-ann-benchmarks/neurips21/track3_baseline_faiss/gpu_baseline_faiss.py", line 591, in <module>
main()
File "/mnt/ssd_volume/big-ann-benchmarks/neurips21/track3_baseline_faiss/gpu_baseline_faiss.py", line 543, in main
index = build_index(args, ds)
File "/mnt/ssd_volume/big-ann-benchmarks/neurips21/track3_baseline_faiss/gpu_baseline_faiss.py", line 210, in build_index
index.add_core(
File "/mnt/ssd_volume/mumba1/lib/python3.10/site-packages/faiss-1.8.0-py3.10.egg/faiss/swigfaiss.py", line 6956, in add_core
return _swigfaiss.IndexIVFScalarQuantizer_add_core(self, n, x, xids, precomputed_idx, inverted_list_context)
TypeError: Wrong number or type of arguments for overloaded function 'IndexIVFScalarQuantizer_add_core'.
Possible C/C++ prototypes are:
faiss::IndexIVFScalarQuantizer::add_core(faiss::idx_t,float const *,faiss::idx_t const *,faiss::idx_t const *,void *)
faiss::IndexIVFScalarQuantizer::add_core(faiss::idx_t,float const *,faiss::idx_t const *,faiss::idx_t const *)
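For context on the error: both overloads take the vectors as `float const *` and the id arrays as `faiss::idx_t const *` (int64), and the SWIG layer only resolves the overload when the numpy buffers carry exactly those dtypes. Since the MSSPACEV base vectors come from `spacev1b_base.i8bin` (int8), my guess is a dtype mismatch; a quick way to check is to print the dtypes just before the failing call (the variable names below are placeholders, not necessarily the names used in `gpu_baseline_faiss.py`):

```python
# Hypothetical check just before the index.add_core(...) call in build_index().
# 'xblock' is the vector batch for this split and 'assign' the precomputed
# coarse assignments -- both names are placeholders.
print("xblock:", xblock.dtype, xblock.flags["C_CONTIGUOUS"])   # needs float32 / True
print("assign:", assign.dtype, assign.flags["C_CONTIGUOUS"])   # needs int64   / True
```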
This appeared to be an issue with the FAISS SWIG interface not being able to cast the numpy dtypes passed in to the types the library expects, so I tried manually casting the arrays to satisfy the binding.
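For reference, here is a minimal sketch of what that manual cast might look like, assuming the failure is a dtype mismatch (this is my guess at the shape of the fix, not the exact code from `gpu_baseline_faiss.py`; `xblock` and `assign` are placeholder names):

```python
import numpy as np
import faiss

# Sketch (hypothetical): cast the batch to the dtypes the add_core overload expects.
# 'xblock' is the int8 batch read from spacev1b_base.i8bin, 'assign' holds the
# coarse-quantizer assignments computed on the GPU.
xb = np.ascontiguousarray(xblock, dtype=np.float32)          # float const *
a64 = np.ascontiguousarray(assign.ravel(), dtype=np.int64)   # idx_t const *

index.add_core(
    xb.shape[0],              # n
    faiss.swig_ptr(xb),       # x
    None,                     # xids: NULL -> faiss assigns sequential ids
    faiss.swig_ptr(a64),      # precomputed_idx
)
```

The explicit `ascontiguousarray` also guards against handing `faiss.swig_ptr` a non-contiguous slice, which it rejects.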
With the casts in place the type errors are gone, but now I'm running into the runtime exception below.