If faiss is imported before torch, the KNNs returned by a GPU index are very inaccurate. If the order of the imports is switched, the returned KNNs are much more accurate.
I load the same index that I always use for the KNN-LM paper (https://github.com/urvashik/knnlm) and convert it to a GPU index, keeping both the CPU and GPU indices.
Then I create 1000 random numpy queries and search for their nearest neighbors in both indices.
On the first search, the overlap between the CPU and GPU results is only ~270 on average (out of k=1024 nearest neighbors!), which is surprisingly low.
If I re-run gpu_index.search(queries, k) with the same queries and the same k, the average overlap between the GPU index and the CPU index drops to ~12 out of 1024.
That is, not only is the overlap between CPU and GPU low (270/1024), but searching again with the same queries gives far worse overlap!
import faiss  # imported before torch: this order triggers the problem
import torch
import numpy as np

num_queries = 1000
k = 1024
indexfile = 'my_knn.index'

# Load the CPU index from disk
cpu_index = faiss.read_index(indexfile, faiss.IO_FLAG_ONDISK_SAME_DIR)
cpu_index.nprobe = 32

# Clone it to GPU 0 with float16 storage, keeping the CPU index as well
co = faiss.GpuClonerOptions()
co.useFloat16 = True
gpu_index = faiss.index_cpu_to_gpu(faiss.StandardGpuResources(), 0, cpu_index, co)
gpu_index.nprobe = 32

# Random queries with the same dimensionality as the index (1024)
queries = np.random.random((num_queries, 1024)).astype(np.float32)
cpu_dists, cpu_knns = cpu_index.search(queries, k)
gpu_dists, gpu_knns = gpu_index.search(queries, k)

# Print the average overlap between CPU and GPU neighbors
print(np.mean([len(set(cpu_knns[i]).intersection(set(gpu_knns[i]))) for i in range(num_queries)]))

# Search again in the GPU index
gpu_dists, gpu_knns = gpu_index.search(queries, k)

# Print the average overlap again
print(np.mean([len(set(cpu_knns[i]).intersection(set(gpu_knns[i]))) for i in range(num_queries)]))
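The overlap computation printed twice above could be factored into a small helper so that both measurements use exactly the same code. This is only a minimal sketch: `mean_overlap` is a hypothetical name (not part of faiss), and it assumes both inputs are id arrays of shape (num_queries, k) as returned by `search`.

```python
import numpy as np

def mean_overlap(knns_a, knns_b):
    """Average number of neighbor ids shared per query by two (n, k) id arrays."""
    if knns_a.shape[0] != knns_b.shape[0]:
        raise ValueError("both arrays must have one row per query")
    per_query = [
        # intersect1d de-duplicates ids first, like the set() intersection above
        np.intersect1d(row_a, row_b).size
        for row_a, row_b in zip(knns_a, knns_b)
    ]
    return float(np.mean(per_query))
```

With this helper, each of the two print statements would become `print(mean_overlap(cpu_knns, gpu_knns))`.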
Alternatively, if I don’t search for all 1000 queries at once, but search query by query:
gpu_dists = []
gpu_knns = []
for i in range(num_queries):
    # Search one query at a time; the slice keeps the 2-D shape (1, dim)
    gpu_d, gpu_nn = gpu_index.search(queries[i:i+1], k)
    gpu_dists.append(gpu_d)
    gpu_knns.append(gpu_nn)
gpu_dists = np.concatenate(gpu_dists, axis=0)
gpu_knns = np.concatenate(gpu_knns, axis=0)
then it’s ok, and the average overlap is 1018/1024.
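The query-by-query loop above could be generalized to small batches, which makes it easy to check at which batch size the results start to degrade. The following is a sketch under assumptions: `search_in_batches` and its `batch_size` parameter are hypothetical, and the only thing it relies on is that `index` has a `search(queries, k)` method returning `(distances, ids)`, as faiss indices do.

```python
import numpy as np

def search_in_batches(index, queries, k, batch_size=1):
    """Call index.search on slices of `queries` and stitch the results together.

    Assumes `queries` is a non-empty 2-D array; batch_size=1 reproduces the
    query-by-query loop.
    """
    all_dists, all_ids = [], []
    for start in range(0, len(queries), batch_size):
        dists, ids = index.search(queries[start:start + batch_size], k)
        all_dists.append(dists)
        all_ids.append(ids)
    return np.concatenate(all_dists, axis=0), np.concatenate(all_ids, axis=0)
```

For example, `gpu_dists, gpu_knns = search_in_batches(gpu_index, queries, k, batch_size=1)` would give the same arrays as the loop above.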
Another solution is to simply import torch before importing faiss, which makes everything work OK even when searching for multiple queries in parallel.
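Since the result depends on the import order, a script could guard against the bad order explicitly. The sketch below is an assumption-laden illustration, not something faiss or torch provides: `imported_before` is a hypothetical helper that relies on `sys.modules` being a plain dict that preserves insertion order (Python 3.7+), so the position of a module name reflects when it was first imported.

```python
import sys

def imported_before(first, second, modules=None):
    """Return True if `first` was added to the module table before `second`."""
    # sys.modules preserves insertion order, i.e. first-import order
    names = list(sys.modules if modules is None else modules)
    if first not in names or second not in names:
        raise LookupError(f"{first!r} and {second!r} must both be imported already")
    return names.index(first) < names.index(second)
```

After `import torch` followed by `import faiss`, the check `assert imported_before("torch", "faiss")` would fail early if some other module had already pulled in faiss first.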
Platform
OS: Ubuntu
Faiss version: 1.7.1
Installed from: pip install faiss-gpu
Faiss compilation options:
Running on:
Interface:
What do you think? Thanks!