facebookresearch / faiss

A library for efficient similarity search and clustering of dense vectors.
https://faiss.ai
MIT License

If faiss is imported before torch, GPU results are much worse #2126

Open urialon opened 3 years ago

urialon commented 3 years ago

Summary

If faiss is imported before torch, the k-nearest neighbors (KNNs) returned by a GPU index are very inaccurate. If the order of the imports is switched, the returned KNNs are much more accurate.

Platform

OS: Ubuntu

Faiss version: 1.7.1

Installed from: pip install faiss-gpu

Faiss compilation options:

Running on:

Interface:

Reproduction instructions

I load the same index that I always use for the KNN-LM paper (https://github.com/urvashik/knnlm) and convert it to a GPU index, keeping both the CPU and GPU indices.

I then create 1000 random numpy queries and search for their nearest neighbors in both indices. On the first search, the average overlap between the CPU and GPU results is ~270 out of k=1024 nearest neighbors, which is surprisingly low.

If I re-run gpu_index.search(queries, k) with the same queries and the same k, the average overlap between the GPU index and the CPU index drops to ~12 out of 1024. That is, not only is the overlap between CPU and GPU low (270/1024), but repeating the search with the same queries gives a far worse overlap.

import faiss
import torch
import numpy as np

num_queries = 1000
k = 1024
indexfile = 'my_knn.index'

cpu_index = faiss.read_index(indexfile, faiss.IO_FLAG_ONDISK_SAME_DIR)
cpu_index.nprobe = 32
co = faiss.GpuClonerOptions()
co.useFloat16 = True
gpu_index = faiss.index_cpu_to_gpu(faiss.StandardGpuResources(), 0, cpu_index, co)
gpu_index.nprobe = 32

queries = np.random.random((num_queries, 1024)).astype(np.float32)
cpu_dists, cpu_knns = cpu_index.search(queries, k)
gpu_dists, gpu_knns = gpu_index.search(queries, k)

# print average overlap
print(np.mean([len(set(cpu_knns[i]).intersection(set(gpu_knns[i]))) for i in range(num_queries)]))

# Search again in the GPU index
gpu_dists, gpu_knns = gpu_index.search(queries, k)
# re-print average overlap
print(np.mean([len(set(cpu_knns[i]).intersection(set(gpu_knns[i]))) for i in range(num_queries)]))
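The overlap metric printed in the snippet above can be factored into a small helper so that both measurements use exactly the same logic. This is only a sketch: `mean_overlap` is a hypothetical name, not part of Faiss. Unlike the inline expression above, it also drops the `-1` labels that Faiss returns when fewer than k neighbors are found, so that padding is not counted as a shared neighbor.

```python
def mean_overlap(ids_a, ids_b):
    """Average number of neighbor ids shared per query row.

    ids_a, ids_b: sequences of rows of neighbor ids (lists or numpy arrays),
    one row per query, in the same query order.
    """
    if len(ids_a) != len(ids_b):
        raise ValueError("both results must cover the same number of queries")
    shared = []
    for row_a, row_b in zip(ids_a, ids_b):
        # Drop -1 padding (assumption: -1 marks a missing neighbor).
        set_a = set(row_a) - {-1}
        set_b = set(row_b) - {-1}
        shared.append(len(set_a & set_b))
    return sum(shared) / len(shared)
```

With the variables from the reproduction script, it would be called as `mean_overlap(cpu_knns, gpu_knns)` after each GPU search.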

Alternatively, if I don’t search for all 1000 queries at once, but instead search query by query:

gpu_dists = []
gpu_knns = []
for i in range(num_queries):
    gpu_d, gpu_nn = gpu_index.search(queries[i:i+1], k)
    gpu_dists.append(gpu_d)
    gpu_knns.append(gpu_nn)
gpu_knns = np.concatenate(gpu_knns, axis=0)

then the results are fine, and the average overlap is 1018/1024.
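The query-by-query loop can be generalized into a helper that searches in fixed-size chunks and stacks the results. This is a minimal sketch under assumptions: `search_in_chunks` and `chunk_size` are hypothetical names, not Faiss APIs, and the helper only relies on the index exposing `search(queries, k)`. Only `chunk_size=1` matches the case reported to work here; whether larger chunks avoid the problem has not been verified.

```python
import numpy as np

def search_in_chunks(index, queries, k, chunk_size=1):
    """Search `queries` in slices of `chunk_size` rows and stack the results.

    `index` is any object with a Faiss-style search(queries, k) method that
    returns a (distances, ids) pair of 2-D arrays.
    """
    all_dists, all_ids = [], []
    for start in range(0, len(queries), chunk_size):
        chunk = queries[start:start + chunk_size]
        dists, ids = index.search(chunk, k)
        all_dists.append(dists)
        all_ids.append(ids)
    # Rows come back in the original query order.
    return np.concatenate(all_dists, axis=0), np.concatenate(all_ids, axis=0)
```

With the variables from the reproduction script, `gpu_dists, gpu_knns = search_in_chunks(gpu_index, queries, k)` would reproduce the one-query-at-a-time loop.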

Another solution is to simply import torch before importing faiss, which makes everything work correctly even when searching for multiple queries in a single batch.
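Since the behavior depends on import order, a script could check that order at startup. The sketch below is a rough diagnostic, not a Faiss or PyTorch feature: `torch_imported_before_faiss` is a hypothetical helper that assumes `sys.modules` keeps its keys in the order the modules were first registered, which holds for a standard dict in Python 3.7+ unless entries were removed and re-added.

```python
import sys

def torch_imported_before_faiss(modules=None):
    """Return True if 'torch' was registered before 'faiss', False if after.

    Returns None when either module has not been imported yet.
    `modules` defaults to sys.modules; a plain dict can be passed for testing.
    """
    names = list(sys.modules if modules is None else modules)
    if "torch" not in names or "faiss" not in names:
        return None
    return names.index("torch") < names.index("faiss")
```

Calling it right after the imports in the reproduction script, for example `assert torch_imported_before_faiss() is not False`, would flag the ordering that produced the poor overlap.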

What do you think? Thanks!

frankxu2004 commented 2 years ago

I also encountered this bug. Is there any update on it?

namespace-Pt commented 1 year ago

+1