facebookresearch / faiss

A library for efficient similarity search and clustering of dense vectors.
https://faiss.ai
MIT License

HNSWFlat index with METRIC_BrayCurtis sometimes returns different results on CPU and GPU #3894

Closed: qwevdb closed this issue 1 month ago

qwevdb commented 1 month ago

Summary

Searching an HNSWFlat index with METRIC_BrayCurtis sometimes returns different results on CPU and GPU.

Platform

OS: Ubuntu 24.04 LTS

Faiss version: faiss-gpu 1.8.0

Installed from: Anaconda

Faiss compilation options:

Running on:

Interface:

Reproduction instructions

1. Run the Python script below.
2. The CPU and GPU search results ***sometimes*** differ, even though both indexes use the same configuration, including index type and metric type.

```python
import faiss
import numpy as np
import json

DATA_INT_MAX = 1000
numpy_random = np.random.default_rng(0)

dim = 374
M = 5619
metric = faiss.METRIC_BrayCurtis

# Two indexes built with identical parameters and identical data.
index1 = faiss.IndexHNSWFlat(dim, M, metric)
index2 = faiss.IndexHNSWFlat(dim, M, metric)

insert_data = numpy_random.uniform(-DATA_INT_MAX, DATA_INT_MAX, (2357, dim))
index1.train(insert_data)
index1.add(insert_data)
index2.train(insert_data)
index2.add(insert_data)

query_vector = numpy_random.uniform(-DATA_INT_MAX, DATA_INT_MAX, (1, dim))
k = 9

# Search the first index on the CPU.
D1, I1 = index1.search(query_vector, k)
print("CPU id:")
print(I1)
print("CPU distance:")
print(D1)

# Move the second index to the GPU and run the same query.
res = faiss.StandardGpuResources()
gpu_index = faiss.index_cpu_to_gpu(res, 0, index2)
D2, I2 = gpu_index.search(query_vector, k)
print("GPU id:")
print(I2)
print("GPU distance:")
print(D2)
```

When the script is run twice in a row, the CPU and GPU results differ on the first run and match on the second.

First run:

```bash
CPU id:
[[2161 1007 990 514 529 366 1274 1123 1080]]
CPU distance:
[[0.80810606 0.8117319 0.8200436 0.82289296 0.8239249 0.82859623 0.8353435 0.8384627 0.84120756]]
GPU id:
[[2161 1007 990 529 366 1274 1123 1080 1505]]
GPU distance:
[[0.80810606 0.8117319 0.8200436 0.8239249 0.82859623 0.8353435 0.8384627 0.84120756 0.8450721 ]]
```

Second run:

```bash
CPU id:
[[2161 1007 990 514 529 366 1274 1123 1080]]
CPU distance:
[[0.80810606 0.8117319 0.8200436 0.82289296 0.8239249 0.82859623 0.8353435 0.8384627 0.84120756]]
GPU id:
[[2161 1007 990 514 529 366 1274 1123 1080]]
GPU distance:
[[0.80810606 0.8117319 0.8200436 0.82289296 0.8239249 0.82859623 0.8353435 0.8384627 0.84120756]]
```

asadoughi commented 1 month ago

HNSW is not implemented on GPU.
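
For anyone who wants to check which of two differing result lists is closer to the true nearest neighbors, a brute-force search with the same metric can serve as ground truth. The following is only a minimal sketch in plain Python: `bray_curtis`, `exact_top_k`, and `id_diff` are hypothetical helper names, not Faiss functions, and the formula is the usual Bray-Curtis definition, which is assumed here to match `METRIC_BrayCurtis`. The ID lists at the bottom are the ones reported for the first run in this issue.

```python
import heapq


def bray_curtis(x, y):
    """Usual Bray-Curtis dissimilarity: sum|x_i - y_i| / sum|x_i + y_i|.

    Assumed (not verified here) to match Faiss's METRIC_BrayCurtis.
    Raises ZeroDivisionError if the denominator is zero.
    """
    num = sum(abs(a - b) for a, b in zip(x, y))
    den = sum(abs(a + b) for a, b in zip(x, y))
    return num / den


def exact_top_k(data, query, k):
    """Brute-force k nearest neighbours; returns (distances, ids), ascending."""
    scored = ((bray_curtis(row, query), i) for i, row in enumerate(data))
    best = heapq.nsmallest(k, scored)
    return [d for d, _ in best], [i for _, i in best]


def id_diff(ids_a, ids_b):
    """IDs that appear in only one of the two result lists."""
    return sorted(set(ids_a) - set(ids_b)), sorted(set(ids_b) - set(ids_a))


# ID lists reported for the first run in this issue.
cpu_ids = [2161, 1007, 990, 514, 529, 366, 1274, 1123, 1080]
gpu_ids = [2161, 1007, 990, 529, 366, 1274, 1123, 1080, 1505]

only_cpu, only_gpu = id_diff(cpu_ids, gpu_ids)
print(only_cpu, only_gpu)  # [514] [1505]
```

Passing `insert_data.tolist()` and `query_vector[0].tolist()` from the reproduction script to `exact_top_k` would give the exact top 9 to compare both lists against. Because HNSW search is approximate, a mismatch against that list does not by itself indicate a bug.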