facebookresearch / faiss

A library for efficient similarity search and clustering of dense vectors.
https://faiss.ai
MIT License

HNSW Flat Poor performance while Performance Testing #3774

Open IcanDoItL opened 3 weeks ago

IcanDoItL commented 3 weeks ago

I build the index with IndexHNSWFlat.

hnsw_index = faiss.IndexHNSWFlat(128, 32, faiss.METRIC_INNER_PRODUCT)
hnsw_index.hnsw.efConstruction = 64
hnsw_index = faiss.read_index('faiss.index', faiss.IO_FLAG_MMAP)
hnsw_index.hnsw.efSearch = 256
hnsw_index.search(query_embedding, 10)

When I stress test with JMeter, performance is poor, with many context switches and system calls. [image]

System call statistics (strace -c -p pid): [image]

Summary

Platform

OS: Linux

Faiss version: faiss-cpu = 1.8.0

Installed from: anaconda

IcanDoItL commented 3 weeks ago

This is the multithreaded test code, based on https://github.com/facebookresearch/faiss/wiki/Indexing-1M-vectors. When the hnsw method runs and 'search' is printed, watch the output of vmstat 1 1000 to see the problem. When the ivf method runs, the problem does not appear.

import faiss
import concurrent.futures

try:
    from faiss.contrib.datasets_fb import DatasetSIFT1M
except ImportError:
    from faiss.contrib.datasets import DatasetSIFT1M

k = 10
print("load data")
ds = DatasetSIFT1M()
xq = ds.get_queries()
xb = ds.get_database()
xt = ds.get_train()
nq, d = xq.shape

def faiss_search(index, xq):
    index.search(xq, k)

def ivf():
    print("Testing IVF Flat (baseline)")
    quantizer = faiss.IndexFlatL2(d)
    index = faiss.IndexIVFFlat(quantizer, d, 16384)
    index.cp.min_points_per_centroid = 5  # quiet warning

    # to see progress
    index.verbose = True

    print("training")
    index.train(xt)

    print("add")
    index.add(xb)

    print("search")
    index.nprobe = 256
    with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
        for i in range(3):
            for q in xq:
                executor.submit(faiss_search, index, q.reshape(1, -1))

def hnsw():
    print("Testing HNSW Flat")
    index = faiss.IndexHNSWFlat(d, 32)
    index.hnsw.efConstruction = 40

    print("add")
    index.verbose = True
    index.add(xb)

    index.hnsw.search_bounded_queue = True
    index.hnsw.efSearch = 256
    print("search")

    with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
        for i in range(3):
            for q in xq:
                executor.submit(faiss_search, index, q.reshape(1, -1))

if __name__ == '__main__':
    ivf()
    hnsw()

alexanderguzhva commented 3 weeks ago

@IcanDoItL faiss.IO_FLAG_MMAP

IcanDoItL commented 3 weeks ago

@alexanderguzhva I tried it, but it doesn't help:

hnsw_index = faiss.read_index('faiss.index', faiss.IO_FLAG_MMAP)
hnsw_index.hnsw.efSearch = 256
hnsw_index.search(query_embedding, 10)

IcanDoItL commented 3 weeks ago

This looks like the same problem as https://github.com/erikbern/ann-benchmarks/issues/47

mdouze commented 2 weeks ago

Normally one would parallelize the search using Faiss' internal threading. It is unavoidable that Python threading incurs an overhead. It may be because, irrespective of the number of queries, OpenMP spawns new threads which interact badly with the Python threads. So the code here:

https://github.com/facebookresearch/faiss/blob/main/faiss/IndexHNSW.cpp#L266

should be parallelized only if the number of queries is > 1 (and presumably something larger than 1).

mdouze commented 2 weeks ago

For IVF search, this check is performed here: https://github.com/facebookresearch/faiss/blob/main/faiss/IndexIVF.cpp#L447

IcanDoItL commented 2 weeks ago

Thanks. I saw https://github.com/facebookresearch/faiss/wiki/Indexing-1M-vectors, so I tried using an HNSW index instead of an IVF index. But with the HNSW index, performance is poor under a JMeter stress test (simulating users accessing simultaneously).

I’m planning to use it in a production environment. Each search has only one query, but multiple users will be accessing it simultaneously. Could you offer some advice?