facebookresearch / faiss

A library for efficient similarity search and clustering of dense vectors.
https://faiss.ai
MIT License

difference between ivf4096 and ivf16384 #3129

Closed by jamesrobertwilliams 2 months ago

jamesrobertwilliams commented 10 months ago

Hi

I would like to know the difference between IVF4096 and IVF16384. I have 30M vectors of dimension 384, and I would like to know which index would be most appropriate for them. Also, I am using nprobe = 2048 to retrieve the 100 nearest neighbours. Does this setup look reasonable?

To be clear, I do the following in code:

    import faiss

    # trainset and queryset are float32 NumPy arrays of shape (n, d), defined elsewhere.
    d = 384
    # Despite the "gpu_" prefix, this index lives on the CPU.
    gpu_index = faiss.index_factory(d, "IVF16384,Flat")
    gpu_index.train(trainset)  # learn the 16384 coarse centroids
    gpu_index.add(trainset)    # assign each vector to its inverted list
    print(gpu_index.ntotal)
    faiss.write_index(gpu_index, "index.bin")
    gpu_index_loaded = faiss.read_index("index.bin")  # gpu_index_loaded is identical to gpu_index

    # Search
    gpu_index_loaded.nprobe = 2048  # Runtime param. The number of cells that are visited for search.
    topk = 100
    dists, ids = gpu_index_loaded.search(x=queryset, k=topk)
    print(ids[:5])

mdouze commented 10 months ago

See https://github.com/facebookresearch/faiss/wiki/Faster-search
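
For a rough feel of what the two settings mean, here is a back-of-the-envelope sketch in plain Python (no Faiss calls). The helper `ivf_flat_cost` is a hypothetical name introduced for illustration, and the estimate assumes that all inverted lists are the same size, which k-means clustering does not guarantee. The number after "IVF" is the number of cells (nlist) the vectors are partitioned into; nprobe is how many of those cells are scanned per query. The nprobe values other than 2048 are illustrative, not recommendations.

```python
def ivf_flat_cost(n_vectors: int, nlist: int, nprobe: int) -> dict:
    """Estimate per-query work for an IVF-Flat index.

    Assumes every inverted list holds the same number of vectors,
    which is an idealisation of what k-means actually produces.
    """
    avg_list_size = n_vectors / nlist
    return {
        "avg_list_size": avg_list_size,
        "fraction_scanned": nprobe / nlist,
        # nlist centroid comparisons to pick the cells,
        # then an exhaustive scan of nprobe inverted lists.
        "distance_computations": nlist + nprobe * avg_list_size,
    }


N_VECTORS = 30_000_000  # database size from the question

# (nlist, nprobe) pairs; only (16384, 2048) comes from the question.
for nlist, nprobe in [(16384, 2048), (16384, 128), (4096, 512), (4096, 32)]:
    cost = ivf_flat_cost(N_VECTORS, nlist, nprobe)
    print(
        f"IVF{nlist}, nprobe={nprobe}: "
        f"~{cost['avg_list_size']:.0f} vectors per list, "
        f"{cost['fraction_scanned']:.1%} of the database scanned, "
        f"~{cost['distance_computations']:,.0f} distance computations per query"
    )
```

Under that equal-size assumption, nprobe = 2048 with IVF16384 scans one eighth of the 30M vectors per query, the same fraction as nprobe = 512 with IVF4096. Whether a smaller nprobe still gives acceptable results depends on the data, so it is worth measuring recall against exact search on a sample of queries before settling on a value.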