facebookresearch / faiss

A library for efficient similarity search and clustering of dense vectors.
https://faiss.ai
MIT License
30.39k stars, 3.55k forks

Index factory doesn't support nbits for LSH #3629

Closed: irowberryFS closed this issue 1 month ago

irowberryFS commented 1 month ago

Summary

The index factory doesn't support setting nbits for the LSH index.

Platform

Faiss version: 1.8.0

Installed from: conda

Running on:

Interface:

Reproduction instructions

Attempt to set nbits for an LSH index through index_factory():

index = faiss.index_factory(128, 'LSH16rt')

This returns the following error:

faiss/index_factory.cpp:880: could not parse index string LSHrt_16

Expected behavior is something like this:

index = faiss.IndexLSH(128, nbits)

ramilbakhshyiev commented 1 month ago

Thanks for reporting this. Confirmed. IndexLSH takes the dimension and nbits as separate arguments, whereas index_factory offers no way to set nbits and uses d as nbits.

Additional steps to repro:

import faiss

# Built through the factory: nbits cannot be set and ends up equal to d
index = faiss.index_factory(128, "LSHrt")
print("d: %s, nbits: %s" % (index.d, index.nbits))
# outputs: d: 128, nbits: 128

# Built directly: nbits is set independently of d
index = faiss.IndexLSH(128, 16)
print("d: %s, nbits: %s" % (index.d, index.nbits))
# outputs: d: 128, nbits: 16

mdouze commented 1 month ago

I would not label this as a bug. Most LSH use cases do not need the dimension to be specified, since a pre-transform reduces the vectors to the proper dimension and the LSH object just converts them to binary. See e.g. https://github.com/facebookresearch/faiss/wiki/Vector-codec-benchmarks#sift1m
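
To illustrate the split described above (a pre-transform picks the output dimension, then the LSH step only binarizes), here is a minimal plain-Python sketch. It is not Faiss code and does not reproduce Faiss's internals: the helper names (`random_projection`, `apply_transform`, `binarize`, `hamming`), the Gaussian projection used as the pre-transform, and the bit-packing order are all assumptions made for illustration.

```python
import random


def random_projection(d_in, d_out, seed=0):
    """Hypothetical pre-transform: a d_out x d_in Gaussian matrix."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(d_in)] for _ in range(d_out)]


def apply_transform(matrix, x):
    """Project x (length d_in) down to len(matrix) dimensions."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in matrix]


def binarize(y):
    """Binarization step: one bit per component (1 if positive), packed into bytes."""
    code = bytearray((len(y) + 7) // 8)
    for i, value in enumerate(y):
        if value > 0:
            code[i // 8] |= 1 << (i % 8)
    return bytes(code)


def hamming(a, b):
    """Number of differing bits between two equal-length codes."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


d, nbits = 128, 16
proj = random_projection(d, nbits)  # the transform decides the code length

rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(d)]
x_opposite = [-v for v in x]  # points in the opposite direction

code_x = binarize(apply_transform(proj, x))
code_opposite = binarize(apply_transform(proj, x_opposite))

print("code size in bytes:", len(code_x))
print("Hamming distance to itself:", hamming(code_x, code_x))
print("Hamming distance to opposite vector:", hamming(code_x, code_opposite))
```

In this sketch the binarization step never needs to know `nbits`: it emits one bit per input component, so the code length is fixed entirely by the output dimension of the transform that runs before it.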

asadoughi commented 1 month ago

Merged with https://github.com/facebookresearch/faiss/pull/3679