Official software repository of S. Bruch, F. M. Nardini, C. Rulli, and R. Venturini, "Efficient Inverted Indexes for Approximate Retrieval over Learned Sparse Representations". Long Paper @ ACM SIGIR 2024 (Best Paper Runner-up).
MIT License
Python Implementation and Encapsulation for Easy Similarity Search #1
First of all, congratulations on your paper! I found it insightful and truly enjoyed reading through it.
I was wondering if there is a Python version of your approach. With Python libraries for approximate nearest neighbor search, such as NMSLIB and Faiss, a search can be performed in just a few lines:
```python
import nmslib
import numpy

# create a random matrix to index simulating some vectors
data = numpy.random.randn(10000, 100).astype(numpy.float32)

# initialize a new index, using a HNSW index on Cosine Similarity
index = nmslib.init(method='hnsw', space='cosinesimil')
index.addDataPointBatch(data)
index.createIndex({'post': 2}, print_progress=True)

# query for the nearest neighbours of the first datapoint
ids, distances = index.knnQuery(data[0], k=10)

# get all nearest neighbours for all the datapoint
# using a pool of 4 threads to compute
neighbours = index.knnQueryBatch(data, k=10, num_threads=4)
```
I believe it would be great if your approach could be encapsulated similarly for ease of use. This would provide a seamless integration for those looking to use your method in Python, especially when dealing with large-scale similarity searches.
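For illustration, an interface in that style might look like the minimal sketch below. Everything in it is hypothetical: the class name `ToySparseIndex` and the methods `add_batch` and `knn_query` are placeholders, not part of this repository. Sparse vectors are assumed to be plain `{coordinate: weight}` dictionaries. The search is an exact inner-product scan over a simple inverted index, not the approximate algorithm from the paper.

```python
import heapq
from collections import defaultdict


class ToySparseIndex:
    """Hypothetical wrapper showing only the shape of the requested API.

    It performs exact maximum inner-product search and is NOT the
    approximate method described in the paper.
    """

    def __init__(self):
        # Inverted index: coordinate id -> list of (doc_id, weight) postings.
        self._postings = defaultdict(list)
        self._num_docs = 0

    def add_batch(self, vectors):
        """Index sparse vectors given as {coordinate: weight} dicts."""
        for vector in vectors:
            doc_id = self._num_docs
            for coord, weight in vector.items():
                self._postings[coord].append((doc_id, weight))
            self._num_docs += 1

    def knn_query(self, query, k=10):
        """Return (ids, scores) of the k documents with the largest inner product."""
        scores = defaultdict(float)
        for coord, q_weight in query.items():
            # Only documents sharing a coordinate with the query are scored.
            for doc_id, d_weight in self._postings.get(coord, ()):
                scores[doc_id] += q_weight * d_weight
        top = heapq.nlargest(k, scores.items(), key=lambda item: item[1])
        return [doc_id for doc_id, _ in top], [score for _, score in top]


# Three toy documents as sparse {coordinate: weight} vectors.
docs = [
    {0: 1.0, 3: 0.5},
    {1: 2.0},
    {0: 0.2, 1: 0.1, 3: 1.5},
]

index = ToySparseIndex()
index.add_batch(docs)

# Query for the 2 documents with the highest inner product.
ids, scores = index.knn_query({0: 1.0, 3: 1.0}, k=2)
```

The method names mirror the NMSLIB calls above only to show the desired ergonomics. An actual binding would presumably expose the library's own index-construction and query parameters.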